Hugging Face

  • AI21 Launches Jamba2 Models for Enterprises


    AI21 releases Jamba2 3B and Jamba2 Mini, built for grounding and instruction following

    AI21 has launched Jamba2 3B and Jamba2 Mini, designed to offer enterprises cost-effective models for reliable instruction following and grounded outputs. These models excel in processing long documents without losing context, making them ideal for precise question answering over internal policies and technical manuals. With a hybrid SSM-Transformer architecture and KV cache innovations, they outperform competitors like Ministral3 and Qwen3 in various benchmarks, showcasing superior throughput at extended context lengths. Available through AI21's SaaS and Hugging Face, these models promise enhanced integration into production agent stacks. This matters because it provides businesses with more efficient AI tools for handling complex documentation and internal queries.
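    As a minimal sketch of pulling the open weights with transformers, assuming a hypothetical repo id (check AI21's Hugging Face organization page for the exact name):

    ```python
    # Minimal sketch: loading Jamba2 Mini from Hugging Face with transformers.
    # The repo id below is an assumption -- verify it on AI21's model page.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ai21labs/Jamba2-Mini"  # hypothetical repo id

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # keep memory manageable at long context
        device_map="auto",
    )

    prompt = "Summarize our travel reimbursement policy in three bullet points."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    ```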

    Read Full Article: AI21 Launches Jamba2 Models for Enterprises

  • Liquid AI’s LFM2.5: Compact Models for On-Device AI


    Liquid AI Releases LFM2.5: A Compact AI Model Family for Real On-Device Agents

    Liquid AI has unveiled LFM2.5, a compact AI model family designed for on-device and edge deployments, based on the LFM2 architecture. The family includes several variants: LFM2.5-1.2B-Base, LFM2.5-1.2B-Instruct, a Japanese-optimized model, and vision and audio language models. These models are released as open weights on Hugging Face and are accessible via the LEAP platform. LFM2.5-1.2B-Instruct, the primary text model, outperforms other 1B-class models on benchmarks such as GPQA and MMLU-Pro, while the Japanese variant excels at localized tasks. The vision and audio models are optimized for real-world applications, improving over previous iterations in visual reasoning and audio processing tasks. This matters because it represents a significant advancement in deploying powerful AI models on devices with limited computational resources, enhancing accessibility and efficiency in real-world applications.
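    A similarly hedged sketch for the instruct variant, assuming the repo id below and standard chat templating in transformers:

    ```python
    # Minimal sketch: running LFM2.5-1.2B-Instruct via transformers.
    # Repo id is an assumption -- check Liquid AI's Hugging Face organization.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "LiquidAI/LFM2.5-1.2B-Instruct"  # hypothetical repo id

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [{"role": "user", "content": "Name three constraints of edge deployment."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    ```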

    Read Full Article: Liquid AI’s LFM2.5: Compact Models for On-Device AI

  • Nvidia Aims to Be the Android of Robotics


    Nvidia wants to be the Android of generalist robotics

    Nvidia is positioning itself as the go-to platform for generalist robotics by unveiling a comprehensive ecosystem of robot foundation models, simulation tools, and edge hardware. This initiative aims to make robotics development more accessible and versatile, similar to how Android became the default operating system for smartphones. Key components of Nvidia's strategy include open foundation models like Cosmos Transfer 2.5 and Cosmos Reason 2, which enable robots to reason and act across diverse tasks, and the Isaac Lab-Arena, an open-source simulation framework for safe virtual testing. The company is also deepening its partnership with Hugging Face to integrate its technologies and broaden access to robot training. Nvidia's approach is already gaining traction, with its models leading downloads on Hugging Face and adoption by major robotics companies. This matters because Nvidia's efforts could democratize robotics development and drive innovation across industries.

    Read Full Article: Nvidia Aims to Be the Android of Robotics

  • Qwen-Image-2512: Strongest Open-Source Model Released


    Qwen released Qwen-Image-2512 on Hugging Face; it is currently the strongest open-source image model

    Qwen-Image-2512, the latest release on Hugging Face, is currently the strongest open-source image model available. It delivers significant improvements in rendering realistic human features, more natural textures, and stronger text-image composition. Tested over more than 10,000 blind rounds on AI Arena, it outperforms other open-source models and remains competitive with proprietary systems. This advancement matters because it raises the quality and accessibility of open-source image generation, potentially benefiting a wide range of applications from digital art to automated content creation.
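    For illustration, a hedged sketch of image generation with diffusers; the repo id and DiffusionPipeline compatibility are assumptions, so check the model card:

    ```python
    # Minimal sketch: generating an image with Qwen-Image-2512 via diffusers.
    # Repo id and pipeline support are assumptions -- see the Hugging Face model card.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "Qwen/Qwen-Image-2512",  # hypothetical repo id
        torch_dtype=torch.bfloat16,
    ).to("cuda")

    image = pipe(
        prompt="A street market at dusk, photorealistic, with legible shop signs",
        num_inference_steps=30,
    ).images[0]
    image.save("market.png")
    ```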

    Read Full Article: Qwen-Image-2512: Strongest Open-Source Model Released

  • Meta’s RPG Dataset on Hugging Face


    Meta released RPG, a research plan generation dataset, on Hugging Face

    Meta has introduced RPG, a comprehensive dataset aimed at advancing AI research capabilities, now available on Hugging Face. The dataset comprises 22,000 tasks drawn from machine learning research and sources such as arXiv and PubMed, each paired with evaluation rubrics and Llama-4 reference solutions. The initiative is designed to support the development of AI co-scientists, enhancing their ability to generate research plans and contribute to scientific discovery. By providing structured tasks and reference solutions, RPG aims to facilitate AI's role in scientific research, potentially accelerating innovation and breakthroughs.
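    A hedged sketch of loading the dataset with the datasets library; the repo id and record layout below are assumptions, so consult the dataset card:

    ```python
    # Minimal sketch: inspecting the RPG dataset with Hugging Face datasets.
    # Repo id and column names are assumptions -- check the dataset card.
    from datasets import load_dataset

    ds = load_dataset("facebook/RPG", split="train")  # hypothetical repo id

    example = ds[0]
    print(example.keys())      # expect a task prompt, rubric, and reference solution
    print(str(example)[:500])  # peek at the first record
    ```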

    Read Full Article: Meta’s RPG Dataset on Hugging Face

  • Run MiniMax-M2.1 Locally with Claude Code & vLLM


    Running MiniMax-M2.1 Locally with Claude Code and vLLM on Dual RTX Pro 6000

    Running the MiniMax-M2.1 model locally using Claude Code and vLLM involves setting up a robust hardware environment, including dual NVIDIA RTX Pro 6000 GPUs and an AMD Ryzen 9 7950X3D processor. The process requires installing vLLM nightly on Ubuntu 24.04 and downloading the AWQ-quantized MiniMax-M2.1 model from Hugging Face. Once the server is set up with Anthropic-compatible endpoints, Claude Code can be configured to interact with the local model using a settings.json file. This setup allows for efficient local execution of AI models, reducing reliance on external cloud services and enhancing data privacy.
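    As a hedged sketch of the client side, assuming the server listens on localhost:8000 and the served model name below, the same Anthropic-compatible endpoint Claude Code talks to can be exercised from Python:

    ```python
    # Minimal sketch: querying a local vLLM server through its Anthropic-compatible
    # endpoint, the interface Claude Code is pointed at via settings.json.
    # Port, API key, and served model name are assumptions for illustration.
    import anthropic

    client = anthropic.Anthropic(
        base_url="http://localhost:8000",  # local vLLM server, not Anthropic's cloud
        api_key="dummy-key",               # a local server typically ignores this
    )

    response = client.messages.create(
        model="MiniMax-M2.1-AWQ",          # hypothetical served model name
        max_tokens=512,
        messages=[{"role": "user", "content": "Explain AWQ quantization briefly."}],
    )
    print(response.content[0].text)
    ```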

    Read Full Article: Run MiniMax-M2.1 Locally with Claude Code & vLLM

  • Hosting Language Models on a Budget


    Hosting Language Models on a Budget

    Running your own large language model (LLM) can be surprisingly affordable and straightforward, with options like deploying TinyLlama on Hugging Face for free. Understanding the costs involved, such as compute, storage, and bandwidth, is crucial, as compute is typically the largest expense. For beginners or those with limited budgets, free hosting options like Hugging Face Spaces, Render, and Railway can be utilized effectively. Models like TinyLlama, DistilGPT-2, Phi-2, and Flan-T5-Small are suitable for various tasks and can be run on free tiers, providing a practical way to experiment and learn without significant financial investment. This matters because it democratizes access to advanced AI technology, enabling more people to experiment and innovate without prohibitive costs.
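    As one concrete, hedged example, a free-tier Hugging Face Space can serve TinyLlama behind a small Gradio app; the repo id is TinyLlama's published chat model, but verify it, and keep generation short since free CPU tiers are slow:

    ```python
    # Minimal sketch: a budget Hugging Face Space serving TinyLlama with Gradio.
    # Keep max_new_tokens small -- free CPU tiers are slow.
    import gradio as gr
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    )

    def chat(prompt: str) -> str:
        out = generator(prompt, max_new_tokens=128, do_sample=True)
        return out[0]["generated_text"]

    gr.Interface(fn=chat, inputs="text", outputs="text",
                 title="TinyLlama on a budget").launch()
    ```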

    Read Full Article: Hosting Language Models on a Budget

  • Pretraining Llama Model on Local GPU


    Pretraining a Llama Model on Your Local GPU

    Pretraining a Llama model on a local GPU involves setting up a comprehensive pipeline using PyTorch and Hugging Face libraries. The process starts with loading a tokenizer and a dataset, followed by defining the model architecture through a series of classes, such as LlamaConfig, RotaryPositionEncoding, and LlamaAttention, among others. The Llama model is built using transformer layers with rotary position embeddings and grouped-query attention mechanisms. The training setup includes defining hyperparameters like learning rate, batch size, and sequence length, along with creating data loaders, optimizers, and learning rate schedulers. The training loop involves computing attention masks, applying the model to input data, calculating loss using cross-entropy, and updating model weights with gradient clipping. Checkpoints are saved periodically to resume training if interrupted, and the final model is saved upon completion. This matters because it provides a detailed guide for developers to pretrain large language models efficiently on local hardware, making advanced AI capabilities more accessible.
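    A compressed, hedged sketch of that loop, substituting Hugging Face's stock Llama implementation for the article's hand-written classes; the tokenizer, corpus, and hyperparameters below are placeholders:

    ```python
    # Compressed sketch of the pretraining loop described above. Uses transformers'
    # built-in Llama (rotary embeddings and grouped-query attention included) rather
    # than the article's custom classes; tokenizer/dataset/hyperparameters are placeholders.
    import torch
    from torch.utils.data import DataLoader
    from datasets import load_dataset
    from transformers import (AutoTokenizer, LlamaConfig, LlamaForCausalLM,
                              get_cosine_schedule_with_warmup)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer
    tokenizer.pad_token = tokenizer.eos_token

    config = LlamaConfig(
        vocab_size=len(tokenizer), hidden_size=512, intermediate_size=1376,
        num_hidden_layers=8, num_attention_heads=8,
        num_key_value_heads=4,        # grouped-query attention
        max_position_embeddings=512,  # rotary position embeddings are built in
    )
    model = LlamaForCausalLM(config).to(device)

    ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")  # placeholder corpus
    ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=256,
                                    padding="max_length"),
                batched=True, remove_columns=ds.column_names)
    ds.set_format("torch")
    loader = DataLoader(ds, batch_size=8, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
    scheduler = get_cosine_schedule_with_warmup(optimizer, 100, len(loader))

    model.train()
    for step, batch in enumerate(loader):
        input_ids = batch["input_ids"].to(device)
        mask = batch["attention_mask"].to(device)
        labels = input_ids.masked_fill(mask == 0, -100)  # ignore padding in the loss
        loss = model(input_ids, attention_mask=mask, labels=labels).loss
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
        optimizer.step(); scheduler.step(); optimizer.zero_grad()
        if step % 500 == 0:
            torch.save(model.state_dict(), f"ckpt_{step}.pt")  # periodic checkpoints
    ```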

    Read Full Article: Pretraining Llama Model on Local GPU