Deep Dives

  • HOPE Replica Achieves Negative Forgetting on SplitMNIST


    A HOPE replica, inspired by the paper "Nested Learning: The Illusion of Deep Learning Architecture," has achieved negative forgetting on SplitMNIST in the task-incremental learning (Task-IL) setting. Negative forgetting, also known as positive backward transfer, means the model not only retains previously learned tasks but actually improves on them while learning new ones. This result highlights the potential for deep learning models that better manage and reuse knowledge across multiple tasks, a step toward AI systems capable of genuine continual learning.
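
    Concretely, "negative forgetting" corresponds to a positive backward-transfer (BWT) score over the task accuracy matrix. A minimal sketch of that standard continual-learning metric, with a hypothetical accuracy matrix for illustration only:

    ```python
    import numpy as np

    def backward_transfer(acc: np.ndarray) -> float:
        """BWT = mean over j < T-1 of (acc[T-1, j] - acc[j, j]).

        acc[i, j] is accuracy on task j after finishing training on task i.
        Positive BWT (i.e. negative forgetting) means earlier tasks *improved*
        while later tasks were being learned.
        """
        T = acc.shape[0]
        return float(np.mean([acc[T - 1, j] - acc[j, j] for j in range(T - 1)]))

    # Hypothetical 3-task matrix, not the replica's actual numbers:
    acc = np.array([
        [0.990, 0.000, 0.000],
        [0.992, 0.980, 0.000],
        [0.995, 0.985, 0.970],
    ])
    print(backward_transfer(acc))  # > 0  ->  negative forgetting
    ```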

    Read Full Article: HOPE Replica Achieves Negative Forgetting on SplitMNIST

  • Dynamic Learning Rate Scheduling


    Training a machine learning model often requires adjusting the learning rate as training progresses. A larger learning rate is beneficial early on for rapid progress, but as the model approaches optimal performance, a smaller learning rate is needed for fine-grained adjustments. Without adapting the learning rate, the model may overshoot the optimum, oscillating instead of improving further. Implementing a learning rate schedule can significantly enhance model performance, potentially raising accuracy from 85 to 95 percent with the same model and data. This matters because it leads to more efficient training and better-performing models in machine learning applications.
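
    A minimal PyTorch sketch of the idea, using a cosine annealing schedule (the schedule choice and all hyperparameters here are illustrative, not from the article):

    ```python
    import torch
    from torch import nn
    from torch.optim.lr_scheduler import CosineAnnealingLR

    # Toy regression problem; the schedule is the point, not the model.
    x, y = torch.randn(256, 10), torch.randn(256, 1)
    model = nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)        # large LR: fast early progress
    sched = CosineAnnealingLR(opt, T_max=100, eta_min=1e-4)  # anneal toward a small LR

    for epoch in range(100):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
        sched.step()  # smaller steps late in training avoid overshooting the optimum
        if epoch % 25 == 0:
            print(epoch, sched.get_last_lr()[0], loss.item())
    ```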

    Read Full Article: Dynamic Learning Rate Scheduling

  • Cogitator: Open-Source AI Runtime in TypeScript


    Cogitator is an open-source, self-hosted runtime designed to orchestrate AI agents and LLM swarms, built in TypeScript for type safety and seamless web integration. It provides a universal LLM interface that supports multiple AI platforms, including Ollama, vLLM, OpenAI, Anthropic, and Google, through a single API. The system is equipped with a DAG-based workflow engine, multi-agent swarm strategies, and sandboxed execution using Docker/WASM for secure operation. With a focus on production readiness, it uses Redis and Postgres for memory management and offers full observability features such as OpenTelemetry and cost tracking. This matters because it aims to provide a more stable and efficient alternative to existing AI agent infrastructure with significantly fewer dependencies.
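
    Cogitator's own code is not shown here, but as an illustration of what a DAG-based workflow engine reduces to, here is a minimal sketch (in Python for consistency with the other examples; every step name is hypothetical and none of this is Cogitator's API):

    ```python
    from graphlib import TopologicalSorter  # stdlib topological ordering

    # Hypothetical agent steps sharing a mutable context.
    def fetch(ctx): ctx["docs"] = ["doc1", "doc2"]
    def summarize(ctx): ctx["summary"] = f"{len(ctx['docs'])} docs"
    def review(ctx): ctx["ok"] = True

    # Node -> set of dependencies that must run first.
    dag = {"fetch": set(), "summarize": {"fetch"}, "review": {"summarize"}}
    tasks = {"fetch": fetch, "summarize": summarize, "review": review}

    ctx = {}
    for name in TopologicalSorter(dag).static_order():
        tasks[name](ctx)  # a real engine would parallelize independent nodes
    print(ctx)
    ```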

    Read Full Article: Cogitator: Open-Source AI Runtime in TypeScript

  • EdgeVec v0.7.0: Browser-Based Vector Search


    EdgeVec v0.7.0 is a browser-based vector database designed to give local AI applications cloud-like vector search capabilities without network dependency. The release introduces binary quantization for a 32x memory reduction, SIMD acceleration for up to 8.75x faster processing, and IndexedDB persistence for data retention across sessions. These features enable efficient local document search, offline retrieval-augmented generation (RAG), and privacy-preserving AI assistants, with all data remaining on the user's device. This matters because it lets users perform advanced searches and AI tasks locally, maintaining privacy and reducing reliance on cloud services.
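
    The binary quantization idea is independent of EdgeVec's actual API (not shown here): keep only the sign of each dimension, so a 32-bit float32 dimension becomes 1 bit, and search by Hamming distance over the packed codes. A minimal NumPy sketch:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    vecs = rng.standard_normal((10_000, 384)).astype(np.float32)  # float32 corpus

    # Binary quantization: keep the sign bit, packed 8 dims per byte.
    # 32 bits/dim -> 1 bit/dim is the 32x reduction the release cites.
    codes = np.packbits(vecs > 0, axis=1)

    def hamming_search(query: np.ndarray, k: int = 5) -> np.ndarray:
        q = np.packbits(query > 0)
        # XOR then popcount gives the Hamming distance to every stored code.
        dists = np.unpackbits(codes ^ q, axis=1).sum(axis=1)
        return np.argsort(dists)[:k]

    print(hamming_search(vecs[42]))  # the vector's own code should rank first
    ```

    A production engine would do the popcount step with SIMD instructions rather than unpackbits; that is the kind of acceleration the release notes describe.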

    Read Full Article: EdgeVec v0.7.0: Browser-Based Vector Search

  • TOPAS-DSPL: Dual-Stream Transformer for Reasoning


    TOPAS-DSPL is a neuro-symbolic model that utilizes a dual-stream recursive transformer architecture to enhance small-scale reasoning tasks. By employing a "bicameral" latent space, it separates algorithmic planning from execution state, which reduces compositional drift compared to traditional monolithic models. With a parameter count of approximately 15 million, it achieves 24% accuracy on the ARC-AGI-2 evaluation set, a significant improvement over standard Tiny Recursive Models. The architecture addresses the "forgetting" problem in recursive loops by decoupling rule generation from state updates, and the open-sourcing of its training pipeline allows for independent verification and further development. This matters because it demonstrates significant advances in reasoning models, making them more accessible and effective for complex problem-solving.
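
    The paper's exact layers are not reproduced here, but the decoupling it describes can be sketched as two coupled recurrences: one stream emits a rule each step, the other applies it to the execution state. A toy PyTorch sketch, illustrative only and not the TOPAS-DSPL implementation:

    ```python
    import torch
    from torch import nn

    class DualStreamStep(nn.Module):
        """Toy 'bicameral' recursion: plan (logic) and state (canvas) are kept apart."""
        def __init__(self, d: int):
            super().__init__()
            self.logic = nn.GRUCell(d, d)          # planning stream: emits a rule per step
            self.apply_rule = nn.Linear(2 * d, d)  # execution stream: rule acts on canvas

        def forward(self, task_emb, logic_h, canvas):
            logic_h = self.logic(task_emb, logic_h)   # update the plan from the task only
            rule = torch.tanh(logic_h)                # current rule
            canvas = canvas + self.apply_rule(torch.cat([rule, canvas], dim=-1))
            return logic_h, canvas                    # rule generation never sees state deltas

    d = 64
    step = DualStreamStep(d)
    task = torch.randn(1, d)
    logic_h, canvas = torch.zeros(1, d), torch.zeros(1, d)
    for _ in range(6):  # recursive refinement loop
        logic_h, canvas = step(task, logic_h, canvas)
    ```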

    Read Full Article: TOPAS-DSPL: Dual-Stream Transformer for Reasoning

  • 15M Param Model Achieves 24% on ARC-AGI-2


    Bitterbot AI has introduced TOPAS-DSPL, a compact recursive model with approximately 15 million parameters that achieves 24% accuracy on the ARC-AGI-2 evaluation set, a significant improvement over the previous state of the art of 8% for models of similar size. The model employs a "bicameral" architecture, dividing work between a Logic Stream for algorithm planning and a Canvas Stream for execution, which addresses the compositional drift issues found in standard transformers. Additionally, Test-Time Training (TTT) is used to fine-tune the model on a task's specific examples before generating a solution. The entire pipeline, including data generation, training, and evaluation, has been open-sourced, allowing community verification and reproduction of results on consumer hardware such as an RTX 4090 GPU. This matters because it demonstrates significant gains in model efficiency and accuracy, making sophisticated AI more accessible and verifiable.
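
    Test-time training in its generic form: fine-tune a copy of the model on a task's demonstration pairs, then predict with the adapted copy. A minimal sketch of that generic recipe, not the Bitterbot pipeline; the model and hyperparameters are placeholders:

    ```python
    import copy
    import torch
    from torch import nn

    def test_time_train(model, demos, steps=20, lr=1e-4):
        """Adapt a *copy* of the model to one task's demonstrations before predicting."""
        m = copy.deepcopy(model)  # keep the base weights untouched across tasks
        opt = torch.optim.AdamW(m.parameters(), lr=lr)
        for _ in range(steps):
            for x, y in demos:    # the task's few input -> output examples
                opt.zero_grad()
                loss = nn.functional.mse_loss(m(x), y)
                loss.backward()
                opt.step()
        return m

    base = nn.Linear(16, 16)  # stand-in for the 15M-parameter model
    demos = [(torch.randn(1, 16), torch.randn(1, 16)) for _ in range(3)]
    specialized = test_time_train(base, demos)
    pred = specialized(torch.randn(1, 16))  # solve the held-out test input
    ```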

    Read Full Article: 15M Param Model Achieves 24% on ARC-AGI-2

  • The State Of LLMs 2025: Progress, Problems, Predictions


    Choosing the right machine learning framework is crucial for development efficiency and model performance. PyTorch and TensorFlow are the two most commonly recommended frameworks, with TensorFlow favored in industrial settings for its robust tooling and Keras integration, which simplifies development. Some users find TensorFlow setup challenging, however, particularly on Windows, where native GPU support is lacking. Other notable frameworks include JAX, Scikit-Learn, and XGBoost, and several subreddits offer venues for further discussion and personalized advice from experienced practitioners. This matters because selecting an appropriate machine learning framework can significantly influence the success and efficiency of AI projects.
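
    To make the Keras point concrete, a complete define-compile-fit workflow is only a few lines. A minimal, self-contained sketch on random placeholder data:

    ```python
    import numpy as np
    from tensorflow import keras

    # Toy binary classification data, purely for illustration.
    x = np.random.rand(512, 20).astype("float32")
    y = np.random.randint(0, 2, size=512)

    # Define, compile, fit: the integration the summary credits with simplicity.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(x, y, epochs=3, batch_size=32, verbose=0)
    ```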

    Read Full Article: The State Of LLMs 2025: Progress, Problems, Predictions

  • Alibaba’s MAI-UI: Leading GUI Agent Innovation


    Alibaba Tongyi Lab's MAI-UI is a family of foundation GUI agents that excels at mobile GUI navigation and grounding, surpassing models such as Gemini 2.5 Pro, Seed1.8, and UI-Tars-2 on the AndroidWorld benchmark. By integrating MCP tool use, agent-user interaction, and device-cloud collaboration, MAI-UI addresses gaps in earlier GUI agents while maintaining privacy when leveraging cloud models. Built on Qwen3 VL, the agents process natural language instructions and UI screenshots to perform actions in Android environments, achieving high accuracy on benchmarks such as ScreenSpot Pro and MMBench GUI L2. The system's navigation capabilities are further strengthened by a self-evolving data pipeline and an online reinforcement learning framework, which yield significant gains in AndroidWorld success rates. This matters because it represents a significant advance toward intelligent, interactive mobile agents that can handle user needs in complex environments.
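
    The summary does not expose MAI-UI's interface, but screenshot-in, action-out GUI agents share a common loop: observe pixels, ground the instruction to an action, execute, repeat. A minimal sketch with entirely hypothetical names; in the real system the policy would be the Qwen3 VL-based model:

    ```python
    from dataclasses import dataclass

    @dataclass
    class Action:
        kind: str             # e.g. "tap", "type", "done" (hypothetical action set)
        target: tuple | None  # screen coordinates for grounded actions
        text: str = ""

    def policy(instruction: str, screenshot: bytes) -> Action:
        """Stand-in for the vision-language model call; terminates immediately."""
        return Action("done", None)

    class FakeEnv:
        """Stub Android environment for the sketch."""
        def screenshot(self) -> bytes: return b""
        def execute(self, act: Action) -> None: pass

    def run_episode(instruction: str, env, max_steps: int = 15) -> bool:
        for _ in range(max_steps):
            shot = env.screenshot()          # observe the UI as pixels
            act = policy(instruction, shot)  # ground language to a concrete action
            if act.kind == "done":
                return True
            env.execute(act)                 # tap / type on the device
        return False

    print(run_episode("open settings", FakeEnv()))
    ```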

    Read Full Article: Alibaba’s MAI-UI: Leading GUI Agent Innovation

  • New SSM Architecture Exceeds Transformer Baseline


    Recent advances in sequence modeling have produced a new State Space Model (SSM) architecture that surpasses a Transformer baseline by addressing the O(L^2) attention cost that limits Transformers on long sequences of length L. By combining delta-rule updates with the representational power of gated convolutions, the architecture runs in O(L) time, making it a strong baseline for sequence modeling tasks. According to the author's reproducible benchmarks, it matches or exceeds Transformer quality and speed even at relatively short sequence lengths, aided by mildly optimized Triton kernels. This is significant because it offers a more efficient, scalable approach to processing long sequences in natural language processing and other domains.
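
    The delta-rule update referred to here can be written as a simple O(L) recurrent scan: S_t = S_{t-1} + beta_t * (v_t - S_{t-1} k_t) k_t^T. A minimal NumPy sketch of that generic update, not the posted architecture itself, which adds gated convolutions and Triton kernels:

    ```python
    import numpy as np

    def delta_rule_scan(q, k, v, beta):
        """Linear-time memory scan over a sequence. q, k, v: (L, d); beta: (L,)."""
        L, d = q.shape
        S = np.zeros((d, d))                         # associative memory matrix
        out = np.empty_like(v)
        for t in range(L):
            err = v[t] - S @ k[t]                    # what memory gets wrong for this key
            S = S + beta[t] * np.outer(err, k[t])    # correct only along direction k_t
            out[t] = S @ q[t]                        # read with the query
        return out

    L, d = 128, 32
    rng = np.random.default_rng(0)
    o = delta_rule_scan(rng.standard_normal((L, d)), rng.standard_normal((L, d)),
                        rng.standard_normal((L, d)), np.full(L, 0.5))
    print(o.shape)  # (128, 32), computed in time linear in L
    ```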

    Read Full Article: New SSM Architecture Exceeds Transformer Baseline

  • Dropout: Regularization Through Randomness


    Neural networks often suffer from overfitting, where they memorize training data instead of learning generalizable patterns, especially as they become deeper and more complex. Traditional regularization methods like L2 regularization and early stopping can fall short in addressing this issue. In 2012, Geoffrey Hinton and his team introduced dropout, a technique in which neurons are randomly deactivated during training, preventing any single pathway from dominating the learning process. This approach not only limits overfitting but also encourages the development of distributed and resilient representations, making dropout a pivotal method in enhancing the robustness and adaptability of deep learning models. Why this matters: dropout is crucial for improving the generalization and performance of deep neural networks, which are foundational to many modern AI applications.
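
    Dropout's forward pass is only a few lines. A minimal NumPy sketch of the standard "inverted dropout" formulation:

    ```python
    import numpy as np

    def dropout(x: np.ndarray, p: float = 0.5, training: bool = True) -> np.ndarray:
        """Inverted dropout: zero each unit with probability p during training,
        scaling survivors by 1/(1-p) so the expected activation is unchanged
        and inference needs no rescaling."""
        if not training or p == 0.0:
            return x
        mask = (np.random.rand(*x.shape) >= p) / (1.0 - p)
        return x * mask

    h = np.ones((2, 8))
    print(dropout(h, p=0.5))           # roughly half the units zeroed, rest doubled
    print(dropout(h, training=False))  # identity at inference time
    ```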

    Read Full Article: Dropout: Regularization Through Randomness