TOPAS-DSPL: Dual-Stream Transformer for Reasoning

TOPAS-DSPL: A 15M param Dual-Stream Recursive Transformer achieving 24% on ARC-2

TOPAS-DSPL is a neuro-symbolic model built on a dual-stream recursive transformer architecture for small-scale reasoning. Its “Bicameral” latent space separates algorithmic planning from execution state, reducing the “Compositional Drift” that afflicts monolithic recursive models. At roughly 15 million parameters, it reaches 24% accuracy on the ARC-AGI-2 evaluation set, a threefold improvement over standard Tiny Recursive Models. The architecture counters the “forgetting” problem in recursive loops by decoupling rule generation from state updates, and the open-sourced training pipeline allows independent verification and further development. This matters because it shows that small, accessible models can still make real progress on complex reasoning tasks.

The core idea is the dual-stream design itself. Rather than asking one recursive transformer to encode both what to do and where the solution currently stands, TOPAS-DSPL maintains two “Bicameral” latent spaces: a Logic Stream for algorithmic planning and a Canvas Stream for execution state. The planning side conditions the execution side through Dynamic AdaLN (adaptive layer normalization), and this enforced separation reduces the “Compositional Drift” that monolithic recursive models accumulate when plan and state share a single representation.
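The release’s exact layers are not reproduced here, but a minimal PyTorch sketch of Dynamic AdaLN conditioning, with module and dimension names of our own choosing rather than from the TOPAS-DSPL code, could look like this:

```python
import torch
import torch.nn as nn

class DynamicAdaLN(nn.Module):
    """Adaptive LayerNorm whose scale and shift are generated from the
    Logic Stream's plan vector, so the plan steers the Canvas Stream
    without being written into it (illustrative sketch)."""

    def __init__(self, canvas_dim: int, logic_dim: int):
        super().__init__()
        # No learned affine on the norm itself; the affine transform
        # is produced dynamically from the logic representation.
        self.norm = nn.LayerNorm(canvas_dim, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(logic_dim, 2 * canvas_dim)

    def forward(self, canvas: torch.Tensor, logic: torch.Tensor) -> torch.Tensor:
        # canvas: (batch, tokens, canvas_dim); logic: (batch, logic_dim)
        scale, shift = self.to_scale_shift(logic).chunk(2, dim=-1)
        return self.norm(canvas) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
```

Because the plan enters only through the normalization’s scale and shift, the canvas reads the plan but never writes to it, which is exactly the separation the architecture relies on.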

The significance of 24% exact-match accuracy on the ARC-AGI-2 evaluation set cannot be overstated at this scale: standard Tiny Recursive Models score around 8%, so the dual-stream approach triples performance. Exact match is unforgiving, since a prediction counts only if the entire output grid is reproduced cell for cell, which makes the gain a meaningful signal of improved compositional reasoning rather than partial-credit noise. Progress of this kind matters well beyond the benchmark, for fields such as robotics, natural language processing, and cognitive computing that depend on reliable multi-step reasoning.
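For concreteness, exact match on ARC-style tasks amounts to the scorer below; this is our own minimal sketch assuming grids arrive as NumPy arrays, not code from the released evaluation harness:

```python
import numpy as np

def exact_match_accuracy(predictions, targets) -> float:
    """ARC-style exact match: a task is solved only if the predicted
    output grid equals the target in shape and in every cell; there
    is no partial credit (illustrative sketch)."""
    hits = sum(
        pred.shape == tgt.shape and np.array_equal(pred, tgt)
        for pred, tgt in zip(predictions, targets)
    )
    return hits / len(targets)
```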

TOPAS-DSPL addresses the “forgetting” problem common to recursive loops, in which successive refinement passes gradually overwrite the very plan they are meant to execute. It does so by decoupling rule generation from state updates: the Logic Stream acts as a controller, modulating the Canvas Stream at each timestep through the Dynamic AdaLN parameters rather than sharing its representation. Test-Time Training (TTT) adapts the model to each individual task instance, and MuonClip provides optimization stability during training. These methodological choices, rather than raw scale, account for the model’s improved performance and underline how architectural design can overcome the limits of existing recursive models.
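A schematic of that control loop, with both stream modules left as hypothetical callables (none of these names come from the TOPAS-DSPL code), might read:

```python
def recursive_refine(logic_stream, canvas_stream, task_embedding, canvas, steps=8):
    """Decoupled recursion: rule generation (plan update) and execution
    (canvas update) are separate passes, so refining the canvas cannot
    overwrite the plan -- the failure mode behind "forgetting" in
    monolithic recursive loops (illustrative sketch)."""
    plan = logic_stream.init_plan(task_embedding)
    for _ in range(steps):
        # 1. Rule generation: revise the plan from the task and the
        #    current canvas, inside the Logic Stream's own latent space.
        plan = logic_stream(plan, task_embedding, canvas)
        # 2. Execution: update the canvas under AdaLN modulation from
        #    the plan; the plan tensor itself is left untouched.
        canvas = canvas_stream(canvas, condition=plan)
    return canvas
```

Under this reading, Test-Time Training would wrap the loop, taking a few gradient steps on a task’s demonstration pairs before the final canvas is decoded.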

Equally important is the release itself. Open-sourcing the full training pipeline, data augmentation scripts, and evaluation harness means the 24% figure can be independently verified rather than taken on faith, and it invites collaboration and follow-on work from the community. Bitterbot AI’s evident excitement about the release suggests more advanced models are already in development. Openness of this kind accelerates research and lowers the barrier for a wider range of researchers and developers to contribute to, and benefit from, these advances.

Read the original article here

Comments

2 responses to “TOPAS-DSPL: Dual-Stream Transformer for Reasoning”

  1. FilteredForSignal

    The introduction of a dual-stream architecture in the TOPAS-DSPL model for separating planning from execution is fascinating, particularly as a means to address “Compositional Drift” and the “forgetting” problem. Given the model’s significant improvement on the ARC-AGI-2 Evaluation Set, what are the key challenges you foresee in scaling this dual-stream approach to larger datasets or more complex reasoning tasks?

    1. NoHypeTech

      Scaling the dual-stream approach in TOPAS-DSPL to larger datasets or more complex tasks could present challenges such as increased computational demands and the need for more sophisticated mechanisms to manage the expanded latent space. Additionally, ensuring effective generalization without losing the model’s ability to address “Compositional Drift” and the “forgetting” problem could be complex. For more detailed insights, please refer to the original article linked in the post.