AI & Technology Updates

  • GPT-5.2 Router Failure and AI Gaslighting


    "GPT-5.2 Router Failure: It confirmed a real event, then switched models and started gaslighting me." An intriguing incident occurred with GPT-5.2 during a query about the Anthony Joshua vs. Jake Paul fight on December 19, 2025. Initially the AI denied the event, but when challenged it switched to a Logic/Thinking model and confirmed Joshua's victory by knockout in the sixth round. The system then reverted to a faster model, lost the earlier confirmation, and denied the event again, condescendingly dismissing the evidence the user presented. This highlights potential issues with AI model routing and context retention, raising concerns about reliability and user experience in AI interactions.
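
    What the user describes is consistent with a router that keeps separate conversation state per backend model. Below is a minimal, deliberately buggy Python sketch of that failure mode; every name in it is invented for illustration, and it is not a claim about how GPT-5.2's router actually works.

    ```python
    # Hypothetical router with per-model histories: the bug described above.
    from dataclasses import dataclass, field

    @dataclass
    class Model:
        name: str
        history: list = field(default_factory=list)  # BUG: state lives per model

        def reply(self, user_msg: str) -> str:
            self.history.append(("user", user_msg))
            # Stub: a real model would condition its answer on self.history.
            answer = f"[{self.name}] has seen {len(self.history)} turns"
            self.history.append(("assistant", answer))
            return answer

    def route(query: str, fast: Model, thinking: Model) -> Model:
        # Toy heuristic: escalate to the slower model when the user pushes back.
        return thinking if "are you sure" in query.lower() else fast

    fast, thinking = Model("fast"), Model("thinking")
    for turn in ["Did the fight happen?", "Are you sure? Check again.", "So who won?"]:
        print(route(turn, fast, thinking).reply(turn))

    # The third turn routes back to `fast`, which never saw the thinking model's
    # confirmation: the "forgot what it just told me" behavior reported. A single
    # shared transcript passed to whichever model is selected avoids this.
    ```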


  • Exploring ML Programming Languages Beyond Python


    "PyTorch re-implementations of 50+ ML papers: GANs, VAEs, diffusion, meta-learning, 3D reconstruction, …" Python dominates the machine learning landscape thanks to its extensive libraries and ease of use, making it the default language for most practitioners. Other languages, including C++, Julia, R, Go, Swift, Kotlin, Java, Rust, Dart, and Vala, are still employed for specific performance needs or platform-specific applications. Each offers distinct advantages, such as C++ for performance-critical tasks, R for statistical analysis, and Swift for iOS development, underscoring the importance of choosing the right tool for the job. Why this matters: a working knowledge of several languages gives developers the flexibility to tailor machine learning solutions to specific performance and platform requirements.


  • Advancements in Local LLMs and AI Hardware


    "SOCAMM2 - new(ish), screwable (replaceable, non-soldered) LPDDR5X RAM standard intended for AI data centers." Recent developments in the local LLM landscape have been marked by the dominance of llama.cpp, favored for its performance and its flexibility in running local models. The rise of Mixture of Experts (MoE) models has made it practical to run large models on consumer hardware by balancing performance with resource efficiency, and new local LLMs are arriving with vision and multimodal capabilities that matter for more complex applications. While continuous retraining of LLMs remains difficult, Retrieval-Augmented Generation (RAG) systems approximate continuous learning by pulling in external knowledge bases at inference time, as in the sketch below. These developments, alongside significant investment in high-VRAM hardware, keep pushing the limits of what consumer-grade machines can achieve. Why this matters: these advancements make powerful AI tools more accessible and efficient for a wider range of applications, including those on consumer hardware.
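
    The RAG pattern mentioned above is simple enough to sketch end to end. The toy bag-of-words embedding and the sample knowledge base below are stand-ins invented for illustration; a real system would use a proper embedding model and a vector store.

    ```python
    # Minimal RAG sketch: retrieve relevant text from an external knowledge
    # base and prepend it to the prompt, instead of retraining the model.
    import math
    from collections import Counter

    def embed(text: str) -> Counter:
        # Toy bag-of-words "embedding"; stand-in for a real embedding model.
        return Counter(text.lower().split())

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[w] * b[w] for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    knowledge_base = [
        "SOCAMM2 is a replaceable LPDDR5X memory module standard.",
        "Mixture of Experts models activate only a subset of parameters per token.",
        "llama.cpp runs GGUF-quantized models on consumer hardware.",
    ]

    def retrieve(query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(knowledge_base, key=lambda d: cosine(q, embed(d)), reverse=True)
        return ranked[:k]

    def build_prompt(query: str) -> str:
        context = "\n".join(retrieve(query))
        return f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {query}"

    print(build_prompt("What is SOCAMM2 memory?"))
    ```

    The design point is that "learning" a new fact only requires appending it to the knowledge base, never touching the model weights, which is why RAG is used to approximate continuous learning.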


  • Exploring Multi-Agent Behavior in Simulations


    "If you are interested in studying model/agent psychology/behavior, lmk. I work with a small research team (4 of us atm) and we are working on some strange things :)" A small research team is building simulation engines to study behavior in multi-agent scenarios, focusing on adversarial concepts, unusual thought experiments, and semi-large-scale sociology simulations. The team invites anyone interested in model or agent psychology and behavior to get in touch. The work is inspired by thinkers like Amanda Askell of Anthropic, known for her intriguing perspectives on the nature of these models. Understanding agent behavior in complex simulations can yield insights into social dynamics and decision-making processes.
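
    As a concrete picture of what a round-based multi-agent simulation engine can look like, here is a minimal sketch. The agents, payoff matrix, and retaliation rule form a generic iterated cooperate/defect game chosen for illustration; none of it describes the team's actual work.

    ```python
    # Minimal round-based multi-agent simulation: random pairwise encounters
    # in an iterated cooperate/defect game, with a simple retaliation rule.
    import random

    class Agent:
        def __init__(self, name: str, defect_bias: float):
            self.name, self.defect_bias = name, defect_bias
            self.memory: list[tuple[str, str]] = []  # (my_move, their_move)

        def act(self) -> str:
            # Adversarial twist: retaliate after being defected against.
            if self.memory and self.memory[-1][1] == "defect":
                return "defect"
            return "defect" if random.random() < self.defect_bias else "cooperate"

    PAYOFF = {("cooperate", "cooperate"): (3, 3), ("cooperate", "defect"): (0, 5),
              ("defect", "cooperate"): (5, 0), ("defect", "defect"): (1, 1)}

    def run(agents: list[Agent], rounds: int = 100) -> dict[str, int]:
        scores = {a.name: 0 for a in agents}
        for _ in range(rounds):
            a, b = random.sample(agents, 2)  # random pairwise encounter
            ma, mb = a.act(), b.act()
            pa, pb = PAYOFF[(ma, mb)]
            scores[a.name] += pa
            scores[b.name] += pb
            a.memory.append((ma, mb))
            b.memory.append((mb, ma))
        return scores

    agents = [Agent(f"agent{i}", defect_bias=i / 10) for i in range(5)]
    print(run(agents))
    ```

    Even a toy loop like this surfaces emergent dynamics (defection cascades, the cost of high defect bias over many rounds), which is the kind of behavior such engines are built to study at larger scale.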


  • RPC-server llama.cpp Benchmarks


    "RPC-server llama.cpp benchmarks" The llama.cpp RPC server enables distributed inference of large language models (LLMs) by offloading computation to remote instances across multiple machines or GPUs. Benchmarks were run on a local gigabit network spanning three systems and five GPUs, measuring the server's performance across different model sizes and parameters. The systems mixed AMD and Intel CPUs with GPUs including a GTX 1080 Ti, an Nvidia P102-100, and a Radeon RX 7900 GRE, for a combined 53 GB of VRAM. Models tested included Nemotron-3-Nano-30B and DeepSeek-R1-Distill-Llama-70B, demonstrating the server's ability to manage complex computations across a distributed environment. This matters because it shows that scalable, efficient LLM deployment is possible in distributed computing environments, which is crucial for advancing AI applications.
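
    For readers who want to reproduce a setup like this, the sketch below drives the pieces from Python. The rpc-server and llama-cli flags follow llama.cpp's RPC example (the project must be built with RPC support enabled); the hosts, ports, and model path are placeholders, and the exact flags should be verified against your llama.cpp build.

    ```python
    # Sketch of driving a distributed llama.cpp run from Python. Flags follow
    # llama.cpp's RPC example; hosts, ports, and the model path are placeholders.
    import subprocess

    WORKERS = ["192.168.1.11:50052", "192.168.1.12:50052"]  # remote GPU machines

    # On each remote machine, a backend worker is started first, e.g.:
    #   rpc-server --host 0.0.0.0 --port 50052

    def run_inference(model_path: str, prompt: str) -> str:
        # The client splits layers across its local backend and the RPC workers.
        cmd = [
            "llama-cli",
            "-m", model_path,
            "-ngl", "99",                # offload all layers to GPU backends
            "--rpc", ",".join(WORKERS),  # comma-separated worker endpoints
            "-p", prompt,
        ]
        return subprocess.run(cmd, capture_output=True, text=True).stdout

    if __name__ == "__main__":
        print(run_inference("models/your-model.gguf", "Hello from the cluster"))
    ```

    On a gigabit network, link bandwidth rather than GPU throughput often becomes the bottleneck, which is worth keeping in mind when comparing numbers like these against single-machine runs.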