AI & Technology Updates

  • Benchmarking SLMs on Modest Hardware


    Small language models (SLMs) were benchmarked on a modest hardware setup: an Intel N97 CPU, 32GB of DDR4 RAM, and a 512GB NVMe drive, running Debian with llama.cpp for CPU inference. A test suite of five questions was used, with ChatGPT providing results and comments. The usability score was calculated by raising the test score to the fifth power, multiplying by the average tokens per second, and applying a 10% penalty if the model used reasoning. The penalty rests on the premise that a non-reasoning model that performs as well as a reasoning one is the more efficient of the two. This matters because it shows how accuracy and speed trade off when evaluating language models on limited hardware.
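
    The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the summary does not say how the test score is scaled, so the sketch assumes a fraction between 0 and 1 (for example, 4 of 5 questions correct gives 0.8), and the function and parameter names are hypothetical rather than taken from the original benchmark.

```python
def usability_score(test_score: float,
                    avg_tokens_per_sec: float,
                    uses_reasoning: bool) -> float:
    """Combine accuracy and speed into a single usability number.

    Assumption: test_score is a fraction in [0, 1]; the source does not
    state the scale, so this is a guess made for illustration.
    """
    if not 0.0 <= test_score <= 1.0:
        raise ValueError("test_score is assumed to be a fraction in [0, 1]")
    if avg_tokens_per_sec < 0:
        raise ValueError("avg_tokens_per_sec must be non-negative")

    # Raising to the fifth power punishes wrong answers heavily.
    score = (test_score ** 5) * avg_tokens_per_sec

    # 10% penalty for reasoning models, as described in the summary.
    if uses_reasoning:
        score *= 0.9
    return score


if __name__ == "__main__":
    # Hypothetical inputs, not results from the original benchmark.
    print(usability_score(0.8, 10.0, uses_reasoning=False))
    print(usability_score(0.8, 10.0, uses_reasoning=True))
```

    Under this assumed scale, the fifth power makes accuracy dominate: a model that answers 4 of 5 questions correctly keeps only about a third of its tokens-per-second figure (0.8^5 ≈ 0.33), so a slow but accurate model can still outscore a fast but sloppy one.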


  • DeepSeek V3.2: Dense Attention Model


    A GGUF of DeepSeek V3.2 with dense attention (lightning attention disabled) is now available and runs on regular llama.cpp builds without requiring extra support. It is offered at the Q8_0 and Q4_K_M quantization levels and is run with a specific jinja template. On lineage-bench, the Q4_K_M quant made only two errors at the most challenging graph size of 128, outperforming the original version with sparse attention. Disabling sparse attention does not appear to hurt the model's intelligence, which makes this build a robust alternative for users. This matters because it makes the model usable on standard tooling without sacrificing performance.


  • AI Tool for Image-Based Location Reasoning


    An experimental AI tool is being developed to analyze images and suggest real-world locations by detecting architectural and design cues. The tool aims to make AI systems more interpretable by explaining the reasoning behind each location suggestion. Initial tests on a public image with a known location showed promising but imperfect results, leaving clear room for improvement. This exploration is significant because it could lead to more useful and transparent AI systems in fields like geography, urban planning, and tourism.


  • Artificial Analysis Updates Global Model Indices


    Artificial Analysis has recently updated its global model indices, potentially to Version 4.0, though this hasn't been officially confirmed. Some users have observed changes in the rankings, such as Kimi K2 being ranked lower than usual, which suggests a possible adjustment in the metrics used. The update appears to favor OpenAI over Google, although not all models have been transitioned to the new benchmark yet. Unannounced updates like this could significantly affect how AI models are evaluated and compared, influencing industry standards and competition.


  • Microsoft Office Not Rebranded to 365 Copilot


    Microsoft has not rebranded Microsoft Office to Microsoft 365 Copilot, despite recent online confusion. The misunderstanding stems from Microsoft's Office.com domain, which promotes the Microsoft 365 Copilot app as a hub for accessing both Copilot and the Office apps. That hub app, previously known as Office, was rebranded to Microsoft 365 in 2022 and then to Microsoft 365 Copilot in January 2025. The core Office suite remains part of the Microsoft 365 subscription, and the standalone Office 2024 version is still available, which illustrates how complex Microsoft's branding has become. This matters because it clarifies the current branding and product offerings for users navigating Microsoft's ecosystem.