How-Tos

  • Guide to Deploying ML Models on Edge Devices


    Finally released my guide on deploying ML to Edge Devices: "Ultimate ONNX for Deep Learning Optimization"

    "Ultimate ONNX for Deep Learning Optimization" is a comprehensive guide aimed at ML engineers and embedded developers, focusing on deploying machine learning models to resource-constrained edge devices. The book addresses the challenges of moving models from research to production, offering a detailed workflow from model export to deployment. It covers ONNX fundamentals, optimization techniques such as quantization and pruning, and practical tools like ONNX Runtime. Real-world case studies demonstrate deploying models such as YOLOv12 and Whisper on devices like the Raspberry Pi. The guide is intended for anyone looking to optimize deep learning models for speed and efficiency without compromising accuracy; a rough sketch of the export-and-quantize workflow follows the link below. This matters because effectively deploying machine learning models on edge devices can significantly enhance the performance and applicability of AI in real-world scenarios.

    Read Full Article: Guide to Deploying ML Models on Edge Devices
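
    As a rough illustration of the export-quantize-run workflow the guide covers (not code from the book; the model choice, file names, and input shape are placeholders), a minimal Python sketch might look like this:

      # Minimal sketch: export a PyTorch model to ONNX, quantize it, run it with ONNX Runtime.
      # Model choice, file names, and input shape are assumptions for illustration.
      import numpy as np
      import torch
      import torchvision
      import onnxruntime as ort
      from onnxruntime.quantization import quantize_dynamic, QuantType

      # 1. Export a small vision model to ONNX.
      model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
      dummy = torch.randn(1, 3, 224, 224)
      torch.onnx.export(model, dummy, "model.onnx",
                        input_names=["input"], output_names=["output"], opset_version=17)

      # 2. Dynamic INT8 quantization shrinks the weights for resource-constrained targets.
      quantize_dynamic("model.onnx", "model.int8.onnx", weight_type=QuantType.QInt8)

      # 3. Run the quantized model with ONNX Runtime on CPU (as it would run on a Raspberry Pi).
      session = ort.InferenceSession("model.int8.onnx", providers=["CPUExecutionProvider"])
      out = session.run(None, {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)})
      print(out[0].shape)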

  • GraphQLite: Embedded Graph Database with SQLite


    GraphQLite - Embedded graph database for building GraphRAG with SQLite

    GraphQLite is an SQLite extension for builders of GraphRAG systems who prefer not to run Neo4j just to store a knowledge graph. It adds Cypher query support, so entities and relationships can be stored in a graph structure and Cypher can be used for context expansion during retrieval. By integrating with sqlite-vec for vector search, GraphQLite provides a complete embedded RAG stack within a single database file. It also includes graph algorithms such as PageRank and community detection, which help identify key entities and cluster related concepts; a hypothetical usage sketch follows the link below. This extension is particularly useful for developers looking for a streamlined way to manage graph data. This matters because it offers a lightweight, integrated alternative for handling complex graph data without the overhead of an additional database system.

    Read Full Article: GraphQLite: Embedded Graph Database with SQLite
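
    To make the idea concrete, here is a hypothetical sketch of using such an extension from Python; the extension filename and the cypher() table-valued function are assumptions, not GraphQLite's documented API. Only sqlite3's extension-loading calls are standard:

      # Hypothetical usage sketch; entry-point names are placeholders.
      import sqlite3

      conn = sqlite3.connect("knowledge.db")
      conn.enable_load_extension(True)       # standard sqlite3 API (requires a build that allows extensions)
      conn.load_extension("./graphqlite")    # assumed path to the compiled extension
      conn.enable_load_extension(False)

      # Assumed Cypher entry point for context expansion around an entity:
      rows = conn.execute(
          "SELECT * FROM cypher('MATCH (e:Entity {name: \"ONNX\"})-[r]->(n) RETURN n')"
      ).fetchall()
      print(rows)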

  • AIfred Intelligence: Self-Hosted AI Assistant


    I built AIfred-Intelligence - a self-hosted AI assistant with automatic web research and multi-agent debates (AIfred with an upper-case "i" instead of a lower-case "L" :-)

    AIfred Intelligence is a self-hosted AI assistant built around automatic web research and multi-agent debates. It autonomously runs web searches, scrapes sources, and cites them without manual input, and it can debate a question through three AI personas: AIfred the scholar, Sokrates the critic, and Salomo the judge (a sketch of this debate pattern follows the link below). Users can customize system prompts and choose from several discussion modes for dynamic, contextually rich conversations. The platform also offers vision/OCR tools, voice interfaces, and internationalization, all running locally with extensive customization options for large language models. This matters because it demonstrates how a local AI stack can autonomously perform complex research tasks and facilitate nuanced discussions, enhancing productivity and decision-making.

    Read Full Article: AIfred Intelligence: Self-Hosted AI Assistant
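
    The debate pattern itself is easy to picture. The following is only a sketch of the scholar/critic/judge loop, not AIfred's code; it assumes a local OpenAI-compatible endpoint (here Ollama's) and an arbitrary model name:

      # Sketch of a three-persona debate round against a local OpenAI-compatible server.
      import requests

      API = "http://localhost:11434/v1/chat/completions"   # assumption: Ollama's OpenAI-compatible endpoint
      MODEL = "llama3.1"                                    # assumption: whatever local model is installed

      def ask(system: str, prompt: str) -> str:
          r = requests.post(API, json={
              "model": MODEL,
              "messages": [{"role": "system", "content": system},
                           {"role": "user", "content": prompt}],
          })
          return r.json()["choices"][0]["message"]["content"]

      question = "Should we self-host our team's RAG stack?"
      answer = ask("You are AIfred, a careful scholar. Answer thoroughly.", question)
      critique = ask("You are Sokrates, a relentless critic. Point out weaknesses in this answer.", answer)
      verdict = ask("You are Salomo, a fair judge. Weigh the answer against the critique and conclude.",
                    f"Answer:\n{answer}\n\nCritique:\n{critique}")
      print(verdict)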

  • Optimizing a 6700 XT GPU with ROCm and Open WebUI


    For those with a 6700XT GPU (gfx1031) - ROCM - Openweb UI

    For anyone running a 6700 XT GPU (gfx1031) who wants to use it with ROCm and Open WebUI, a custom configuration has been shared; the poster built the system with help from Google AI Studio. The setup requires Python 3.12.x for ROCm, with text generation running on ROCm 7.1.1 and image generation using a rocBLAS build from ROCm 6.4.2. Services are started automatically at boot by batch files and run in the background, so everything is reachable through Open WebUI. The approach avoids Docker to conserve resources and achieves 22-25 t/s on ministral3-14b-instruct Q5_XL with a 16k context, and a similar custom build also runs stable-diffusion.cpp successfully. The configuration is shared so others can reproduce similar performance gains; a hypothetical launcher sketch follows the link below. This matters because it provides a practical guide for tuning a specific GPU setup, potentially improving performance and efficiency for users with similar hardware.

    Read Full Article: Optimizing a 6700 XT GPU with ROCm and Open WebUI
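
    The no-Docker launcher could be approximated in Python as below; binary names, model files, ports, and flags are assumptions rather than the poster's actual batch files:

      # Hypothetical launcher that starts the backends in the background at boot.
      import subprocess

      services = [
          # llama.cpp server for text generation (assumed model file and flags)
          ["llama-server", "--model", "ministral-14b-q5.gguf", "--ctx-size", "16384", "--port", "8080"],
          # Open WebUI front end installed via pip
          ["open-webui", "serve", "--port", "3000"],
      ]

      procs = [subprocess.Popen(cmd) for cmd in services]  # launch both without blocking
      for p in procs:
          p.wait()  # keep the launcher alive so the services stay up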

  • Transcribe: Local Audio Transcription with Whisper


    Transcribe: local Whisper transcription (GUI + CLI) with diarization, timestamps, optional Ollama

    Transcribe (tx) is a free desktop and CLI tool for local audio transcription with Whisper, capturing audio from files, the microphone, or system audio and producing timestamped transcripts with speaker diarization. It offers several modes: file mode for transcribing WAV files, mic mode for live microphone capture, and speaker mode for capturing system audio with optional microphone input. The tool is offline-friendly, running locally after the initial model download, and supports optional summaries via Ollama models. It is cross-platform (Windows, macOS, and Linux) and automation-friendly, with CLI support for batch processing and repeatable workflows; a minimal transcription sketch follows the link below. This matters because it provides a versatile, privacy-focused way to transcribe and analyze audio without relying on cloud services.

    Read Full Article: Transcribe: Local Audio Transcription with Whisper
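
    For comparison, local Whisper transcription with timestamps takes only a few lines; this is not Transcribe's source, just a sketch using the openai-whisper package, and it omits diarization:

      # Minimal local transcription sketch with timestamped segments (no diarization).
      import whisper

      model = whisper.load_model("base")        # fetched once, then everything runs offline
      result = model.transcribe("meeting.wav")  # placeholder audio file

      for seg in result["segments"]:
          print(f"[{seg['start']:7.2f} -> {seg['end']:7.2f}] {seg['text'].strip()}")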

  • Comprehensive AI/ML Learning Roadmap


    Sharing This Complete AI/ML Roadmap

    A comprehensive AI/ML learning roadmap has been developed to guide learners from beginner to advanced levels using only free resources. This structured path addresses common issues with existing roadmaps, such as being too shallow, overly theoretical, outdated, or fragmented. It begins with foundational knowledge in Python and math, then progresses through core machine learning, deep learning, LLMs, NLP, generative AI, and agentic systems, with each phase including practical projects to reinforce learning. The roadmap is open for feedback to ensure it remains a valuable and accurate tool for anyone serious about learning AI/ML without incurring costs. This matters because it democratizes access to quality AI/ML education, enabling more individuals to develop skills in this rapidly growing field.

    Read Full Article: Comprehensive AI/ML Learning Roadmap

  • Infer: A CLI Tool for Piping into LLMs


    made a simple CLI tool to pipe anything into an LLM. that follows unix philosophy.

    Infer is a command-line tool that lets users pipe command output directly into a large language model (LLM) for analysis, much as grep is used for text search. By talking to OpenAI-compatible APIs, users can ask questions about their command output, such as identifying processes consuming RAM or checking for hardware errors, without manually copying and pasting logs. The tool is lightweight, consisting of fewer than 200 lines of C, and it outputs plain text, which makes it practical for debugging and command recall; a sketch of the same pattern appears below. This matters because it simplifies interaction with LLMs, improving productivity and efficiency for command-line tasks.

    Read Full Article: Infer: A CLI Tool for Piping into LLMs
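
    The pattern is easy to reproduce. Since the original is written in C, here is a Python sketch of the same idea (read stdin, send it with a question to an OpenAI-compatible endpoint, print plain text); the environment variable names are made up for the example:

      # Usage sketch:  dmesg | python infer_sketch.py "any hardware errors here?"
      import json, os, sys, urllib.request

      question = sys.argv[1] if len(sys.argv) > 1 else "Explain this output."
      piped = sys.stdin.read()

      req = urllib.request.Request(
          os.environ.get("INFER_API", "https://api.openai.com/v1/chat/completions"),
          headers={"Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
                   "Content-Type": "application/json"},
          data=json.dumps({
              "model": os.environ.get("INFER_MODEL", "gpt-4o-mini"),
              "messages": [{"role": "user", "content": f"{question}\n\n{piped}"}],
          }).encode(),
      )
      with urllib.request.urlopen(req) as resp:
          print(json.loads(resp.read())["choices"][0]["message"]["content"].strip())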

  • HuggingFace Model Downloader v2.3.0: Web UI & Faster Scanning


    🚀 HuggingFace Model Downloader v2.3.0 - Now with Web UI, Live Progress, and 100x Faster Scanning!

    HuggingFace Model Downloader v2.3.0 brings significant improvements for downloading models and datasets, including a new web UI for managing downloads from the browser. This version supports concurrent connections, smart resume, and filtering options for downloading specific quantizations; a comparison sketch using the official huggingface_hub library follows the link below. Notably, it adds a one-liner web mode for quick setup and dramatically faster repository scanning, cutting the time from over five minutes to roughly two seconds. These enhancements make the tool more efficient and user-friendly, particularly for large repositories. This matters because the updates streamline downloading and managing machine learning models, saving time for developers and researchers.

    Read Full Article: HuggingFace Model Downloader v2.3.0: Web UI & Faster Scanning
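
    For comparison, the quantization-filtering idea also exists in the official huggingface_hub library; this snippet is not the downloader's code, just the equivalent call for a single example repository:

      # Fetch only one quantization of a GGUF repo using huggingface_hub.
      from huggingface_hub import snapshot_download

      path = snapshot_download(
          repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",  # example repository
          allow_patterns=["*Q4_K_M*"],   # download only the Q4_K_M quantization files
          max_workers=8,                 # concurrent connections
      )
      print("Files saved under:", path)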

  • AI Streamlines Blogging Workflows in 2026


    Using AI to Streamline Blogging Workflows in 2026

    Advancements in AI technology have significantly enhanced the efficiency of blogging workflows by automating various aspects of content creation. AI tools are now capable of generating outlines and content drafts, optimizing posts for search engines, suggesting keywords and internal linking opportunities, and tracking performance to improve content quality. These innovations allow bloggers to focus more on creativity and strategy while AI handles the technical and repetitive tasks. This matters because it demonstrates how AI can transform content creation, making it more accessible and efficient for creators.

    Read Full Article: AI Streamlines Blogging Workflows in 2026

  • Physician’s 48-Hour NLP Journey in Healthcare AI


    [P] Physician → NLP in 48 hours: Building a clinical signal extraction pipeline during my December break

    A psychiatrist with an engineering background set out to learn natural language processing (NLP) and build a clinical signal extraction tool for C-SSRS/PHQ-9 assessments within 48 hours. Despite initial struggles with machine learning concepts and tooling, the physician produced a working prototype using rule-based methods combined with OpenAI API integration; an illustrative rule-based sketch follows the link below. The project highlighted the challenges of applying AI in healthcare, particularly the subjective and context-dependent nature of clinical instruments like the PHQ-9 and C-SSRS. The experience underscores the need to bridge clinical expertise and technical development in healthcare AI. Understanding and addressing these challenges is crucial for advancing AI's role in healthcare.

    Read Full Article: Physician’s 48-Hour NLP Journey in Healthcare AI
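
    The rule-based half of such a pipeline can be illustrated with a few keyword rules; this is not the author's code and is not clinically validated, just a sketch of the approach:

      # Naive keyword rules mapped to PHQ-9-style items, for illustration only.
      import re

      RULES = {
          "phq9_sleep":     re.compile(r"\b(can't sleep|insomnia|sleeping too much)\b", re.I),
          "phq9_anhedonia": re.compile(r"\b(no interest|lost interest|nothing is fun)\b", re.I),
          "phq9_fatigue":   re.compile(r"\b(exhausted|no energy|tired all the time)\b", re.I),
      }

      def extract_signals(note: str) -> dict:
          """Return which items fired and the matched span, for clinician review."""
          hits = {}
          for item, pattern in RULES.items():
              match = pattern.search(note)
              if match:
                  hits[item] = match.group(0)
          return hits

      note = "Patient reports being tired all the time and says nothing is fun anymore."
      print(extract_signals(note))  # {'phq9_anhedonia': 'nothing is fun', 'phq9_fatigue': 'tired all the time'}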