Preview: Tweaked Geek: Practical AI Tech

Four Ways to Run ONNX AI Models on GPU with CUDA

ONNX AI models can run on GPUs with CUDA in four distinct ways: ONNX Runtime with the CUDA execution provider, TensorRT for optimized inference, PyTorch with its ONNX export capabilities, and the NVIDIA Triton Inference Server for scalable deployment. Each approach has its own strength, such as speed, ease of integration, or scalability, so the right choice depends on the deployment. Understanding these options is crucial for optimizing AI workloads and making efficient use of GPU resources.
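As a rough illustration of the first option, the sketch below picks an ONNX Runtime execution provider and opens a session. The preference order (TensorRT, then CUDA, then CPU), the helper names `pick_providers` and `open_session`, and the model path are assumptions made for this example, not details from the article. It also assumes a GPU-enabled build of the `onnxruntime` package is installed.

```python
# Preference order is an assumption for illustration:
# try TensorRT first, then CUDA, then fall back to CPU.
PREFERRED = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are actually available, in priority order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError("no supported ONNX Runtime execution provider found")
    return chosen


def open_session(model_path):
    """Open an inference session on the best available provider (hypothetical helper)."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Calling `open_session("model.onnx")` on a machine without a GPU build would fall back to the CPU provider rather than fail, which makes the same script usable in both environments.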

Posted on Dec 28, 2025 by TweakedGeek in Deep Dives, Tools

Topics: machine learning, AI models, PyTorch

Pydantic AI Durable Agent Demo

Pydantic AI has introduced two new demos showcasing durable agent patterns using DBOS: one demonstrating large fan-out parallel workflows called "Deep Research," and the other illustrating long sequential subagent chaining known as "Twenty Questions." These demos highlight the importance of durable execution, allowing agents to survive crashes or interruptions and resume precisely where they left off. The execution of these workflows is fully observable in the DBOS console, with detailed workflow graphs and management tools, and is instrumented with Logfire to trace token usage and cost per step. This matters because it showcases advanced techniques for building resilient AI systems that can handle complex tasks over extended periods.
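The core idea of durable execution can be sketched in a few lines: record each step's result as soon as it completes, and skip already-recorded steps on restart. The toy class below is only an illustration under that assumption. It does not use the DBOS or Pydantic AI APIs, and the names `DurableRun`, `step`, and `demo` are hypothetical.

```python
import json
import os
import tempfile


class DurableRun:
    """Toy durable workflow: each named step's result is checkpointed to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.done = {}
        # On startup, reload any results saved by a previous (possibly crashed) run.
        if os.path.exists(path):
            with open(path) as f:
                self.done = json.load(f)

    def step(self, name, fn):
        """Run fn once per step name; on a rerun, return the saved result instead."""
        if name in self.done:
            return self.done[name]
        result = fn()
        self.done[name] = result
        # Write to a temp file and rename so a crash never leaves a half-written checkpoint.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.done, f)
        os.replace(tmp, self.path)
        return result


def demo():
    """Simulate a crash after the first step and a resume from the checkpoint."""
    path = os.path.join(tempfile.mkdtemp(), "run.json")
    calls = []

    def ask(question):
        calls.append(question)
        return f"answer to {question}"

    first = DurableRun(path)
    first.step("q1", lambda: ask("q1"))

    # A fresh object stands in for the restarted process.
    resumed = DurableRun(path)
    resumed.step("q1", lambda: ask("q1"))  # skipped: result is already checkpointed
    resumed.step("q2", lambda: ask("q2"))
    return calls
```

In this sketch `demo()` calls `ask` only once per question even though the first step is requested twice, which is the property that keeps a resumed agent from repeating expensive model calls.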

Posted on Dec 28, 2025 by NoiseReducer in Deep Dives, Tools

Topics: AI systems, workflow management, resilience

Boosting GPU Utilization with WoolyAI’s Software Stack

Traditional GPU job orchestration assigns one job per GPU, which leaves resources idle whenever a job does not saturate the device. WoolyAI's software stack addresses this by running multiple jobs concurrently on a single GPU with deterministic performance, dynamically managing the GPU's streaming multiprocessors (SMs) to keep them fully utilized. The stack also supports running machine learning jobs on CPU-only infrastructure by executing their kernels remotely on a shared GPU pool, and it lets existing CUDA PyTorch jobs run on AMD hardware without modification. This matters because higher GPU utilization can reduce costs and improve performance in computational tasks.

Topics: machine learning, GPU efficiency, GPU utilization

Ubisoft Shuts Down ‘Rainbow Six Siege’ Servers After Hack

Ubisoft has temporarily shut down the servers and marketplace for Rainbow Six Siege following a significant security breach. Hackers gained control over critical game functions, including the ability to ban and unban users, send custom messages, unlock all in-game items, and distribute 2 billion R6 Credits and Renown to players. The cash value of these credits is approximately $13.33 million, but Ubisoft has assured players that no penalties will be imposed for using them. However, any transactions made after a specific time will be reversed to prevent exploitation. This matters because it highlights the vulnerabilities in gaming systems and the potential financial implications of such security breaches.

Posted on Dec 28, 2025 by PracticalAI in Commentary, News, Security

Topics: Cybersecurity

MayimFlow: Preventing Data Center Water Leaks

MayimFlow, a startup founded by John Khazraee, aims to prevent water leaks in data centers before they occur, using IoT sensors and machine learning models to provide early warnings. Data centers, which consume significant amounts of water, face substantial risks from even minor leaks, potentially leading to costly downtime and disruptions. Khazraee, with a background in infrastructure for major tech companies, has assembled a team experienced in data centers and water management to tackle this challenge. The company envisions expanding its leak detection solutions beyond data centers to other sectors like commercial buildings and hospitals, emphasizing the growing importance of water management in various industries. This matters because proactive leak detection can save companies significant resources and prevent disruptions in critical operations.