AI & Technology Updates

  • Four Ways to Run ONNX AI Models on GPU with CUDA


    Not One, Not Two, Not Even Three, but Four Ways to Run an ONNX AI Model on GPU with CUDA
    ONNX AI models can run on GPUs with CUDA in four distinct ways, which gives machine learning teams room to trade off flexibility against performance. The four are ONNX Runtime with the CUDA execution provider, TensorRT for optimized inference, PyTorch with its ONNX export capabilities, and the NVIDIA Triton Inference Server for scalable deployment. Each has a different strength, such as speed, ease of integration, or scalability, so the right choice depends on the deployment. Knowing the options helps in tuning AI workloads and using GPU resources efficiently.
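
    As a rough illustration of the first method only, the sketch below opens an ONNX Runtime session that prefers the CUDA execution provider and falls back to the CPU. It assumes the `onnxruntime-gpu` package is installed; the helper names `pick_providers` and `make_session` and the model path are hypothetical and do not come from the article.

```python
def pick_providers(available):
    """Return execution providers in preference order, keeping only available ones."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def make_session(model_path):
    """Open an ONNX Runtime session, using CUDA when the runtime reports it."""
    # Imported lazily so the provider logic above works without the package.
    import onnxruntime as ort  # assumption: onnxruntime-gpu is installed

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    # "model.onnx" is a placeholder path, not a file referenced by the article.
    session = make_session("model.onnx")
    print(session.get_providers())
```

    Listing the CPU provider after CUDA keeps the same script usable on machines without a GPU, at the cost of silently running slower there.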


  • Pydantic AI Durable Agent Demo


    Pydantic AI has introduced two new demos of durable agent patterns built on DBOS: "Deep Research," which demonstrates large fan-out parallel workflows, and "Twenty Questions," which illustrates long sequential subagent chaining. Both highlight the value of durable execution, which lets agents survive crashes or interruptions and resume exactly where they left off. The workflows are fully observable in the DBOS console, with detailed workflow graphs and management tools, and are instrumented with Logfire to trace token usage and cost per step. This matters because it showcases techniques for building resilient AI systems that can handle complex tasks over extended periods.
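
    The idea of resuming where a run left off can be sketched in plain Python. This is not the DBOS or Pydantic AI API: `DurableRun`, `ask_questions`, and the JSON checkpoint file are hypothetical names chosen for illustration. The sketch assumes each step returns a JSON-serializable value and is safe to retry.

```python
import json
import os
import tempfile


class DurableRun:
    """Persist each step's result to a JSON file so a rerun skips finished steps."""

    def __init__(self, path):
        self.path = path
        self.done = {}
        if os.path.exists(path):
            with open(path) as f:
                self.done = json.load(f)

    def step(self, name, fn):
        # A step recorded by an earlier run is replayed from the checkpoint.
        if name in self.done:
            return self.done[name]
        result = fn()
        self.done[name] = result
        # Simplification: a real system would write the checkpoint atomically.
        with open(self.path, "w") as f:
            json.dump(self.done, f)
        return result


def ask_questions(run, questions, answer):
    """Run a sequential chain of steps, one per question."""
    return [run.step(f"q{i}", lambda q=q: answer(q)) for i, q in enumerate(questions)]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "run.json")
    first = ask_questions(DurableRun(path), ["a", "b"], str.upper)
    # A second run over the same checkpoint file reuses the stored answers.
    second = ask_questions(DurableRun(path), ["a", "b"], str.upper)
    print(first == second)
```

    If the process dies partway through the chain, constructing a new `DurableRun` on the same file and calling `ask_questions` again re-executes only the steps that never completed.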


  • Boosting GPU Utilization with WoolyAI’s Software Stack


    Co-locating multiple jobs on GPUs with deterministic performance for a 2-3x increase in GPU Utilization
    Traditional GPU job orchestration often leads to underutilization because the one-job-per-GPU approach leaves GPU resources idle whenever a job does not fully saturate the device. WoolyAI's software stack addresses this by running multiple jobs concurrently on a single GPU with deterministic performance, dynamically managing the GPU's streaming multiprocessors (SMs) to keep them fully used. The stack also supports running machine learning jobs on CPU-only infrastructure by executing their kernels remotely on a shared GPU pool, and it lets existing CUDA PyTorch jobs run on AMD hardware without modification. This matters because higher GPU utilization can reduce costs and improve performance in computational tasks.


  • Ubisoft Shuts Down ‘Rainbow Six Siege’ Servers After Hack


    Ubisoft shuts down ‘Rainbow Six Siege’ servers following hack
    Ubisoft has temporarily shut down the servers and marketplace for Rainbow Six Siege following a significant security breach. Hackers gained control over critical game functions, including the ability to ban and unban users, send custom messages, unlock all in-game items, and distribute 2 billion R6 Credits and Renown to players. The cash value of these credits is approximately $13.33 million, but Ubisoft has assured players that no one will be penalized for spending them. However, any transactions made after a specific time will be reversed to prevent exploitation. This matters because it highlights the vulnerabilities in gaming systems and the potential financial impact of such breaches.


  • MayimFlow: Preventing Data Center Water Leaks


    MayimFlow wants to stop data center leaks before they happen
    MayimFlow, a startup founded by John Khazraee, aims to prevent water leaks in data centers before they occur, using IoT sensors and machine learning models to provide early warnings. Data centers consume significant amounts of water, and even minor leaks can lead to costly downtime and disruptions. Khazraee, whose background is in infrastructure at major tech companies, has assembled a team experienced in data centers and water management to tackle the problem. The company plans to extend its leak detection beyond data centers to other sectors such as commercial buildings and hospitals, reflecting the growing importance of water management across industries. This matters because proactive leak detection can save companies significant resources and prevent disruptions to critical operations.