Blog
Democratizing AI Compute Series
Go behind the scenes of the AI industry with Chris Lattner
Latest
Modular 26.2: State-of-the-Art Image Generation and Upgraded AI Coding with Mojo
Today’s 26.2 release expands the Modular Platform’s modality support to include image generation and image editing workflows, extending our existing support for text and audio generation. Version 26.2 supports Black Forest Labs' FLUX.2 model variants, with more than a 4x speedup over the state of the art.
Structured Mojo Kernels Part 2 - The Three Pillars
This post explains the components of Structured Mojo Kernels: TileIO, TilePipeline, and TileOp. Each component forms a node in a kernel execution pipeline, and the links between them create a logical separation of concerns that makes kernels easier to extend and update. That organization matters because GPU kernels don't stay static. By abstracting hardware-optimized implementations into patterns, the same kernel structure can adapt across NVIDIA and AMD hardware generations with minimal rewriting.
Modverse #53: Community Builds, Research Milestones, and a Growing Ecosystem
This edition captures everything happening across the Modular ecosystem, from developers building with MAX and Mojo🔥 to the broader impact Modular is having across AI infrastructure. Here's a look at what's been happening lately.
Structured Mojo Kernels Part 1 - Peak Performance, Half the Code
GPU programming has always demanded precision, but the cost of that precision keeps rising. A production matmul kernel written in C++ spans 3,000–5,000 lines of tightly coupled code where a misplaced barrier silently corrupts results. That complexity gatekeeps hardware that should be available to far more developers, and it's a direct product of how GPUs have evolved: with each architecture generation, more of the orchestration burden has shifted onto the programmer.
The Claude C Compiler: What It Reveals About the Future of Software
Compilers occupy a special place in computer science. A compilers course is a canonical part of the curriculum, and building a compiler is a rite of passage. It forces you to confront how software actually works, by examining languages, abstractions, hardware, and the boundary between human intent and machine execution.
Modular 26.1: A Big Step Towards More Programmable and Portable AI Infrastructure
Today we’re releasing Modular 26.1, a major step toward making high-performance AI computing easier to build, debug, and deploy across heterogeneous hardware. This release is focused squarely on developer velocity and programmability—helping advanced AI teams reduce time to market for their most important innovations.
How to Beat Unsloth's CUDA Kernel Using Mojo—With Zero GPU Experience
Traditional GPU programming has a steep learning curve. The performance gains are massive, but the path to get there (CUDA, PTX, memory hierarchies, occupancy tuning) stops most developers before they start. Mojo aims to flatten that curve: Python-like syntax, systems-level performance, no interop gymnastics, and the same performance gains.