One of the things I value most about working at Oxmiq is the opportunity to learn from Raja Koduri. Recently, he sat down with Salah Nasri on the Semiconductor Leadership Podcast, and the talk felt like a masterclass on the business strategy behind Oxmiq Labs.
The problem — AI infrastructure is hitting a wall
Raja opened by framing the scale of this challenge. A single hyperscale AI data center now costs approximately $50 billion. The silicon budget alone, for the most popular GPU, is approximately $30–35 billion per gigawatt, and energy, power delivery, networking, and cooling add another $10–15 billion.
As AI demand accelerates, these costs will rise with it, and the industry cannot build fast enough to keep up. Raja emphasized that utilization, meaning more useful work out of every transistor, will determine the winners and losers over the next five years.
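As a rough check on the figures above, the sketch below adds the two quoted ranges. It assumes both ranges are per gigawatt and simply additive, which the talk does not state explicitly; the variable and function names are illustrative only.

```python
# Quoted ranges in billions of US dollars (assumption: both are per gigawatt).
silicon = (30, 35)          # GPU silicon budget
infrastructure = (10, 15)   # energy, power delivery, networking, cooling

def add_ranges(a, b):
    """Add two (low, high) ranges endpoint by endpoint."""
    return (a[0] + b[0], a[1] + b[1])

total = add_ranges(silicon, infrastructure)
print(f"Estimated total: ${total[0]}-{total[1]} billion")
```

Under these assumptions the total lands between $40 and $50 billion, which is in line with the roughly $50 billion quoted for a single data center.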
OxCore — a unified architecture
At the center of our platform is OxCore. Raja describes it as "the new computing core." It encapsulates scalar, vector, and matrix computing into one unified architecture. Today, most systems treat these as separate abstractions: CPU-style scalar processing, GPU-style parallel vector processing, and tensor/TPU-style matrix math. OxCore unifies all three.
"You can think of it as a single core that encapsulates CPU, GPU, and TPU into one unified architecture. This design philosophy improves utilization and simplifies how developers think about execution."
— Raja Koduri
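To make the three compute styles concrete, here is a minimal plain-Python sketch of a scalar, a vector, and a matrix operation. It illustrates only the abstractions named above, not how OxCore executes them; the function names are hypothetical.

```python
def scalar_fma(a, x, b):
    """Scalar (CPU-style): one multiply-add on single values."""
    return a * x + b

def vector_fma(a, xs, bs):
    """Vector (GPU-style): the same multiply-add applied to every element."""
    return [a * x + b for x, b in zip(xs, bs)]

def matmul(A, B):
    """Matrix (tensor/TPU-style): each row of A dotted with each column of B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

print(scalar_fma(2, 3, 1))                          # 7
print(vector_fma(2, [1, 2, 3], [1, 1, 1]))          # [3, 5, 7]
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]
```

The vector form repeats the scalar operation across elements, and each matrix entry is a sum of such multiply-adds.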
Chiplet quilting and the need for real standards
Raja noted that chiplets are often discussed as the future of scalable silicon, but as he put it: "Everyone who is super excited about chiplets are the ones who haven't done them yet." He speaks from experience, having driven chiplet integration from AMD's HBM1 all the way through Intel's 47-chiplet data center GPU, Ponte Vecchio.
"You doing a chiplet and me doing a chiplet and expecting them to come together and work, if you just have a standard? No, no, no. Even within my own team at Intel, there were challenges."
— Raja Koduri
Our chiplet quilting approach recognizes that plug-and-play integration requires more than innovation. It demands standardization, disciplined execution, and continual validation.
From agents to atoms
This part of the conversation stood out most. Raja called it "probably the most visionary or most profound thing" in the entire discussion.
The idea: there are layers of abstraction (programming languages, frameworks, drivers, runtimes) that sit between an AI agent generating work and the silicon executing it. Those layers are built by humans, for humans. But agents aren't humans.
"They don't need to talk through Python, C, all these intermediate languages. We created them for humans to program. But when it's an agent generating work, there will be new, more efficient forms of communication, where the agent can talk to what I call nano-agents in silicon directly."
— Raja Koduri
"You can express an entire inference model in a single page of math equations. Why am I breaking that down into tens of thousands of lines of code? What if the future hardware just talks math?"
— Raja Koduri
Why does Oxmiq license GPU IP?
Unlike traditional fabless companies that design and sell their own chips, Oxmiq provides licensable IP so others can build chips tailored to their specific needs.
"There is ARM for CPUs. But there is not ARM for GPUs. Anyone can license our IP and build a chip. That's the problem we're trying to solve."
— Raja Koduri
OxCapsule beta: reducing developer friction
We launched the OxCapsule public beta in November 2025 with V1.0 for Windows and Mac, and released V2.2 with Linux client support in December 2025. We're currently releasing updates monthly.
So far, the beta has attracted participation from ARM, AMD, Intel, Infineon, GlobalFoundries, Tenstorrent, Radisys, and universities including Boston University, NYU, Texas A&M, IIT Hyderabad, and the University of Utah.