◆ CUDA compatibility & full-stack execution

The world's AI software, running unchanged.

Nearly the entire AI ecosystem runs on NVIDIA's CUDA. OxPython lets that software run, as written, on other hardware. We proved it on hardware we did not build: Tenstorrent silicon. On OXMIQ's own OxCore processor, OxPython goes further, optimizing and running the model directly on the core.

Runs unmodified: PyTorch · vLLM · Hugging Face

inference.py

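import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-model")
model = AutoModelForCausalLM.from_pretrained("your-model")
model = model.to("cuda")        # written for NVIDIA

prompt = "Explain CUDA in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")  # generate() takes tensors, not a raw string
output = model.generate(**inputs)   # it just runs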

Running on Tenstorrent · no code changes

The code does not change. What it runs on does.

Verified targets: Tenstorrent · OxCore (FPGA)

01 The challenge, and the opportunity in it

Nearly every AI model in the world is built and tuned to run on NVIDIA hardware, using NVIDIA's software platform, CUDA. Over nearly two decades, the whole industry has been built on top of it.

Conventionally that is framed as a trap. Even when better, cheaper, or more available hardware arrives, customers stay stuck. Their software only knows how to talk to NVIDIA, and moving it means a rewrite that can take years. So the better hardware never gets a fair hearing.

This is the wall every alternative-silicon company runs into. Most try to climb it one of two slow ways: rebuild the whole software ecosystem from scratch, or chase NVIDIA's platform feature by feature, always a step behind.

And the gravity is not going away. With the vast majority of the installed base on NVIDIA, that is where the AI ecosystem's innovation keeps landing first. New models, new techniques, new libraries are built and tuned for CUDA before anything else, and that momentum is likely to hold for years.

So choosing to target CUDA-style compute is not settling for the past. It is aligning with where the ecosystem will keep moving. A team picking hardware today for a product that ships a couple of years out is, realistically, picking into a CUDA-shaped world.

The same gravity that looks like lock-in is also a vast, durable foundation of software, skills, and momentum. The opportunity is to stop fighting it and start using it: take everything the world has already built for CUDA and set it free to run on better hardware. That is the opportunity OXMIQ was built to capture.

02 What OxPython is

More than a compatibility layer.
A path that climbs.

OxPython meets AI software exactly where it is. To a program, it looks and behaves like the NVIDIA GPU the program expects, while underneath it routes the real work to whatever hardware is actually there. Compatibility is the on-ramp. It is also only the first step.

Heterogeneous runtime: model.py (PyTorch, JAX) runs through the OxPython runtime — same code, any backend — executing unmodified on NVIDIA, AMD, Intel and custom silicon.
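The idea of presenting the device a program expects while routing work to the hardware that is actually there can be pictured with a toy sketch. Everything in it is hypothetical and invented for illustration: `DeviceShim`, `Backend`, and `reference_matmul` are not OxPython APIs, and the plain-Python matrix multiply stands in for a real hardware kernel. It only shows the shape of the idea, assuming the application asks for "cuda" and a layer underneath decides where the work lands.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Matrix = List[List[float]]


def reference_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix multiply, standing in for a real hardware kernel."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


@dataclass
class Backend:
    """Hypothetical description of one hardware target."""
    name: str
    matmul: Callable[[Matrix, Matrix], Matrix]


class DeviceShim:
    """Hypothetical shim: accepts the device name the app expects ("cuda")
    and forwards the work to whichever backend is currently selected."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self._active: str = ""
        self.call_log: List[str] = []  # records which backend served each call

    def register(self, backend: Backend) -> None:
        self._backends[backend.name] = backend

    def select(self, name: str) -> None:
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name}")
        self._active = name

    def matmul(self, device: str, a: Matrix, b: Matrix) -> Matrix:
        if device != "cuda":
            raise ValueError(f"this sketch only intercepts 'cuda', got {device!r}")
        if not self._active:
            raise RuntimeError("no backend selected")
        self.call_log.append(self._active)
        return self._backends[self._active].matmul(a, b)


def app(shim: DeviceShim) -> Matrix:
    """Application code: written once against "cuda", never edited."""
    return shim.matmul("cuda", [[1, 2], [3, 4]], [[5, 6], [7, 8]])


shim = DeviceShim()
shim.register(Backend("tenstorrent", reference_matmul))
shim.register(Backend("oxcore", reference_matmul))

# Run the identical app against each target; only the selection changes.
results = {}
for target in ("tenstorrent", "oxcore"):
    shim.select(target)
    results[target] = app(shim)
```

The `app` function never changes between runs. Only the `select` call does, which is the property the section describes: the code stays the same, and what it runs on varies.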

03 The hierarchy of goodness

Gets better with each step.

01 CUDA compatibility

Run the world's AI, unchanged.

AI built for NVIDIA runs as written, with no porting and no rewrite. The industry's investment in the CUDA ecosystem carries forward instead of being thrown away. This is the on-ramp, and the proof.

Where: Tenstorrent hardware

Your AI app speaks CUDA. OxPython presents the NVIDIA interface it expects and routes the real work to Tenstorrent silicon. The app runs unchanged.

Proof: running today on Tenstorrent silicon, no code changes

OxPython translating CUDA on Tenstorrent: your model.py flows through OxPython to the Tenstorrent AI accelerator, with no code changes

Demos · running on Tenstorrent

Text Generation: Llama-3.2-1B Instruct · Mathematical Reasoning: DeepSeek R1 Distill · Image Classification: EfficientNet B0 · Video Generation: CogVideoX-2B

02 CUDA performance

The same code, faster, all the way down.

On OxCore, OxPython does far more than translate. It orchestrates the hardware, optimizes how work is scheduled and executed, and runs the model directly on the core. Compatibility and performance together, rather than a trade-off between them.

Where: OxCore

On OxCore, OxPython is the native path. The same unchanged app is orchestrated and optimized across the three engines, OxTEN, OxSIMT, and OxORC. Compatibility and optimization in one path.

Proof: OxCore running today in FPGA, with OxPython as its native, optimizing path

Watch the demo →

OxPython on OxCore — model.py flows through OxPython to OxCore's three engines: OxTEN, OxSIMT, and OxORC

OxCore is OXMIQ's processor for AI, built from three engines. Learn more about OxCore →

03 A fundamental step up

A new class of capability.

Once software and silicon are this tightly joined, OXMIQ-native solutions can fully exploit running the model on the core itself, opening capability that no compatibility layer alone could reach. This is where OxPython stops being a portability story and becomes the delivery vehicle for OXMIQ's founding vision.

Where: OxCore

The OxORC orchestrator drives the model down through the stack and runs it on the core itself. This is the "all the way down" step, where OXMIQ-native solutions reach capability a compatibility layer alone cannot.

Direction: From Atoms to Agents, Raja Koduri's thesis for OXMIQ

OxPython on OxCore — model flows through OxPython and OxORC down to OxCore silicon

Steps one and two run today. Step three is the direction the first two make possible: the world's AI in the door, OXMIQ's own solutions taking it the rest of the way.

04 Why it matters

The value to customers and partners.

No porting, no rewrite

Your existing AI stack runs as is. The cost and risk of switching hardware largely disappear.

Freedom of choice

Your software stops being tied to one vendor's hardware. You get real optionality on price, availability, and performance.

It runs the real ecosystem

The mainstream inference stacks teams actually use, including PyTorch, vLLM, and Hugging Face, work out of the box.

And then it goes deeper

On OxCore, the same software is optimized and executed all the way down into the silicon. Compatibility and performance, in one path.

05 Built with and for AI-accelerated developers

Familiar foundations. The control to go deep.

OxPython meets developers where they already are. It is built on the technologies AI developers know and trust, including CUDA, Triton, and PyTorch, so there is no new language or mental model to adopt. The skills the world already has carry straight over.

It is also built for how development actually happens now: AI-accelerated, with developers and AI assistants building side by side. OxPython gives them, and the AI working alongside them, both the familiar foundations and the low-level control to build highly optimized solutions, fast. Easy to start with, deep enough to go all the way down.

CUDA · Triton · PyTorch

Familiar technologies. The skills to go deep. Built for AI-accelerated developers.

◆ The world's AI software, running anywhere

Request a demo →