Every time you ask ChatGPT a question, your request triggers a data relay race. Data leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back. That entire journey repeats for every single word the AI generates.
The bottleneck is structural: every single request is routed through some of the most expensive and power-hungry chips in the industry. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve. The four-year-old startup has designed a chip that places compute capabilities much closer to DRAM (the fast, short-term memory chips that store data a processor is actively using), allowing routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.
If it works at scale, the implications for AI infrastructure costs could be significant, which largely explains investor enthusiasm across the nation. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million.
XCENA CEO Jin Kim co-founded the startup in 2022 alongside CTO Dohun Kim and CPO Harry Juhyun Kim, all veterans of Samsung and SK Hynix, the memory giants that supply chips powering Nvidia’s GPUs. “CPUs and GPUs have both gotten smarter over the decades. Memory never did. XCENA wants to change that,” Kim said in an interview with TechCrunch. “The recent rise in memory prices and related stocks points to a broader shift in AI infrastructure toward memory-centric architectures,” he added. (This month, the three companies that dominate the global memory chip market, Samsung, SK Hynix, and Micron, each crossed a trillion-dollar valuation for the first time.)
XCENA is betting its business on the thesis that “inference isn’t just a compute problem; it’s increasingly a memory scaling problem,” said Kim.
XCENA’s chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory, and processes data before it ever needs to leave the memory module. It brings compute to the data, not the other way around. The company claims that what used to require 10 servers could potentially run on just one.
“While GPUs excel at matrix multiplication, the heavy math behind AI model training, much of the surrounding data orchestration, including preprocessing, KV cache management [the system that stores prior conversation context so a model doesn’t have to reprocess it], and data caching, still runs on CPUs. Our chip handles those tasks directly within the memory module itself,” Kim said.
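To make the "bring compute to the data" idea concrete, here is a minimal toy model in Python. It is not XCENA's API or a description of how the MX1 works. The function names, the 64-byte record size, and the 1% match rate are all assumptions chosen for illustration. The sketch only counts how many bytes would cross the memory-to-host link when a simple filter runs on the host versus next to the memory.

```python
from dataclasses import dataclass

RECORD_BYTES = 64  # assumed record size, for illustration only


@dataclass
class TransferReport:
    bytes_over_link: int  # bytes that crossed the memory-to-host link
    matches: int          # records that satisfied the predicate


def host_side_filter(records, predicate):
    """Conventional path: every record is shipped to the host, then filtered."""
    moved = len(records) * RECORD_BYTES
    kept = [r for r in records if predicate(r)]
    return TransferReport(bytes_over_link=moved, matches=len(kept))


def near_memory_filter(records, predicate):
    """Hypothetical near-memory path: the filter runs beside the data,
    so only the matching records are shipped to the host."""
    kept = [r for r in records if predicate(r)]
    return TransferReport(bytes_over_link=len(kept) * RECORD_BYTES,
                          matches=len(kept))


def is_hit(record):
    # Assumed selectivity: 1 record in 100 matches.
    return record % 100 == 0


records = list(range(100_000))
host = host_side_filter(records, is_hit)
near = near_memory_filter(records, is_hit)

print(f"host-side filter:   {host.bytes_over_link:,} bytes over the link")
print(f"near-memory filter: {near.bytes_over_link:,} bytes over the link")
```

Both paths find the same records. The gap in bytes moved is entirely a product of the assumed match rate, so it says nothing about real MX1 performance or the company's 10-to-1 server claim.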
Demand for memory solutions has surged since the second half of last year, and the company believes the timing is working in its favor.
Conversations with several global memory vendors are in early stages, though Kim declined to name them. The company’s ideal customers are hyperscalers spending tens of billions a year on AI infrastructure, where even a small gain in memory efficiency can mean hundreds of millions in savings.
The MX1 is still a prototype. Mass production chips are scheduled to roll off Samsung’s foundry lines by the end of 2026, with the company expecting to generate revenue starting in 2027.
While neural processing unit (NPU) makers are competing to challenge Nvidia for training workloads, XCENA is targeting the memory-intensive layer that sits beneath it all.
XCENA’s closest rivals include Astera Labs and Marvell, both Nasdaq-listed companies working on next-generation memory connectivity. Marvell is a large, established player already operating in the same space, Kim said, adding that the differentiator comes down to intellectual property. “We have thousands of cores,” Kim said. Based on public specifications, Marvell’s approach relies on a handful of general-purpose cores by comparison.
These cores are built on RISC-V, an open-source chip design blueprint, and optimized specifically for data processing, with each core deliberately kept small and efficient. Beyond the cores themselves, XCENA designs its own internal memory hierarchy, interconnect bus, and DRAM controller. That is a level of vertical integration most chip companies, including larger rivals, do not attempt; they typically outsource those components.
Seoul-based VC firms Altinum and IMM Investment co-led the Series B round, along with Corstone Asia and existing investors SBI Investment and Mirae Asset Capital. The company, which has more than 90 staff across offices in Pangyo, a tech hub outside Seoul, and in Sunnyvale, is also in conversations with international investors about additional funding.

