Light, the Grid, and the British Company That Might Break the AI Energy Trade

The Sunday Signal | Issue #55 | Week 21

Sunday, 24 May 2026.

This issue is also available as a podcast. Listen on Spotify, Apple Podcasts or YouTube and tell me what you think.

The full interview with Dr Xianxin Guo, CEO of Lumai, is published separately on Substack today. Read it here.

Bottom Line Up Front

The whole AI story this decade has rested on a single assumption. That the machine is fixed, the energy bill is rising, and the only choice left is whether to build more power stations or accept the limit. That assumption is wrong, and a company in Oxford has the receipts to prove it. The implications run further than the data centre. They reach the trillions of dollars currently sitting on the bet that AI will burn through everything we can generate, and they reach the British question this newsletter has been worrying at for months: can we, this time, keep the thing we invented.

The AI energy crisis is a physics problem in an economics costume

Every argument about AI and power ends in the same place. Demand doubles by 2030. The grid cannot carry it. Build more, burn more, and hope the curve bends. That is a supply argument. It assumes the machine is fixed and the only variable is how much we feed it.

It is the wrong argument. The energy problem is not really about supply. It is about what we built the machine out of.

Electrons resist. That is the whole story

A chip computes by pushing electrons through transistors and switching them on and off billions of times a second. Every switch costs energy. Moving the data between memory and processor costs more energy than the sums themselves. Almost all of it leaves as heat. That is why a modern AI data centre spends a punishing share of its power not on thinking but on staying cool. The electron is a workhorse that throws off heat with every step. We have spent sixty years making it take smaller steps. We are running out of room.

Photons do not behave that way. Light moving through an optical component does not dissipate heat the way a switching transistor does, and it can carry many calculations at once, in parallel, through space and across wavelengths. The heaviest single job in AI inference is matrix multiplication, the same arithmetic over and over at vast scale. That job maps almost exactly onto what light does without being forced. The dominant power cost in this kind of optical system is not the optical multiplication itself, but the conversion between the electrical and optical domains at the edges. The efficiency lever, then, is to do as much maths as possible per conversion. Lumai says its optical tensor engine performs matrix multiplication up to 2048 by 2048. Bigger matrices per conversion mean more sums per joule. That is where the order-of-magnitude saving actually comes from.
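
A rough way to see why matrix size is the lever: in a matrix-vector product of size N, the number of multiply-accumulates grows as N squared, while the number of values crossing the electrical-optical boundary grows roughly as N. The sketch below assumes one conversion per input element and one per output element, and ignores the cost of loading weights. Those are simplifying assumptions for illustration, not Lumai's published figures, and the function name is hypothetical.

```python
def macs_per_conversion(n: int) -> float:
    """Multiply-accumulates per domain conversion for an n x n matrix-vector product.

    Assumption (illustrative only): n inputs are converted electrical -> optical
    and n outputs are converted optical -> electrical, so 2n conversions in total.
    """
    macs = n * n          # one multiply-accumulate per matrix element
    conversions = 2 * n   # assumed: n values in, n values out
    return macs / conversions


for size in (64, 512, 2048):
    print(size, macs_per_conversion(size))
```

Under these assumptions the work done per conversion rises in direct proportion to the matrix dimension, which is the sense in which bigger matrices buy more sums per joule.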

The company that bet everything on that sentence

The company is Lumai. It is a spinout from the University of Oxford, headquartered in Oxford, and last month it launched a server called Iris Nova. By its own account, it is the only system in the world running billion-parameter large language models on optical compute, currently doing real-time inference on Meta’s Llama 8B and 70B.

I first saw Lumai a fortnight ago at the Royal Academy of Engineering Enterprise Hub Demo Day. Dr Xianxin Guo was on stage, beside a slide of the Lumai Iris Server. The slide carried two forward claims against the hardware the entire industry currently runs on. Ninety-five per cent less energy than an Nvidia GB200 by 2029. Ninety per cent lower cost than Nvidia by 2029. Aggressive numbers, the kind you note and then chase. I followed up.

I sat down with him this week to ask the questions readers of this newsletter actually care about. Power. Cost. Speed of adoption. And whether Britain keeps it.

His one-line answer to what the company does, in language any data centre operator would recognise: ninety per cent less energy, ten per cent of the cost, more AI tokens out of the same hall.

Dr Xianxin Guo Presenting at The Royal Academy

The ninety per cent, translated

“Ninety per cent reduction of the power needed for the same compute at the rack appliance level,” Guo says. Less power means less cooling, smaller power supplies, less of the secondary plant that quietly drinks a material share of a data centre’s electricity. The saving ripples. Networking and other elements outside Lumai’s box are not affected, so the figure at the grid connection comes in below ninety, but Guo is direct about what the figure actually buys you. “What this is likely to mean in practice is more compute capacity within the same power budget.”

That is the honest version of the answer. The energy saving does not collapse the energy bill. It expands the amount of intelligence the same energy bill can produce. He is open about why.

“Jevons’ paradox is real, and we will see more AI used, not less.”

This is the part the optimists tend to leave out. Efficiency does not bend the curve. It steepens it. The grid still has to grow. What changes is what we get for it.
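
The gap between the rack-level figure and the figure at the grid connection can be sketched with one line of arithmetic. The example below assumes facility power splits into a share that follows the replaced compute (including the cooling and power plant that scales with it) and a share that does not, such as networking. The share values are hypothetical placeholders, not measurements from Lumai or any operator.

```python
def grid_level_saving(affected_share: float, rack_reduction: float = 0.90) -> float:
    """Fraction of total facility power saved.

    affected_share: fraction of facility power drawn by the replaced compute and
        the plant that scales with it (hypothetical input, between 0 and 1).
    rack_reduction: saving at the rack appliance level (0.90, per the quote above).
    """
    if not 0.0 <= affected_share <= 1.0:
        raise ValueError("affected_share must be between 0 and 1")
    return affected_share * rack_reduction


# Hypothetical shares, chosen only to show the shape of the relationship.
for share in (0.6, 0.8, 1.0):
    print(f"affected share {share:.0%}: facility saving {grid_level_saving(share):.0%}")
```

The facility-level saving only reaches ninety per cent if everything in the building scales with the replaced compute, which is why the grid-connection figure lands lower.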

Cost per token, not cost per box

“Cost per token. That is the only metric that matters to an operator running this at scale,” Guo says when pressed on what the ninety per cent saving actually means in pounds. The figure is a blend of equipment capital cost and the capital cost of the data centre that hosts it. Power. Backup generators. Cooling. The whole sleeve of infrastructure that exists because silicon throws off heat.

The model that lands first is unsurprising. Operators with the highest token volumes and the heaviest cost exposure to inference feel the saving first. That is the hyperscaler. That is the neocloud. That is the customer.
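
As a minimal sketch of the metric Guo describes, cost per token can be written as amortised capital divided by tokens served. The function below assumes straight-line amortisation over a fixed lifetime and treats both capital figures as inputs. The function name, the parameters and every number in the example are placeholders for illustration, not Lumai or Nvidia data.

```python
def cost_per_token(equipment_capex: float,
                   facility_capex: float,
                   lifetime_years: float,
                   tokens_per_year: float) -> float:
    """Blended capital cost per token, assuming straight-line amortisation.

    equipment_capex: capital cost of the compute hardware.
    facility_capex: capital cost of the hosting data centre share
        (power, backup generators, cooling).
    """
    if lifetime_years <= 0 or tokens_per_year <= 0:
        raise ValueError("lifetime_years and tokens_per_year must be positive")
    annual_capital = (equipment_capex + facility_capex) / lifetime_years
    return annual_capital / tokens_per_year


# Placeholder inputs in arbitrary currency units.
baseline = cost_per_token(100.0, 100.0, lifetime_years=5, tokens_per_year=10.0)
print(baseline)
```

The point of the blend is that a cooler, lower-power box reduces the facility term as well as the equipment term.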

The capital question is sharper. Lumai has raised about fifteen million dollars. American rivals carry war chests forty times that. Guo’s answer is that the architecture itself bends the capital problem.

“We are not building custom silicon at the leading chip node, because most of the hard computation is performed in optics.”

Translation: they do not need a TSMC-class fabrication line because the heavy lifting is not happening on a transistor. The architecture is lens-based and three-dimensional rather than integrated photonics on a 2D chip. Weights are encoded on an electronic display and the calculation happens as light passes through it. The supply chain leans on mature, high-volume components, the kind of optics and displays that are already manufactured at industrial scale elsewhere, customised for the purpose but not invented from scratch. It is a deliberate choice to sidestep the part of the hardware industry that eats capital fastest, the development of novel materials and bleeding-edge fabrication. Lumai has just opened its next funding round.

How fast this moves

Iris Nova is in the hands of evaluators now. Guo is direct about the timeline from there. Eighteen to twenty-four months from first evaluation to production at scale at a hyperscaler. Then this:

“The true blocker isn’t the technology, it is validation. A hyperscaler will not deploy a novel architecture into a revenue-bearing workload without months of validation, integration testing, and operational confidence.”

That sentence should reassure rather than worry. The bottleneck is not whether the physics is real. It is whether the buyer can convince itself. The first real deployment will be a cluster in a national laboratory testing performance across multiple workloads and orchestration patterns. It is the necessary, unglamorous step before the order book opens.

The form factor is built to make that confidence easy to give. Iris Nova is an Ethernet-attached appliance that drops into a standard rack and runs alongside existing GPUs rather than replacing them. Its natural fit is inference prefill and other compute-bound workloads. Nobody is being asked to rip Nvidia out.

The CUDA question, the moat that has protected Nvidia for a decade, gets a short answer too. Same frameworks, same models, no rewrite. Lumai compiles to its hardware and the model runs.

A pull quote that has nothing to do with chips

The most quietly striking line in the conversation was about leadership. Guo was Head of Research before he became CEO last year. What did he have to unlearn?

“A decision made in a week with 70 per cent certainty beats a decision made in three months with 95 per cent.”

A British deep-tech founder who has read the room about why British deep-tech keeps losing.

The full interview is published on Substack today, here.

What happens to the grid if nothing changes

The Lumai story matters because the alternative has now been forecast in detail by people who do not write newsletters for a living. Without an architectural pivot, here is the road we are on.

The global picture

The International Energy Agency puts global data centre electricity consumption at roughly 415 terawatt hours in 2024 and projects it to more than double to around 945 terawatt hours by 2030, driven heavily by AI. That takes data centres from about one and a half per cent of global electricity demand to roughly three per cent inside six years. Goldman Sachs Research projects a 165 per cent total increase in data centre power demand between 2023 and 2030. McKinsey estimates the global investment needed to physically build that capacity at nearly seven trillion dollars before the decade is out.
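
To put the IEA endpoints on an annual footing, the sketch below converts the 415 and 945 terawatt hour figures quoted above into an implied compound growth rate. It assumes smooth year-on-year growth between 2024 and 2030, which is a simplification for illustration and not something the IEA projection itself claims.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must be positive")
    return (end / start) ** (1 / years) - 1


start_twh, end_twh = 415, 945   # IEA figures as quoted in the text
growth = implied_annual_growth(start_twh, end_twh, years=6)
print(f"multiple: {end_twh / start_twh:.2f}x, implied growth: {growth:.1%} a year")
```

On these endpoints the implied rate works out at just under fifteen per cent a year, sustained for six years.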

The unit of measurement has changed. Standard IT racks have drawn ten to fifteen kilowatts for years. AI-optimised racks now push past forty. A single next-generation AI campus currently under construction will demand around twenty times the power of a typical hyperscale site. The grid is not being asked for more of the same. It is being asked for something it has never had to deliver.

The British picture is sharper

In the UK, the boom is already crossing thresholds the system was not designed to cross. Recent industry estimates put UK data centres at around six per cent of electricity consumption, although the House of Commons Library’s 2025 briefing gives a lower figure of 2.5 per cent. Either way the trajectory is steep. The National Energy System Operator projects a fourfold increase in data centre electricity demand by 2030. Other UK forecasts go to fivefold. In London alone, 271 data centres are drawing 69 per cent more power than the capital’s 3.49 million homes put together.

The Department for Science, Innovation and Technology has had to revise its forecast for greenhouse gas emissions from AI data centres upward by a factor of 100. Not by a per cent. By a multiple. Some UK data centre developers are now being quoted grid connection wait times of up to ten years.

That is not a forecast. That is a queue.

The concentration is the problem

The numbers above are global averages. The actual strain is hyper-local. The United States is expected to move from three to four per cent of national power consumption today to between eight and twelve per cent by 2030. In Ireland, data centres already consume over twenty per cent of national electricity. In Singapore, it is nineteen.

This is what people mean when they talk about an energy wall. They mean specific countries and specific cities where the grid cannot, physically, supply what the next decade of AI demand is forecast to require.

What an aggressive pivot to photons actually changes

Now run the alternative. If the architectural shift this newsletter has been describing actually happens at scale by 2028, the forecasts above stop being inevitable.

There are two places the energy bill of an AI data centre is concentrated. The first is data movement. By early 2025, pushing data through copper interconnects was consuming close to thirty per cent of total AI cluster power. Co-packaged optics, which moves the light conversion onto the chip itself, is entering volume production between 2026 and 2028 and cuts switch and rack-level power by an estimated forty to fifty per cent. The second is the compute itself. Moving the matrix mathematics from electrons to photons delivers, on independent estimates, an order of magnitude improvement in energy efficiency. Some photonic processors are demonstrating thirty-fold savings in narrow benchmarks. Then comes the cooling collapse: if the chip throws off less heat, the secondary plant that exists to keep it cool draws less power of its own.

If you apply a blended fifty to ninety per cent reduction across interconnects and inference compute, two scenarios open up.

Scenario one. The reprieve. If global AI token demand grows at exactly the rate current economic models project, the IEA’s 945 terawatt hour figure never gets reached. Demand plateaus inside the existing envelope. The ten-year UK grid connection queue loses one of its biggest accelerants. The seven trillion dollars in additional power infrastructure is not needed at that scale.

Scenario two. The Jevons explosion. We still hit 945 terawatt hours by 2030, because efficiency releases demand, but we get fifty times the AI intelligence out of it. The total power bill looks the same. The cost per token collapses toward a level that rewrites the economics of inference. Every operator still running legacy silicon is suddenly competing on infrastructure economics they cannot win.

Guo’s own answer pointed firmly at scenario two. The history of every previous efficiency revolution in computing points the same way. The energy bill rarely shrinks. The work the energy buys multiplies.
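
The two scenarios are the same arithmetic read in opposite directions. The sketch below assumes a single blended energy reduction applied uniformly to the affected workload, which is a simplification. The function names are hypothetical and the inputs are the fifty and ninety per cent bounds quoted above.

```python
def energy_at_fixed_demand(baseline_energy: float, reduction: float) -> float:
    """Scenario one: same output, less energy."""
    return baseline_energy * (1.0 - reduction)


def output_multiplier(reduction: float) -> float:
    """Scenario two: same energy, more output per unit of energy."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


for reduction in (0.5, 0.9):
    print(
        f"reduction {reduction:.0%}: "
        f"energy at fixed demand {energy_at_fixed_demand(100.0, reduction):.0f} per 100, "
        f"output at fixed energy {output_multiplier(reduction):.0f}x"
    )
```

The multiplier rises sharply as the reduction approaches one hundred per cent, so small differences in the assumed saving produce large differences in the output the same energy bill can buy.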

Either scenario makes one specific category of investor wrong about a great deal of money.

The trillion-dollar trade built on a single assumption

Artificial Intelligence (AI) in Energy and Power Market Report 2026

Wall Street has spent two years pricing one outcome into the market. That hyperscalers, locked into a brute-force scaling race against each other, would buy raw electricity at any premium because they had no choice. The whole “AI energy trade” rests on that single load-bearing thesis. If photonics breaks the assumption, the financial consequences are not gentle.

The collapse of the AI premium in utilities

Independent power producers, nuclear fleet operators, regional utilities. All have been trading at historically elevated multiples because analysts have priced in a 150 to 200 per cent increase in data centre demand by 2030. The premium is not in current cash flow. It is in the expected future cash flow from desperate, decade-long power purchase agreements.

The repricing event is identifiable. The moment a major hyperscaler announces that its 2030 power forecast has been cut by, say, seventy per cent because optical coprocessors are doing more of the work, the repricing begins. Base-load demand will not fall to today’s levels. The growth premium will be forced into question very quickly. Slow-growth utility valuations return. Anyone holding the trade on margin discovers the difference.

The cooling sector takes the worst of it

The most exposed corner of the market is thermal management. Liquid cooling specialists, immersion cooling startups, the hyperscale HVAC contractors. The whole sector exists because silicon GPUs are, for accounting purposes, very expensive space heaters. If the heat goes away, so does the addressable market. Operators do not need next-generation cooling architecture for a chip that is barely warm. The pricing in this corner has not begun to discount that risk.

The copper downgrade

The copper trade is leaning on two stories, electric vehicles and AI grid expansion. The second of those is contingent. Optical computing replaces copper at the chip, the rack and the cluster. If grid expansion is paused because power demand plateaus, the projected supply deficits ease, and the mining and commodities exposure that has been priced for a structural deficit corrects.

Where the money goes

Capital does not vanish in repricings. It rotates. The destination, if this scenario plays out, is the optical supply chain. The mature manufacturers of lasers, modulators, waveguides, the photonic foundries, the firms quietly making the components that startups like Lumai assemble into systems. They are the picks and shovels of the next phase. Anyone trying to identify them in retrospect will pay a multiple to do so.

I am not in the business of giving investment advice. I am in the business of pointing out when a market is priced for a future that one British company is currently building the alternative to. Do the work yourself.

Why Lumai must stay a British company

This is the part of the story this newsletter cares about most, because we have watched it play out before. The UK invents the thing. American capital buys it. The headlines fade. The technology becomes the basis of a US-listed giant. Britain is thanked for the talent and reduced, politely, to a regional R&D office.

The pattern is now well documented enough to be predictable.

DeepMind, the most important AI research company in the world, came out of University College London. In 2014 Google bought it, pre-revenue, for around 400 million pounds. The acquisition was barely commented on at the time. It became the foundation of what is now Google DeepMind, the engine of one of the most valuable companies on the planet.

VocalIQ, spun out of Cambridge’s Spoken Dialogue Systems Group, was acquired by Apple in 2015 to become the conversational intelligence inside Siri. Magic Pony Technology, founded by London university graduates and seeded with a grant from Innovate UK, was bought by Twitter in 2016 for around 150 million dollars, two years after starting. Latent Logic, an Oxford spinout in autonomous decision-making, was acquired by Alphabet’s Waymo in 2019.

That is the soft pattern. The harder pattern is more recent and more damning.

In September 2025, Oxford Ionics, the trapped-ion quantum computing spinout, was acquired by the US-listed IonQ for 1.075 billion dollars. It was the highest-value Oxford quantum exit in history. The founders are expected to stay in Oxford. The IP, the headquarters, the future market capitalisation, and the corporate tax base now belong to a Maryland company. In the same announcement, IonQ also bought Vector Atomic, a quantum sensing firm with more than 200 million dollars in US defence contracts. Britain handed a sovereign-grade quantum capability to a defence-adjacent American competitor and clapped.

Oxford Quantum Circuits, the other Oxford quantum scaleup, raised the largest UK Series A in quantum, 38 million pounds, co-led by Lansdowne Partners and Japan’s largest deep-tech fund. British Patient Capital put in 6.7 million of it. Much of the rest came from abroad.

The fundamental issue is not invention. Britain invents prolifically. The fundamental issue is scaling capital. When the moment comes to write the cheque that takes a company from breakthrough to market dominance, the British cheque is either not there or not large enough. American capital is. The technology, the headquarters, the tax base and the strategic capability follow the cheque.

This is what economists call the public risk, foreign reward deficit. The British state and the university system absorb the high-risk research phase. Once the technology is de-risked, American capital buys the certainty. The returns are then booked on NASDAQ.

Guo knows all of this. His answer when asked if Lumai stays British was the one this newsletter has been waiting to hear.

“Where the company is built is the question that matters, and the answer is the UK.”

His specific demand of the British state was equally clear. Not subsidy. Procurement.

“A committed UK government anchor purchase of Lumai Iris server hardware into the AI Research Resource. Concrete orders from a sovereign customer do more for keeping the company here than any number of strategy papers.”

ARIA’s Scaling Inference Lab is a real testbed for novel AI hardware. The Sovereign AI programme has half a billion pounds behind British AI companies trying to scale. The Department for Science, Innovation and Technology has a hundred-million-pound package that explicitly positions government as a first customer for high-quality UK AI hardware. The instruments exist. The question is whether they are deployed in time, or whether a year from now we will be reading the press release announcing Lumai’s acquisition by an American hyperscaler.

This is not hostile to America. This is observable to anyone with a chronology and an attention span.

The Sunday Signal Tech and AI Layoff Tracker

Week 21. 17 to 23 May 2026.

The mask slips on “lower-value human capital”

For months we have tracked executives carefully constructing a PR shield around AI-driven job losses, relying on sanitised phrases like “operational leverage.” This week the shield was shattered by a 170-year-old British bank.

Standard Chartered announced a phased target to eliminate roughly 7,800 corporate and support roles, fifteen per cent of that division, by 2030 to fund an aggressive AI integration. The headcount was not the story. The language was. CEO Bill Winters told investors the bank was replacing “lower-value human capital” with the bank’s investment in technology and AI.

The backlash was instant. Winters issued a Wednesday memo attempting to walk back the phrasing, telling staff the words had been taken out of context and that role losses reflected the changing nature of the work, not the value of the people doing it. The memo is classic corporate laundering. The investor call was the unvarnished reality of the 2026 boardroom. “Lower-value human capital” is exactly how the C-suite now views non-technical administrative labour in the age of large language models.

The collapse of offshore arbitrage

The most critical takeaway is geographic. The planned cuts are disproportionately targeted at the bank’s major back-office hubs: Chennai, Bengaluru, Kuala Lumpur and Warsaw. Treat the specific list as indicative until the bank confirms it.

This signals the death of the defining corporate strategy of the last quarter century. Labour arbitrage. For decades, Western corporations saved billions by shipping administrative tasks to business process outsourcing centres in Asia and Eastern Europe. In 2026 the maths has flipped. Running an AI agent on a localised server is now cheaper than paying an offshore human analyst. AI does not require a desk. It does not sleep. It does not care about time zones.

Reality check this week: Intuit, Meta, Dune

While Standard Chartered laid out a multi-year roadmap, the immediate tech sector continued its contraction.

Intuit. The AI flush. On 20 May, the financial software giant announced the elimination of roughly 3,000 jobs, seventeen per cent of its workforce. This is a textbook AI-native restructure: clearing legacy overhead to rehire directly into AI engineering and product roles.

Meta. The execution phase. The mega-layoff Zuckerberg announced earlier this month is no longer theoretical. Rolling cuts began on 20 May, actively eliminating 8,000 roles, ten per cent of the workforce, to free up capital for the company’s $135 billion AI infrastructure buildout.

Dune Analytics. The Web3 pivot. This crypto analytics platform slashed twenty-five per cent of its staff this week. The stated rationale was a strategic pivot toward “AI and institutional onchain data.” When a Web3 startup and a colonial-era British bank cite the same technological driver to cut headcount, it stops being a coincidence. It becomes a sector-agnostic law of physics.

High-probability targets, Week 22

The global IT services giants. Standard Chartered just fired a warning shot directly over the bow of the major Indian IT hubs. The largest BPO and service providers, TCS, Infosys, Wipro, are now in acute danger. If their Western banking clients are successfully automating their own back-office compliance functions, the outsourcing contracts that sustain these IT giants will dry up. Expect defensive, margin-protecting layoffs from the BPO sector to accelerate immediately.

Mid-tier financial compliance. Any vendor built around providing human-in-the-loop KYC and AML verification is operating on borrowed time. The ecosystem servicing these manual workflows will be forced to downsize before Q3.

Methodology. The Standard Chartered figure is an announced multi-year target to 2030, recorded under broader corporate. The arithmetic of around 7,800 is derived from the bank’s stated fifteen per cent of roughly 52,000 staff. Broader corporate figures draw on LayoffHedge. Tech-specific figures draw on Layoffs.fyi and TrueUp.

Final Thought

🚀 We have spent three years arguing about how to feed the machine. More power stations. More grid. More money set on fire to keep the lights on in the server hall. The more useful question was always the quieter one. Not how much energy the machine needs, but what we chose to build it out of.

One answer to that question is sitting in a laboratory in Oxford. The CEO is a physicist who decided he would rather build the company than watch someone else buy it. He has told the British state, in public, what would keep him here. He has put a number on it, a date on it, and a programme on it. There is no possible defence for not hearing him.

His closing instruction to any policymaker reading this newsletter does not bother with diplomacy.

“Advanced procurement of UK-built AI hardware into the AI Research Resource and the Growth Infrastructure programme this financial year, or watch the next generation of AI compute be created outside of the UK again.”

This financial year. Not the next strategy review. Not the next white paper. This one.

The bank in the tracker called its own people lower-value capital. The British company at the top of this issue offered a way to make the machine that replaces them use a fraction of the energy. The cut and the fix are the same story. The only question left is who gets to own the fix.

Britain has watched DeepMind, Magic Pony, VocalIQ, Latent Logic and Oxford Ionics walk out of the door. Light is the next one. Buy the thing, or read the press release.

The Sunday Signal. Weekly, every Sunday. Subscribe at thesundaysignal.ai.

Until next Sunday, David Richards MBE.
