AI Hardware May 19, 2026 · 9 min read

Penn's 4 fJ Light Switch Could Finally Fix Photonic AI's Hardest Problem

A team at the University of Pennsylvania has demonstrated all-optical switching at just 4 femtojoules, targeting the nonlinear activation bottleneck that has long kept photonic chips out of serious AI workloads.

The most power-hungry moment in a photonic neural network isn't moving data. It's the split second when the system has to decide whether a signal matters. That decision, called a nonlinear activation, has stubbornly required converting light back to electricity and then back again, burning time and energy at every step. A new result from Penn may have found a way around it.

Published in Physical Review Letters on April 10, 2026, the work by Bo Zhen's group at Penn demonstrates strongly nonlinear nanocavity exciton-polaritons in a gate-tunable monolayer semiconductor. The device switches optically at roughly 4 femtojoules per operation, on picosecond timescales, without converting the optical signal to electronics and back. That combination is genuinely unusual.

It's not a chip. It's not a product announcement. What it is, according to the paper and Penn's own announcement, is a materials-level demonstration that the specific kind of nonlinearity AI needs can happen in a photonic device at energies low enough to matter for real computing. The distance between that result and a working AI accelerator is still significant. But the bottleneck it addresses has been visible for years, and no one had closed it at this efficiency before.

The Physics Behind the Breakthrough

Exciton-polaritons are hybrid light-matter particles that form when photons inside a cavity couple so strongly to excitons (bound electron-hole pairs) in a semiconductor that the two can't be described separately anymore. They behave partly like light (they move fast, they don't interact much with the lattice) and partly like matter (they can interact with each other). That second quality is the point.

The Penn team used an atomically thin monolayer of molybdenum diselenide (MoSe2) as the semiconductor, embedded in a photonic crystal nanocavity. The gate-tunable part means the researchers can dial the coupling strength electrically, giving them precise control over when the device is in strong-coupling mode and when it isn't. That control is what makes switching possible.

Technical note: The nonlinearity in this device comes from exciton dephasing at high polariton populations. As excitation increases, more excitons interact and lose phase coherence faster, which weakens the coupling between excitons and photons. Once that coupling drops below a threshold, the device exits the strong-coupling regime entirely. That transition out of strong coupling is what produces the switching behavior described in the arXiv preprint.
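The threshold behavior in the note above can be illustrated with the textbook picture of two coupled oscillators, one for the cavity photon and one for the exciton. This is a toy sketch, not the model used in the paper: the names `g`, `kappa`, `gamma0`, and `beta`, their values, and the assumption that dephasing broadens the exciton linewidth linearly with population are all illustrative.

```python
import cmath

def exciton_linewidth(gamma0, beta, population):
    """Exciton linewidth (meV), assuming broadening grows linearly with population."""
    return gamma0 + beta * population

def rabi_splitting(g, kappa, gamma):
    """Energy splitting (meV) of the two coupled modes at zero detuning.

    The eigenvalues of [[-1j*kappa/2, g], [g, -1j*gamma/2]] are
    -1j*(kappa + gamma)/4 +/- sqrt(g**2 - ((kappa - gamma)/4)**2),
    so the splitting is real only while g > |kappa - gamma| / 4.
    """
    root = cmath.sqrt(g**2 - ((kappa - gamma) / 4.0) ** 2)
    return 2.0 * root.real

# Hypothetical values in meV, chosen only to make the threshold visible.
G, KAPPA, GAMMA0, BETA = 10.0, 5.0, 5.0, 1.0

for population in (0, 10, 20, 30, 40, 50):
    gamma = exciton_linewidth(GAMMA0, BETA, population)
    split = rabi_splitting(G, KAPPA, gamma)
    regime = "strong coupling" if split > 0 else "weak coupling"
    print(f"population={population:>2}  gamma={gamma:5.1f}  splitting={split:6.2f}  {regime}")
```

With these assumed numbers the splitting shrinks as the population rises and vanishes once the exciton linewidth reaches kappa + 4g = 45 meV. That collapse is the kind of abrupt, intensity-dependent response a switch needs, though the real device physics is richer than this two-mode picture.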

The key word in the paper's title is "strongly." Previous demonstrations of polariton-based switching existed, but they either required cryogenic temperatures, operated at much higher energies, or lacked the gate-tunability needed to integrate into a real device stack. The Penn result works at accessible temperatures in a structure designed with practical integration in mind.

"The platform works by coupling photons with electrons in an atomically thin semiconductor so light can interact strongly enough to perform signal switching."
Bo Zhen, Jin K. Lee Presidential Associate Professor, University of Pennsylvania — Penn Today

The photonic crystal nanocavity matters too. Unlike earlier bulk optical approaches, the nanocavity concentrates the electromagnetic field into a tiny mode volume, amplifying the light-matter interaction enough to form polaritons at excitation levels far below what previous platforms needed. Smaller mode volume, lower switching energy. That's the chain of logic that gets you to 4 fJ.

Why 4 Femtojoules Matters

Four femtojoules is 4 × 10^-15 joules (0.000000000000004 J). To calibrate: a typical electronic transistor switch in a modern logic circuit consumes somewhere between 1 and 100 femtojoules per operation depending on process node, voltage, and load. A state-of-the-art CMOS switch in a 3nm process operates around 1 to 5 fJ. Penn's optical switch operates in that same range.
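For a feel of the scale, the arithmetic can be worked through in a few lines. This is a back-of-envelope sketch: the photon energy of 1.6 eV is an assumed value near the MoSe2 exciton resonance, and the switching rate of one billion operations per second is purely hypothetical. Neither number comes from the paper.

```python
EV_TO_J = 1.602176634e-19  # joules per electronvolt (exact SI definition)

def femtojoules_to_joules(energy_fj):
    """Convert femtojoules to joules."""
    return energy_fj * 1e-15

def photons_per_switch(energy_fj, photon_energy_ev):
    """Number of photons that carry the given energy."""
    return femtojoules_to_joules(energy_fj) / (photon_energy_ev * EV_TO_J)

def average_power_watts(energy_fj, ops_per_second):
    """Average power of one switch firing at a constant rate."""
    return femtojoules_to_joules(energy_fj) * ops_per_second

SWITCH_ENERGY_FJ = 4.0    # figure reported in the article
PHOTON_ENERGY_EV = 1.6    # assumption: near the MoSe2 exciton energy
OPS_PER_SECOND = 1e9      # assumption: hypothetical 1 GHz switching rate

n_photons = photons_per_switch(SWITCH_ENERGY_FJ, PHOTON_ENERGY_EV)
power_w = average_power_watts(SWITCH_ENERGY_FJ, OPS_PER_SECOND)
print(f"photons per switch: about {n_photons:,.0f}")
print(f"average power at 1e9 ops/s: {power_w * 1e6:.1f} microwatts")
```

Under those assumptions a single switching event involves on the order of ten thousand photons, and one switch running continuously at a gigahertz would draw a few microwatts for the switching itself.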

Switching Mechanism	Typical Energy per Op	Operating Speed	Temperature
CMOS transistor (3nm)	1-5 fJ	Sub-nanosecond	Room temp
Mach-Zehnder modulator (Si photonics)	100-1000 fJ	Sub-nanosecond	Room temp
Earlier polariton switches	10-100 fJ	Picoseconds	Cryogenic
Penn MoSe2 nanocavity (2026)	~4 fJ	Picoseconds	Accessible temp

The comparison to electronic switches matters because photonic AI has always had a problem with the conversion steps. You can transmit data in light at very low energy. But the moment you need a nonlinear operation, most architectures have converted back to electronics, done the computation, and then re-encoded into light. Each conversion burns energy and adds latency. If a photonic nonlinear element can operate at energies competitive with the electronics it replaces, the conversion penalty becomes avoidable.

Important caveat: The 4 fJ figure describes a single switching event in a lab demonstration, not a full inference workload or training run. Energy per operation in a deployed system depends on many additional factors: coupling losses, control overhead, memory access, and system architecture. The number sets a floor on per-switch energy for this device, a lower bound on what a system built from it would consume, not a projection of what a chip would consume.

Still, lower bounds are what determine whether a path is worth pursuing. Prior to results like this one, the lower bound for optical nonlinear switching was high enough that it was hard to argue photonics could match electronics on activation energy. That argument is harder to sustain now.

The AI Hardware Connection

Neural networks are, at their mathematical core, chains of matrix multiplications and nonlinear functions. Photonic chips have been excellent at the first part for years. Light traveling through interference patterns and beam splitters can implement matrix-vector products almost passively, with very low energy consumption per multiply-accumulate operation. That's why companies like Lightmatter and Luminous Computing have attracted serious investment: the linear math case for optical computing is real.
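The way interference implements a matrix-vector product can be sketched with the smallest building block, a 2x2 Mach-Zehnder interferometer made of two ideal 50:50 beam splitters and one phase shifter. This is standard textbook optics under a lossless assumption, not the architecture of any company named here; the function names are hypothetical.

```python
import cmath
import math

def beam_splitter():
    """Transfer matrix of an ideal, lossless 50:50 beam splitter."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]

def phase_shifter(theta):
    """Phase shift of theta radians applied to the upper arm only."""
    return [[cmath.exp(1j * theta), 0], [0, 1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mzi(theta):
    """Mach-Zehnder interferometer: splitter, phase shifter, splitter."""
    return matmul(beam_splitter(), matmul(phase_shifter(theta), beam_splitter()))

def apply(matrix, amplitudes):
    """Matrix-vector product: output field amplitudes for the given inputs."""
    return [sum(matrix[i][k] * amplitudes[k] for k in range(2)) for i in range(2)]

def powers(amplitudes):
    """Optical power in each port (squared magnitude of the amplitude)."""
    return [abs(a) ** 2 for a in amplitudes]

light_in = [1.0, 0.0]  # all light enters the upper port
for theta in (0.0, math.pi / 2, math.pi):
    out = powers(apply(mzi(theta), light_in))
    print(f"theta={theta:.2f} rad -> port powers {out[0]:.3f}, {out[1]:.3f}")
```

Changing the phase steers light between the two outputs while total power is conserved, which is why the multiplication itself costs so little energy. Every operation here is linear in the field amplitudes, so no setting of the phase can produce a threshold or activation function.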

The nonlinear case has been the stumbling block. Every activation function, every threshold, every sigmoid or ReLU in a neural network requires a nonlinear element. In a hybrid optoelectronic system, that means a photodetector, an amplifier, a modulator, and a laser driver, all chained together. The overhead compounds with depth. A 50-layer network goes through 50 rounds of conversion. At scale, that's the dominant cost.
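How the overhead compounds with depth can be made concrete with a rough energy tally. This is a sketch under loud assumptions: the 50 layers come from the example above, but the 1,000 activations per layer and the 100 fJ per optoelectronic round trip are hypothetical placeholders (the latter borrowed from the low end of the silicon-photonics modulator range quoted in this article), and coupling losses and control overhead are ignored.

```python
def activation_energy_fj(layers, activations_per_layer, energy_per_activation_fj):
    """Total energy spent on nonlinear activations in one forward pass, in fJ."""
    return layers * activations_per_layer * energy_per_activation_fj

LAYERS = 50                   # depth used in the article's example
ACTIVATIONS_PER_LAYER = 1000  # assumption: hypothetical layer width
OEO_ENERGY_FJ = 100.0         # assumption: placeholder cost of one conversion round trip
ALL_OPTICAL_ENERGY_FJ = 4.0   # per-switch figure reported for the Penn device

hybrid_fj = activation_energy_fj(LAYERS, ACTIVATIONS_PER_LAYER, OEO_ENERGY_FJ)
optical_fj = activation_energy_fj(LAYERS, ACTIVATIONS_PER_LAYER, ALL_OPTICAL_ENERGY_FJ)

print(f"hybrid activations:      {hybrid_fj / 1e6:.2f} nJ per forward pass")
print(f"all-optical activations: {optical_fj / 1e6:.2f} nJ per forward pass")
print(f"ratio: {hybrid_fj / optical_fj:.0f}x")
```

The ratio depends entirely on the assumed conversion cost, so it should be read as an illustration of how the cost scales with depth and width rather than as a prediction for any real chip.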

Linear Ops (Solved)

Photonic matrix multiplications via interferometers are already highly efficient. This is the established strength of optical computing architectures.

Nonlinear Ops (The Gap)

Activation functions require nonlinearity. Until now, achieving this optically at competitive energies has been the key unsolved problem for all-optical neural nets.

Penn's Contribution

A gate-tunable polariton switch that operates at 4 fJ provides a credible materials-level route to all-optical nonlinear activation at competitive energy.

Penn's 2025 Context

Penn had already demonstrated a programmable photonic chip for nonlinear neural networks in 2025, making this a deeper materials advance, not a first concept.

Penn's announcement frames the result specifically around this gap. According to the EurekAlert press release, one application target is processing camera data directly on a photonic chip, without round-tripping through digital electronics. That's a concrete use case where the latency and energy of optoelectronic conversion are genuinely a systems problem, not just a theoretical concern.

"Many photonic AI chips still need electronic conversion for nonlinear activation steps. This result is meant to reduce that bottleneck."
Research summary, University of Pennsylvania — Physical Review Letters, April 2026

From Lab Bench to Published Science

The research didn't appear overnight. The preprint landed on arXiv in November 2024, meaning the underlying measurements were complete well before the journal publication. Roughly 17 months passed between the preprint and publication in Physical Review Letters in April 2026, a span that covers peer review and any revisions.

November 24, 2024: Preprint posted to arXiv. Title confirms the MoSe2 monolayer architecture and the all-optical switching claim at the femtojoule scale.
April 10, 2026: Paper published in Physical Review Letters (DOI: 10.1103/gc15-qsvf). PubMed indexing confirms peer review completion.
April 22, 2026: Penn Today publishes "Making 'light' work of computing," contextualizing the result for a general technical audience.
May 14, 2026: EurekAlert press distribution amplifies the AI angle and the camera-chip application case.
May 18, 2026: ScienceDaily republication extends the reach to science-adjacent audiences.

The publication in Physical Review Letters matters for credibility. PRL publishes roughly 3,000 letters per year across all of physics, with an acceptance rate under 25%. The review process for a result claiming device-level nonlinear switching at femtojoule energies would have required detailed scrutiny of the measurement methodology, the energy calibration, and the claims about the physical mechanism. The fact that it cleared that bar doesn't make it a finished technology, but it does mean the core numbers survived independent expert review.

What Still Stands in the Way

The skeptical reading of this result is both fair and important. A single device switching at 4 fJ in a well-controlled lab environment is not the same as a manufacturable nonlinear element in a deployed photonic AI chip. The gap between those two things involves several distinct engineering challenges, none of which the Penn paper claims to solve.

2D Material Yield and Uniformity

MoSe2 monolayers are grown or exfoliated, not printed from a mask like silicon. At production scale, getting consistent coupling quality across thousands or millions of nanocavities on a single die is an unsolved manufacturing problem. Defects in atomically thin materials produce wildly variable device performance.

Silicon Photonics Integration

The dominant photonic chip platform globally is silicon photonics, built on CMOS-compatible fabs. MoSe2 on a photonic crystal nanocavity is not natively compatible with that stack. Hybrid integration is possible in principle, but it adds process steps, reduces yield, and complicates packaging.

Control Electronics Overhead

The gate-tunable architecture requires electrical control signals to set the coupling condition. That control circuitry consumes power and adds complexity. The 4 fJ switching energy figure doesn't include the overhead of generating and routing those gate signals at scale.

Operating Conditions

The paper doesn't specify room-temperature operation explicitly in the abstract. Earlier polariton results required cryogenic cooling, which would make commercial deployment impractical. Penn's framing suggests accessibility, but independent replication under varied conditions will be needed before that's confirmed.

Perspective: A 2022 analysis in Physical Review Applied on photonic neural network energy efficiency found that switching energy figures from individual device demonstrations often don't account for system-level overhead, interconnect losses, and control costs. The field's history includes several "breakthrough" devices that looked impressive in isolation but didn't scale to useful system architectures.

Where This Fits in Photonic Computing

Photonic computing has had an interesting trajectory. The basic idea has been around since the 1980s, went through a hype cycle in the late 2010s alongside the first wave of AI hardware investment, and has since stratified into a few distinct camps: analog optical matrix processors (Lightmatter, Luminous), optical interconnects for data centers (Ayar Labs, Celestial AI), and fundamental physics research into all-optical computation (largely academic).

Penn's work sits firmly in the third category, with implications for the first. The 2025 Penn photonic chip for nonlinear neural networks was a system-level demonstration that this research group is thinking about practical architectures, not just device physics. The 2026 PRL paper goes one level deeper, to the materials and mechanism question: what physical system can provide the nonlinearity at the energy scale that would actually help?

That's a useful sequence. System-level work identifies the problem precisely; materials-level work attacks the root cause. The risk in photonic computing historically has been inverting that order, demonstrating clever materials without knowing where they fit in a real architecture. Penn appears to be working in the right order.

The broader field is watching. The nonlinear element problem isn't unique to Penn's approach. Other groups are pursuing phase-change materials, carrier-injection-based silicon modulators, and nano-optomechanical effects. Each has different tradeoffs on speed, energy, and manufacturability. Penn's polariton approach is now among the most energy-efficient demonstrations on record, which changes the competitive landscape for that specific figure of merit.

Frequently Asked Questions

What are exciton-polaritons and why do they matter for computing?

Exciton-polaritons are hybrid quantum particles that form when photons couple strongly to electron-hole pairs (excitons) in a semiconductor. They inherit properties from both light and matter, including the ability to interact with each other. That interaction is what enables nonlinear optical behavior, the key function needed for neural network activation layers.

How does 4 femtojoules compare to existing AI chip energy use?

Modern CMOS transistors in leading-edge processes switch at roughly 1 to 5 fJ per operation. Penn's 4 fJ optical switch is competitive with that figure, which is significant because earlier optical nonlinear devices typically consumed roughly 10 to 1,000 fJ per switch. The comparison only holds for the switching event itself, not full system power.

Does this mean photonic AI chips can replace GPUs?

Not yet, and not directly. This result addresses one specific bottleneck in photonic computing, the nonlinear activation function, at the device physics level. A complete AI accelerator requires memory, control logic, interconnects, and a manufacturable process. The Penn work is a necessary but not sufficient step toward that larger goal.

What is a photonic crystal nanocavity?

A photonic crystal nanocavity is a precisely engineered structure with a periodic pattern of holes or features that traps and concentrates light in a tiny volume. By reducing the mode volume, it strengthens the interaction between light and any material inside, making effects like strong exciton-photon coupling achievable at much lower optical power levels.

Who is Bo Zhen and what group did this work?

Bo Zhen is the Jin K. Lee Presidential Associate Professor at the University of Pennsylvania. His group works at the intersection of photonics and quantum materials. The 2026 Physical Review Letters paper is part of a broader research program that also produced a programmable photonic chip demonstration in 2025, establishing the group as a leader in applied photonic computing research.

What semiconductor material is used in the Penn device?

The device uses a monolayer of molybdenum diselenide (MoSe2), a transition metal dichalcogenide. As an atomically thin layer, MoSe2 has strong light-matter interaction properties that bulk semiconductors lack. The monolayer is placed inside a photonic crystal nanocavity, and an electrostatic gate allows researchers to tune the coupling strength.

What are the main barriers before this technology reaches commercial AI hardware?

The main barriers are: consistent manufacturing of atomically thin MoSe2 at scale, integration with standard silicon photonics fabrication lines, control electronics overhead not captured in the switching energy figure, and verification of room-temperature operating conditions. Each is a meaningful engineering challenge requiring dedicated research programs.

How long has photonic computing been researched as an AI hardware approach?

Optical computing concepts date to the 1980s. The modern wave of interest in photonic AI hardware accelerated around 2017 to 2019, coinciding with the first deep learning hardware boom. Since then, companies like Lightmatter and academic groups at MIT, Stanford, and Penn have focused on specific solvable subproblems, with nonlinear activation being a persistent open question until recently.

What Comes Next

The Penn result doesn't collapse the distance between a lab device and a commercial AI chip. It does something more modest and more durable: it removes one item from the list of reasons that distance seemed impassable. The nonlinear activation problem at the physics level now has a credible, peer-reviewed answer at femtojoule energies. That's not the same as a product, but it's a prerequisite for one.

The photonic computing field's history is littered with demonstrations that went nowhere because they solved isolated device problems without addressing the system architecture around them. Penn's sequential research program, from nonlinear neural network chips to the deeper materials question, suggests a group that understands that failure mode. Whether the MoSe2 nanocavity platform survives contact with manufacturing reality is the next test. That test will happen in fabs, not in physics journals.

For AI hardware, the more immediate implication isn't about training frontier models on light. It's about inference at the edge. Camera chips, sensor arrays, robotics, and medical imaging all involve scenarios where latency and power consumption matter more than raw throughput, and where the cost of optoelectronic conversion is a real design constraint. If Penn's platform can be integrated into those architectures even partially, the payoff starts before any general-purpose optical GPU exists.

Watch For

01 Independent replication of the 4 fJ figure at room temperature, confirming operating conditions that matter for commercial viability. Expect preprints from competing groups within 12 to 18 months.

02 Penn or a partner fab demonstrating multi-device arrays with consistent coupling performance, the minimum threshold for any useful photonic circuit beyond single-device demonstrations.

03 Photonic AI chip companies (Lightmatter, Celestial AI, others) publicly addressing the nonlinear element strategy in their roadmaps. This result changes the viable options for how they architect activation layers.

04 DARPA or DOE program announcements targeting 2D-material photonic integration. Results at this energy level typically attract defense and national lab funding within 1 to 2 years of peer-reviewed publication.
