Duplex research

Benchmarks for realtime voice agents.

A public lab notebook for the Duplex thesis: benchmark the best realtime voice stacks, then route each customer to the right provider for the use case.

Featured benchmark thesis

Why Duplex benchmarks realtime voice stacks instead of betting on one provider

The market is moving too fast to hard-code one model. Duplex should compare OpenAI Realtime, Gemini Live, PersonaPlex, Pipecat pipelines, and emerging speech-native models against the actual use case: routed voice agents for communities and agent teams.

Realtime voice · OpenAI Realtime · Gemini Live · PersonaPlex · Pipecat

Read the benchmark argument

Benchmark the stack

Compare realtime voice candidates against the same receptionist routing script.

Publish the market map

Explain where Discord, e-commerce, DevOps, creators, and agent teams actually need voice.

Route to deployment

Turn research into provider recommendations and use-case-specific deployment paths.

Product · 2026-05-12 · 5 min read

Receptionist routing is the killer demo for voice agents

A voice agent should not just answer. It should route. The Duplex demo starts with a receptionist, transfers to specialists, returns home, and carries context across every handoff.

Read post
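The receptionist flow described above (start at a receptionist, transfer to a specialist, return home, keep context across handoffs) can be sketched in a few lines of Python. This is a minimal illustration, not the Duplex implementation: the `CallContext` class, the `route` function, the keyword table, and the "done"/"back" return cue are all hypothetical names and rules chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CallContext:
    """Shared state that travels with the caller across every handoff."""
    caller: str
    notes: list[str] = field(default_factory=list)
    path: list[str] = field(default_factory=lambda: ["receptionist"])

    @property
    def current(self) -> str:
        # The agent currently holding the call is the last hop on the path.
        return self.path[-1]


# Hypothetical keyword -> specialist table; a real router would likely
# use the voice model's own intent detection instead of substrings.
SPECIALISTS = {
    "billing": "billing",
    "refund": "billing",
    "deploy": "devops",
    "outage": "devops",
}


def route(ctx: CallContext, utterance: str) -> str:
    """Record the utterance, then decide which agent holds the call next."""
    ctx.notes.append(utterance)  # context is carried, never reset
    text = utterance.lower()

    if ctx.current != "receptionist":
        # A specialist keeps the call until the caller signals they are done.
        if "done" in text or "back" in text:
            ctx.path.append("receptionist")
        return ctx.current

    # At the receptionist: transfer on the first matching keyword.
    for keyword, agent in SPECIALISTS.items():
        if keyword in text:
            ctx.path.append(agent)
            break
    return ctx.current
```

Because every agent reads and writes the same `CallContext`, the specialist sees what the caller told the receptionist, and the receptionist sees the full history when the call returns home. The `path` list doubles as a trace that a benchmark run could compare against the expected handoff sequence.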

Make the Playground a living benchmark.

Each benchmark post should link back to the Playground, publish the test script, score the provider, and end with a clear deployment recommendation: managed OpenAI path, multimodal Gemini path, self-hosted PersonaPlex path, Pipecat adapter path, or another SOTA candidate.
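One way the "score the provider, then recommend a deployment path" step could look in code is sketched below. Everything here is an assumption made for illustration: the metric names (`routing_accuracy`, `context_retention`, `median_latency_ms`), the weights, and the latency budget are hypothetical and are not published Duplex scoring rules. Only the provider names and deployment paths come from the text above.

```python
from dataclasses import dataclass

# Deployment paths named in the text, keyed by provider.
DEPLOYMENT_PATHS = {
    "OpenAI Realtime": "managed OpenAI path",
    "Gemini Live": "multimodal Gemini path",
    "PersonaPlex": "self-hosted PersonaPlex path",
    "Pipecat": "Pipecat adapter path",
}


@dataclass
class RunResult:
    """Outcome of one provider on the shared test script (hypothetical metrics)."""
    provider: str
    routing_accuracy: float    # fraction of handoffs routed correctly, 0..1
    context_retention: float   # fraction of carried facts recalled, 0..1
    median_latency_ms: float   # median response latency in milliseconds


def score(result: RunResult, latency_budget_ms: float = 1000.0) -> float:
    """Combine the metrics into one number in 0..1 (weights are assumptions)."""
    # Latency contributes less the closer it gets to the budget, floor at 0.
    latency_score = max(0.0, 1.0 - result.median_latency_ms / latency_budget_ms)
    return (
        0.4 * result.routing_accuracy
        + 0.4 * result.context_retention
        + 0.2 * latency_score
    )


def recommend(results: list[RunResult]) -> str:
    """Return the deployment path for the highest-scoring provider."""
    best = max(results, key=score)
    # Providers outside the known table fall back to the open-ended category.
    return DEPLOYMENT_PATHS.get(best.provider, "another SOTA candidate")
```

A benchmark post would fill `RunResult` with measurements from its own published test script and report both the per-provider scores and the output of `recommend`. Keeping the weights as visible constants makes it easy for readers to see, and dispute, how a recommendation was reached.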