
Voice AI Pricing Explained: What You Actually Pay Per Minute

Rayvoc Team

If you’ve shopped for a voice AI platform, you’ve seen the headline rates: “$0.05 per minute.” It sounds almost too cheap to be true — and it is, because that number is usually the orchestration fee only. It covers the platform’s pipeline plumbing, not the speech recognition, the language model, the voice synthesis, or the phone call itself.

By the time a real call connects, you’re paying four or five vendors at once, and the all-in number lands somewhere between $0.13 and $0.31 per minute for typical production configurations. At meaningful volume, that gap between the advertised rate and the real rate is the difference between a viable unit economics story and a budget surprise.

This post tears down the full cost stack, walks through a worked example at 2,000 minutes per month, and gives you a checklist of questions to put in front of any vendor before you sign.

The voice AI cost stack: five line items, not one

Every production voice agent call passes through (at least) five billable layers. Here’s what each one does and what it actually costs in mid-2026:

| Layer | What it does | Typical cost per minute | Who bills you |
| --- | --- | --- | --- |
| Platform / orchestration | Pipeline management, turn-taking, endpointing, call control | $0.05–$0.10 | Voice AI platform |
| Speech-to-text (STT) | Transcribes caller audio in real time | $0.01–$0.02 | Deepgram, AssemblyAI, etc. |
| LLM | Generates the agent’s response | $0.003–$0.08 | OpenAI, Anthropic, xAI, etc. |
| Text-to-speech (TTS) | Synthesizes the agent’s voice | $0.02–$0.07 | ElevenLabs, Cartesia, etc. |
| Telephony | Carries the actual phone call (PSTN/SIP) | $0.01–$0.02 | Twilio, carrier, or platform |

A few things jump out from this table.

The LLM line is the wildcard

LLM cost per minute varies by more than 25x depending on model choice. A small, fast model handling a structured intake call might cost a third of a cent per minute. A frontier model with a long system prompt, tool calls, and retrieved context can hit $0.08/minute on its own. This is also why platforms that lock you into one model deserve scrutiny — your ability to swap in a cheaper model that’s good enough for the job is one of your biggest cost levers. (It’s also why Rayvoc supports any OpenAI-compatible LLM.)
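
To see where that spread comes from, here is a rough sketch of per-minute LLM spend as a function of token volume and token prices. Every number in it (turns per minute, token counts, prices per million tokens) is an illustrative assumption, not any vendor's published rate. It also assumes the full prompt is re-sent as input on every turn with no caching.

```python
def llm_cost_per_minute(turns_per_minute, input_tokens_per_turn,
                        output_tokens_per_turn, usd_per_million_input,
                        usd_per_million_output):
    """Estimate LLM spend per call minute from token volume and token prices."""
    input_cost = input_tokens_per_turn * usd_per_million_input / 1_000_000
    output_cost = output_tokens_per_turn * usd_per_million_output / 1_000_000
    return turns_per_minute * (input_cost + output_cost)


# Hypothetical small model with a short prompt (assumed prices).
small = llm_cost_per_minute(6, 1_500, 60, 0.30, 1.20)

# Hypothetical frontier model with a long prompt and retrieved context (assumed prices).
frontier = llm_cost_per_minute(6, 4_000, 80, 3.00, 15.00)

print(f"small model:    ${small:.4f}/min")
print(f"frontier model: ${frontier:.4f}/min")
print(f"spread:         {frontier / small:.1f}x")
```

Under these assumptions the two configurations land near the low and high ends of the LLM range in the table. Input tokens dominate in both cases, so trimming the system prompt is often as effective as switching models.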

Premium voices are a real line item

TTS is the layer people underestimate. Ultra-realistic voices from the premium providers run $0.05–$0.07 per minute of synthesized speech — sometimes more than the LLM and STT combined. If your agent talks a lot (think outbound qualification scripts), voice choice materially moves your blended rate.

Telephony looks small until it isn’t

A cent or two per minute sounds trivial, but it’s billed on total call duration, not just the time the AI is “thinking.” And if your platform resells Twilio with a markup — or charges a surcharge to connect your own carrier — that small line quietly grows.

Worked example: 2,000 minutes per month

Let’s price a realistic deployment: an inbound support agent handling 2,000 minutes per month (roughly 500 calls at 4 minutes each), using a mid-tier configuration.

| Line item | Rate/min | Monthly (2,000 min) |
| --- | --- | --- |
| Platform orchestration | $0.07 | $140 |
| STT (streaming) | $0.015 | $30 |
| LLM (mid-tier model) | $0.02 | $40 |
| TTS (quality voice) | $0.04 | $80 |
| Telephony (inbound DID) | $0.015 | $30 |
| All-in | $0.16/min | $320/mo |

That’s more than triple the advertised “$0.05/minute” — and this is a moderate configuration. Swap in a frontier LLM and a premium voice and you’re at $0.25–$0.31/minute, or $500–$620/month for the same traffic.
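
If you want to rerun this arithmetic with your own quotes, a minimal sketch looks like the following. The per-layer rates are the ones from the worked example. The premium variant assumes the upper-bound LLM and TTS rates from the cost stack table, and the dictionary keys and function names are just illustrative.

```python
# Per-minute rates from the worked example (USD).
RATES = {
    "platform": 0.07,
    "stt": 0.015,
    "llm": 0.02,
    "tts": 0.04,
    "telephony": 0.015,
}


def all_in_rate(rates):
    """Sum every billable layer into one per-minute figure."""
    return sum(rates.values())


def monthly_cost(rates, minutes):
    """All-in rate multiplied by monthly call minutes."""
    return all_in_rate(rates) * minutes


# Same stack with a frontier LLM and a premium voice (table upper bounds).
premium = {**RATES, "llm": 0.08, "tts": 0.07}

for name, rates in [("mid-tier", RATES), ("premium", premium)]:
    print(f"{name}: ${all_in_rate(rates):.2f}/min, "
          f"${monthly_cost(rates, 2000):.2f}/mo")
```

This prints $0.16/min and $320.00/mo for the mid-tier stack, and $0.25/min and $500.00/mo for the premium one.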

Two more wrinkles to model:

  • Concurrency. Most platforms cap simultaneous calls per plan tier and charge for additional channels. If your traffic is bursty (everyone calls at 9am), you pay for peak concurrency, not average.
  • Minimum platform fees. The industry has been moving away from pure usage pricing. Bland, for example, moved to $299–$499/month platform tiers in December 2025. A monthly platform fee on top of per-minute usage changes the math significantly at low volumes — at 500 minutes/month, a $299 platform fee alone is $0.60/minute before a single model is invoked.
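
Both wrinkles are easy to model before you talk to a vendor. The sketch below is one way to do it, under stated assumptions: call logs are available as (start, end) pairs in seconds, the sample calls are made up, and the $0.09 usage rate is simply the non-platform layers from the worked example above.

```python
def peak_concurrency(calls):
    """Return the maximum number of overlapping calls.

    `calls` is a list of (start_second, end_second) pairs.
    """
    events = []
    for start, end in calls:
        events.append((start, 1))   # a call begins
        events.append((end, -1))    # a call ends
    # Tuple sorting puts an end (-1) before a start (+1) at the same instant,
    # so back-to-back calls are not counted as overlapping.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


def effective_rate(minutes, usage_rate, platform_fee):
    """Per-minute cost once a flat monthly fee is spread over usage."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return usage_rate + platform_fee / minutes


# Made-up burst: three overlapping calls, then one later call.
calls = [(0, 240), (30, 270), (60, 300), (600, 840)]
print("peak concurrent calls:", peak_concurrency(calls))

for minutes in (500, 2000):
    fee_only = effective_rate(minutes, 0.0, 299)
    total = effective_rate(minutes, 0.09, 299)
    print(f"{minutes} min: fee alone ${fee_only:.3f}/min, with usage ${total:.3f}/min")
```

The sample burst peaks at three simultaneous calls even though the average over the window is barely above one, which is the gap a concurrency-capped plan charges you for.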

The hidden fees nobody puts on the pricing page

Beyond the five-layer stack, there’s a category of charges that only show up on your first invoice. Watch for these:

Transfer-time billing

When your agent does a warm transfer to a human, many platforms keep the meter running on the AI leg for the entire human conversation — even though the AI’s job ended at the handoff. On a 2-minute AI conversation followed by a 10-minute human call, you can be billed for 12 minutes of “AI time.”
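
A minimal sketch of the two billing policies, using the 2-minute and 10-minute figures above. The function name and the $0.16 rate (borrowed from the worked example) are assumptions for illustration; a real platform may bill only some layers during the transfer leg.

```python
def billed_ai_minutes(ai_minutes, human_minutes, meter_stops_at_handoff):
    """Minutes billed as AI time under each transfer policy."""
    if meter_stops_at_handoff:
        return ai_minutes
    return ai_minutes + human_minutes


RATE = 0.16  # assumed all-in rate per minute, from the worked example

for stops in (True, False):
    minutes = billed_ai_minutes(2, 10, meter_stops_at_handoff=stops)
    print(f"meter stops at handoff={stops}: "
          f"{minutes} billed min, ${minutes * RATE:.2f}")
```

With the meter running through the transfer, the same call is billed for six times as many AI minutes.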

SIP and BYOC surcharges

Want to bring your own carrier to use your negotiated rates? Several platforms charge an extra per-minute fee for the privilege of not using their telephony — effectively taxing you for the savings. (Rayvoc charges no BYOC surcharge; more on why in our BYOC guide.)

Per-second vs. per-minute rounding

A platform that bills in full-minute increments inflates a 61-second call into 2 billed minutes, which is roughly a 97% markup on that call (about 49% of the billed time goes unused). Across thousands of short calls, rounding policy alone can move your effective rate by 15–25%.
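
The rounding effect can be checked with a few lines of arithmetic. This is a sketch, and the list of call durations is invented for illustration. For a 61-second call it reports two figures: the markup over actual usage (about 97%) and the share of billed time that goes unused (about 49%).

```python
def billed_seconds(call_seconds, increment_seconds=60):
    """Round a call up to the next billing increment (integer ceiling)."""
    increments = -(-call_seconds // increment_seconds)
    return increments * increment_seconds


def rounding_inflation(durations, increment_seconds=60):
    """Fractional increase in billed time over actual time for a set of calls."""
    billed = sum(billed_seconds(d, increment_seconds) for d in durations)
    return billed / sum(durations) - 1


call = 61
billed = billed_seconds(call)
print(f"{call}s call billed as {billed}s")
print(f"markup over actual usage: {billed / call - 1:.1%}")
print(f"unused share of billed time: {(billed - call) / billed:.1%}")

# Invented mix of short calls, in seconds.
durations = [45, 61, 130, 200, 240]
print(f"per-minute rounding: {rounding_inflation(durations, 60):.1%} inflation")
print(f"per-second rounding: {rounding_inflation(durations, 1):.1%} inflation")
```

For this particular mix, per-minute rounding inflates billed time by roughly 24%, while per-second billing adds nothing.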

“Included” minutes that aren’t

Bundled minutes often exclude telephony or apply only to platform fees, with model costs passed through at marked-up rates. Read the definition of “minute” in the order form, not the pricing page.

Phone number and compliance fees

Per-DID monthly charges, regulatory recovery fees, CNAM lookups, and number porting costs all stack up — especially internationally, where DID numbers in some countries cost 10x US rates.

Questions to ask every voice AI vendor

Before you sign anything, get written answers to these:

  1. What is the all-in per-minute cost for my exact configuration — STT, LLM, TTS, and telephony included — not just the platform fee?
  2. How is a “minute” billed? Per second or in larger increments, and with what rounding? From call answer or from agent pickup?
  3. What happens to billing during a transfer? Does the AI meter stop when a human takes over?
  4. Can I bring my own carrier, and is there a surcharge? What about my own model API keys?
  5. What does concurrency cost? How many simultaneous calls are included, and what’s the overage?
  6. Are there platform minimums or monthly tiers on top of usage?
  7. What do phone numbers cost monthly, and in which countries?
  8. Will rates change with volume — and is there a published volume discount schedule, or is it “talk to sales”?
  9. What latency do I get at this price? Cheap configurations often mean slower models — see our latency guide for why that matters commercially.

If a vendor can’t answer #1 with a number, assume the real cost is 2–4x the headline rate.

Where Rayvoc fits

Rayvoc was built around a simple pricing position: one transparent all-in per-minute rate that includes the platform, telephony, and managed models — or a lower rate when you bring your own model keys and your own carrier (with no BYOC surcharge, ever). Because Rayvoc runs the telecom layer natively — DIDs in 100+ countries — there’s no third-party telephony markup hiding in the stack. You can see the full breakdown on our pricing page.

We’re pre-launch, and every account starts with a 14-day free trial: one concurrent channel, 100 minutes, and a real phone number. Join the waitlist to get early access.
