Head-to-head comparison based on real production benchmarks with Gulf Arabic callers.
High-quality Arabic STT from Google Cloud, but with significant latency.
ElevenLabs' realtime STT offering: poor quality and slow for Arabic.
Google Cloud STT — Chirp 3: High-quality transcription. Broad Arabic dialect support through the ar-XA language code.
ElevenLabs Scribe v2: Described as 'shit quality' in production testing. Not viable for Arabic.
| Feature | Google Cloud STT — Chirp 3 | ElevenLabs Scribe v2 |
|---|---|---|
| Real-time streaming transcription | ✓ | ✓ |
| 120+ language support | ✓ | ✗ |
| Automatic punctuation | ✓ | ✗ |
| Word-level timestamps | ✓ | ✗ |
| Speaker diarization | ✓ | ✗ |
| Custom vocabulary | ✓ | ✗ |
| Medical and telephony models | ✓ | ✗ |
| Multiple language support | ✗ | ✓ |
| LiveKit inference integration | ✗ | ✓ |

| Capability | Google Cloud STT — Chirp 3 | ElevenLabs Scribe v2 |
|---|---|---|
| Streaming support | ✓ | ✓ |
| LiveKit plugin | ✓ | ✓ |
| Self-hostable | ✗ | ✗ |
| API style | gRPC streaming + REST | WebSocket streaming |
| SDKs | Python, Node.js, Go, Java, C#, Ruby, PHP | Python, Node.js |
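
The two tables above can be turned into a small lookup for checking which provider covers a given set of requirements. This is only a sketch: the rows are transcribed as they appear (check mark as True, cross as False), and the keys `google_chirp_3` and `elevenlabs_scribe_v2` as well as the helper `providers_with` are hypothetical names chosen for illustration, not identifiers from either vendor.

```python
# Rows transcribed from the two tables above: True for a check mark, False for a cross.
# The provider keys are hypothetical labels for this sketch, not official product IDs.
SUPPORT = {
    "google_chirp_3": {
        "Real-time streaming transcription": True,
        "120+ language support": True,
        "Automatic punctuation": True,
        "Word-level timestamps": True,
        "Speaker diarization": True,
        "Custom vocabulary": True,
        "Medical and telephony models": True,
        "Multiple language support": False,
        "LiveKit inference integration": False,
        "Streaming support": True,
        "LiveKit plugin": True,
        "Self-hostable": False,
    },
    "elevenlabs_scribe_v2": {
        "Real-time streaming transcription": True,
        "120+ language support": False,
        "Automatic punctuation": False,
        "Word-level timestamps": False,
        "Speaker diarization": False,
        "Custom vocabulary": False,
        "Medical and telephony models": False,
        "Multiple language support": True,
        "LiveKit inference integration": True,
        "Streaming support": True,
        "LiveKit plugin": True,
        "Self-hostable": False,
    },
}


def providers_with(required):
    """Return the providers that support every feature named in `required`."""
    return sorted(
        name
        for name, rows in SUPPORT.items()
        if all(rows.get(feature, False) for feature in required)
    )


if __name__ == "__main__":
    print(providers_with(["Streaming support", "LiveKit plugin"]))
    print(providers_with(["Word-level timestamps", "Speaker diarization"]))
    print(providers_with(["Self-hostable"]))
```

A feature name that appears in neither table is treated as unsupported, which is the conservative choice when filtering providers.
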
Google Cloud STT — Chirp 3: Excellent quality but too slow for real-time voice agents. Best suited for batch transcription or applications where latency isn't critical.
ElevenLabs Scribe v2: Poor quality and poor latency for Arabic. Not recommended for any Arabic STT use case.
ElevenLabs Scribe v2 is the faster of the two, with an end-of-utterance delay of 2000–2500 ms, on average 376 ms shorter than Google Cloud STT — Chirp 3. Both delays are still high for a real-time voice agent.
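
To make the latency figures concrete, the sketch below works through the arithmetic. It rests on two assumptions that the comparison does not state: that "end-of-utterance delay" means the time between the caller finishing speaking and the final transcript arriving, and that the midpoint of the 2000–2500 ms range can stand in for the ElevenLabs average. The function names are illustrative only.

```python
from statistics import mean

# Figures quoted in the comparison above.
ELEVENLABS_DELAY_RANGE_MS = (2000, 2500)
REPORTED_GAP_MS = 376


def end_of_utterance_delay_ms(speech_end_s, final_transcript_s):
    """Delay between the end of speech and the final transcript, in milliseconds.

    Assumption: both arguments are timestamps in seconds on the same clock.
    """
    return (final_transcript_s - speech_end_s) * 1000.0


def implied_slower_average_ms(faster_range_ms, gap_ms):
    """Average delay of the slower provider implied by the reported gap.

    Assumption: the midpoint of the faster provider's range is its average.
    """
    return mean(faster_range_ms) + gap_ms


if __name__ == "__main__":
    midpoint = mean(ELEVENLABS_DELAY_RANGE_MS)
    implied = implied_slower_average_ms(ELEVENLABS_DELAY_RANGE_MS, REPORTED_GAP_MS)
    print(f"ElevenLabs midpoint: {midpoint} ms")
    print(f"Implied Google average: {implied} ms")
```

Under those assumptions the midpoint is 2250 ms and the implied Google Cloud average is 2250 + 376 = 2626 ms. The second value is a derived estimate, not a measured benchmark.
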
Google Cloud STT — Chirp 3 has a quality rating of 5/5 (Excellent). High-quality transcription. Broad Arabic dialect support through the ar-XA language code.
Neither provider is a clear fit for real-time Arabic voice agents. Google Cloud STT — Chirp 3: Excellent quality but too slow for real-time voice agents. Best suited for batch transcription or applications where latency isn't critical. ElevenLabs Scribe v2: Poor quality and poor latency for Arabic. Not recommended for any Arabic STT use case.
Google Cloud STT — Chirp 3 starts at $0.016 per 15 seconds of audio (Chirp 3 model). ElevenLabs Scribe v2 starts at $5 per month, which includes STT credits.
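
Because one price is per unit of audio and the other is a monthly plan, a quick calculation helps put them on the same scale. The sketch below takes the Google figure quoted above at face value and assumes that usage is billed in whole 15-second blocks, rounded up; that rounding rule is an assumption for illustration, not something the comparison states. The ElevenLabs credit allowance is not given, so the sketch only shows how much Google audio the same $5 would cover.

```python
import math

# Price quoted in the comparison above, taken at face value.
GOOGLE_USD_PER_BLOCK = 0.016
BLOCK_SECONDS = 15
ELEVENLABS_PLAN_USD = 5.0


def google_cost_usd(audio_seconds):
    """Cost of transcribing `audio_seconds` of audio at the quoted price.

    Assumption: billing rounds up to whole 15-second blocks.
    """
    blocks = math.ceil(audio_seconds / BLOCK_SECONDS)
    return round(blocks * GOOGLE_USD_PER_BLOCK, 3)


def google_minutes_for_budget(budget_usd):
    """Minutes of audio that `budget_usd` buys at the quoted per-block price."""
    usd_per_minute = GOOGLE_USD_PER_BLOCK * (60 / BLOCK_SECONDS)
    return budget_usd / usd_per_minute


if __name__ == "__main__":
    print(f"One minute: ${google_cost_usd(60)}")
    print(f"One hour: ${google_cost_usd(3600)}")
    print(f"Minutes covered by the plan price: {google_minutes_for_budget(ELEVENLABS_PLAN_USD):.1f}")
```

At the quoted price, one hour of audio is 240 blocks, or 240 × $0.016 = $3.84, and $5 covers roughly 78 minutes.
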