AI Tool Radar · LLM APIs
Groq Cloud
Published by Groq. We open the pricing page daily and record what we see — tiers, prices, and feature lists — verbatim.
- Category: LLM APIs
- Tiers tracked: 14
- Last captured: 9 days ago
- Movements logged: 1
- Vendor: groq.com
- Pricing: groq.com/pricing
Current pricing
Tiers, as published.
The amounts and tier names are pulled verbatim from Groq Cloud’s pricing page. Custom means the vendor quotes on request.
Captured
- GPT OSS 20B 128k: Custom
- GPT OSS Safeguard 20B: Custom
- GPT OSS 120B 128k: Custom
- Llama 4 Scout (17Bx16E) 128k: Custom
- Qwen3 32B 131k: Custom
- Llama 3.3 70B Versatile 128k: Custom
- Llama 3.1 8B Instant 128k: Custom
- Minimax M2.5 (Enterprise): Custom
- Qwen3-VL 32B (Enterprise): Custom
- Canopy Labs Orpheus English (TTS): Custom
- Canopy Labs Orpheus Arabic Saudi (TTS): Custom
- Whisper V3 Large (ASR): Custom
- Whisper Large v3 Turbo (ASR): Custom
- Batch API: Custom
Features
37 lines
What each tier advertises.
GPT OSS 20B 128k
- $0.075 per 1M input tokens
- $0.30 per 1M output tokens
- 1,000 TPS
GPT OSS Safeguard 20B
- $0.075 per 1M input tokens
- $0.30 per 1M output tokens
- 1,000 TPS
GPT OSS 120B 128k
- $0.15 per 1M input tokens
- $0.60 per 1M output tokens
- 500 TPS
Llama 4 Scout (17Bx16E) 128k
- $0.11 per 1M input tokens
- $0.34 per 1M output tokens
- 594 TPS
Qwen3 32B 131k
- $0.29 per 1M input tokens
- $0.59 per 1M output tokens
- 662 TPS
Llama 3.3 70B Versatile 128k
- $0.59 per 1M input tokens
- $0.79 per 1M output tokens
- 394 TPS
Llama 3.1 8B Instant 128k
- $0.05 per 1M input tokens
- $0.08 per 1M output tokens
- 840 TPS
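As a rough illustration of how the per-1M-token rates above turn into a bill, here is a minimal sketch in Python. The rates are copied from the captured tiers. The function name `estimate_cost` and the workload sizes are hypothetical, and the sketch assumes input and output tokens are simply billed at their listed rates with no minimums or other adjustments.

```python
# Per-1M-token rates in USD as (input, output), copied from the tiers above.
RATES_PER_MILLION = {
    "GPT OSS 20B 128k": (0.075, 0.30),
    "GPT OSS Safeguard 20B": (0.075, 0.30),
    "GPT OSS 120B 128k": (0.15, 0.60),
    "Llama 4 Scout (17Bx16E) 128k": (0.11, 0.34),
    "Qwen3 32B 131k": (0.29, 0.59),
    "Llama 3.3 70B Versatile 128k": (0.59, 0.79),
    "Llama 3.1 8B Instant 128k": (0.05, 0.08),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload on one tier (hypothetical helper)."""
    input_rate, output_rate = RATES_PER_MILLION[tier]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost("Llama 3.3 70B Versatile 128k", 2_000_000, 500_000)
print(f"${cost:.3f}")  # prints $1.575
```

The same call with a different tier name gives a quick side-by-side comparison for one workload.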
Minimax M2.5 (Enterprise)
- Enterprise-only
- Contact sales
Qwen3-VL 32B (Enterprise)
- Enterprise-only
- Contact sales
Canopy Labs Orpheus English (TTS)
- $22.00 per 1M characters
- 100 characters/s
Canopy Labs Orpheus Arabic Saudi (TTS)
- $40.00 per 1M characters
- 100 characters/s
Whisper V3 Large (ASR)
- $0.111 per hour transcribed
- 217x speed factor
Whisper Large v3 Turbo (ASR)
- $0.04 per hour transcribed
- 228x speed factor
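The speech tiers are priced per 1M characters (TTS) and per hour transcribed (ASR) rather than per token. The sketch below shows one way to estimate those costs from the captured rates. The helper names `tts_cost` and `asr_cost` and the example inputs are hypothetical, and the sketch assumes straight pro-rata billing, ignoring any minimum charge per request.

```python
# USD per 1M characters, copied from the TTS tiers above.
TTS_USD_PER_MILLION_CHARS = {
    "Canopy Labs Orpheus English": 22.00,
    "Canopy Labs Orpheus Arabic Saudi": 40.00,
}

# USD per hour of audio transcribed, copied from the ASR tiers above.
ASR_USD_PER_HOUR = {
    "Whisper V3 Large": 0.111,
    "Whisper Large v3 Turbo": 0.04,
}


def tts_cost(model: str, characters: int) -> float:
    """Estimate the USD cost of synthesizing the given number of characters."""
    return characters * TTS_USD_PER_MILLION_CHARS[model] / 1_000_000


def asr_cost(model: str, audio_seconds: float) -> float:
    """Estimate the USD cost of transcribing audio of the given length."""
    return audio_seconds / 3600 * ASR_USD_PER_HOUR[model]


# Hypothetical inputs: 50,000 characters of speech, 90 minutes of audio.
print(f"${tts_cost('Canopy Labs Orpheus English', 50_000):.2f}")
print(f"${asr_cost('Whisper Large v3 Turbo', 90 * 60):.2f}")
```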
Batch API
- 50% lower cost than standard
- Async processing
- 24-hour to 7 day window
- No impact to standard rate limits
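To make the Batch API's "50% lower cost than standard" line concrete, here is a small worked example in Python. It assumes the discount applies uniformly to the standard per-token rates of the chosen model, which the captured page does not spell out. The workload sizes and the names `standard_cost` and `batch_cost` are hypothetical.

```python
# Standard GPT OSS 120B 128k rates in USD per 1M tokens, from the tiers above.
INPUT_RATE = 0.15
OUTPUT_RATE = 0.60

# "50% lower cost than standard", assumed to apply to both rates equally.
BATCH_DISCOUNT = 0.50


def standard_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost at the standard (non-batch) rates."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000


def batch_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost for the same workload with the batch discount applied."""
    return standard_cost(input_tokens, output_tokens) * (1 - BATCH_DISCOUNT)


# Hypothetical workload: 10M input tokens and 2M output tokens.
standard = standard_cost(10_000_000, 2_000_000)
batch = batch_cost(10_000_000, 2_000_000)
print(f"standard ${standard:.2f}, batch ${batch:.2f}")
```

The trade-off for the lower price is latency: per the listing, batch jobs run asynchronously within a 24-hour to 7 day window.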