AI Tool Radar · LLM APIs

Groq Cloud

Groq Cloud is published by Groq. We open its pricing page daily and record what we see (tiers, prices, and feature lists) verbatim.

Category: LLM APIs
Tiers tracked: 14
Last captured: 9 days ago
Movements logged: 1

Vendor: groq.com
Pricing: groq.com/pricing

Current pricing

Tiers, as published.

The amounts and tier names are pulled verbatim from Groq Cloud’s pricing page. Custom means the vendor quotes on request.

Captured

  • GPT OSS 20B 128k: Custom
  • GPT OSS Safeguard 20B: Custom
  • GPT OSS 120B 128k: Custom
  • Llama 4 Scout (17Bx16E) 128k: Custom
  • Qwen3 32B 131k: Custom
  • Llama 3.3 70B Versatile 128k: Custom
  • Llama 3.1 8B Instant 128k: Custom
  • Minimax M2.5 (Enterprise): Custom
  • Qwen3-VL 32B (Enterprise): Custom
  • Canopy Labs Orpheus English (TTS): Custom
  • Canopy Labs Orpheus Arabic Saudi (TTS): Custom
  • Whisper V3 Large (ASR): Custom
  • Whisper Large v3 Turbo (ASR): Custom
  • Batch API: Custom

Features

37 lines

What each tier advertises.

GPT OSS 20B 128k

  • $0.075 per 1M input tokens
  • $0.30 per 1M output tokens
  • 1,000 TPS

GPT OSS Safeguard 20B

  • $0.075 per 1M input tokens
  • $0.30 per 1M output tokens
  • 1,000 TPS

GPT OSS 120B 128k

  • $0.15 per 1M input tokens
  • $0.60 per 1M output tokens
  • 500 TPS

Llama 4 Scout (17Bx16E) 128k

  • $0.11 per 1M input tokens
  • $0.34 per 1M output tokens
  • 594 TPS

Qwen3 32B 131k

  • $0.29 per 1M input tokens
  • $0.59 per 1M output tokens
  • 662 TPS

Llama 3.3 70B Versatile 128k

  • $0.59 per 1M input tokens
  • $0.79 per 1M output tokens
  • 394 TPS

Llama 3.1 8B Instant 128k

  • $0.05 per 1M input tokens
  • $0.08 per 1M output tokens
  • 840 TPS
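
The per-token prices listed above turn into a request cost by simple arithmetic. Here is a minimal sketch in Python, assuming billing is linear in token count with no minimums or rounding (the captured page does not say either way). The `PRICES` table and the `estimate_cost` function are illustrative names of ours, not part of any Groq SDK.

```python
# USD per 1M tokens (input, output), copied from the tiers listed above.
# The dictionary layout is our own convention, not a vendor format.
PRICES = {
    "GPT OSS 20B 128k": (0.075, 0.30),
    "GPT OSS 120B 128k": (0.15, 0.60),
    "Llama 4 Scout (17Bx16E) 128k": (0.11, 0.34),
    "Qwen3 32B 131k": (0.29, 0.59),
    "Llama 3.3 70B Versatile 128k": (0.59, 0.79),
    "Llama 3.1 8B Instant 128k": (0.05, 0.08),
}


def estimate_cost(tier, input_tokens, output_tokens):
    """Estimate USD cost, assuming linear per-token billing (an assumption)."""
    price_in, price_out = PRICES[tier]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

For example, `estimate_cost("Llama 3.3 70B Versatile 128k", 2_000_000, 500_000)` works out to 2 × $0.59 + 0.5 × $0.79, or about $1.58.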

Minimax M2.5 (Enterprise)

  • Enterprise-only
  • Contact sales

Qwen3-VL 32B (Enterprise)

  • Enterprise-only
  • Contact sales

Canopy Labs Orpheus English (TTS)

  • $22.00 per 1M characters
  • 100 characters/s

Canopy Labs Orpheus Arabic Saudi (TTS)

  • $40.00 per 1M characters
  • 100 characters/s

Whisper V3 Large (ASR)

  • $0.111 per hour transcribed
  • 217x speed factor

Whisper Large v3 Turbo (ASR)

  • $0.04 per hour transcribed
  • 228x speed factor
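
The speech tiers above are priced per character (TTS) and per hour of audio (ASR) rather than per token. The sketch below shows the arithmetic under two assumptions of ours: billing is linear with no minimum charge per request, and the "speed factor" means audio duration divided by processing time. Neither is stated in the captured text, and the function names are hypothetical.

```python
def tts_cost(characters, usd_per_million_chars):
    """USD cost of synthesizing `characters` characters of text."""
    return characters * usd_per_million_chars / 1_000_000


def asr_cost(audio_seconds, usd_per_hour):
    """USD cost of transcribing `audio_seconds` seconds of audio."""
    return audio_seconds / 3600 * usd_per_hour


def asr_processing_seconds(audio_seconds, speed_factor):
    """Rough processing time, assuming speed factor = audio time / wall time."""
    return audio_seconds / speed_factor
```

With the listed prices, 50,000 characters on Orpheus English comes to `tts_cost(50_000, 22.00)`, or $1.10. Ten hours of audio on Whisper V3 Large comes to `asr_cost(36_000, 0.111)`, or $1.11, and under the speed-factor assumption `asr_processing_seconds(36_000, 217)` gives roughly 166 seconds.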

Batch API

  • 50% lower cost than standard
  • Async processing
  • 24-hour to 7 day window
  • No impact to standard rate limits
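
To see what "50% lower cost than standard" means in practice, here is a small sketch. It assumes the discount applies uniformly to both input and output token prices, which the captured text does not spell out. The function name and its parameters are hypothetical.

```python
BATCH_DISCOUNT = 0.50  # "50% lower cost than standard", as captured above


def batch_cost(input_tokens, output_tokens, usd_in_per_m, usd_out_per_m):
    """Return (standard, batch) USD cost for one workload.

    Assumes linear per-token billing and a uniform discount on input
    and output prices; both are assumptions, not vendor statements.
    """
    standard = (input_tokens * usd_in_per_m + output_tokens * usd_out_per_m) / 1_000_000
    return standard, standard * (1 - BATCH_DISCOUNT)
```

For a workload of 10M input and 2M output tokens at $0.15 and $0.60 per 1M tokens (the GPT OSS 120B 128k prices), `batch_cost(10_000_000, 2_000_000, 0.15, 0.60)` gives about $2.70 standard and about $1.35 through the Batch API.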

Change history

1 movement

Everything we’ve seen move.

  • Newly tracked
