AI API Pricing: Cost per Token 2026

How much does it cost to use GPT, Claude, Gemini and other LLM APIs? Compare pricing per million tokens — input and output rates for every major model.

Last updated: June 01, 2026

APIs listed: 626
With free tier: 139
Cheapest (input/1M): Free
Most expensive (input/1M): $150.00

How to read the table: prices are per million tokens (input = what you send; output = the model's response). In English, 1,000 tokens ≈ 750 words, slightly more than one A4 page. Prices verified on each company's official pricing page.

Price per Million Tokens — Paid APIs

# | Model | Input USD/1M
1 | Jamba 1.7 Mini | Free
2 | Jamba Reasoning 3B | Free
3 | Qwen Chat 14B | Free
4 | Qwen Chat 72B | Free
5 | Qwen1.5 Chat 110B | Free
6 | Qwen2 Instruct 72B | Free
7 | Qwen2.5 Coder 32B Instruct | Free
8 | Qwen2.5 Coder Instruct 7B | Free
9 | Qwen2.5 Instruct 32B | Free
10 | Qwen3 4B 2507 (Reasoning) | Free
11 | Qwen3 4B 2507 Instruct | Free
12 | Qwen3 VL 4B (Reasoning) | Free
13 | Qwen3 VL 4B Instruct | Free
14 | Qwen3.5 9B (Reasoning) | Free
15 | QwQ 32B-Preview | Free
16 | Llama 3.1 Tulu3 405B | Free
17 | Molmo 7B-D | Free
18 | Molmo2-8B | Free
19 | OLMo 2 32B | Free
20 | OLMo 2 7B | Free
21 | Olmo 3 7B Think | Free
22 | Olmo 3.1 32B Think | Free
23 | Olmo 3 32B Think | Free
24 | Olmo 3.1 32B Instruct | Free
25 | Claude 2.0 | Free
26 | Claude 2.1 | Free
27 | Claude 3.7 Sonnet (thinking) | Free
28 | Claude Instant | Free
29 | ERNIE 5.0 Thinking Preview | Free
30 | Doubao Seed Code | Free
31 | Doubao Seed Code | Free
32 | JT-35B-Flash | Free
33 | JT-35B-Flash | Free
34 | JT-MINI | Free
35 | Command A+ | Free
36 | Tiny Aya Global | Free
37 | DBRX Instruct | Free
38 | DeepSeek Coder V2 Lite Instruct | Free
39 | DeepSeek LLM 67B Chat (V1) | Free
40 | DeepSeek R1 0528 Qwen3 8B | Free
41 | DeepSeek R1 Distill Llama 8B | Free
42 | DeepSeek R1 Distill Qwen 1.5B | Free
43 | DeepSeek R1 Distill Qwen 14B | Free
44 | DeepSeek V3.2 Speciale | Free
45 | DeepSeek-Coder-V2 | Free
46 | DeepSeek-V2-Chat | Free
47 | DeepSeek-V2.5 | Free
48 | DeepSeek-V2.5 (Dec '24) | Free
49 | DeepSeek: R1 Distill Qwen 32B | Free
50 | Gemini 1.0 Pro | Free
51 | Gemini 1.0 Ultra | Free
52 | Gemini 1.5 Flash (May '24) | Free
53 | Gemini 1.5 Flash (Sep '24) | Free
54 | Gemini 1.5 Flash-8B | Free
55 | Gemini 1.5 Pro (May '24) | Free
56 | Gemini 1.5 Pro (Sep '24) | Free
57 | Gemini 2.0 Flash (experimental) | Free
58 | Gemini 2.0 Flash Thinking Experimental (Dec '24) | Free
59 | Gemini 2.0 Flash Thinking Experimental (Jan '25) | Free
60 | Gemini 2.0 Flash-Lite (Feb '25) | Free
61 | Gemini 2.0 Flash-Lite (Preview) | Free
62 | Gemini 2.0 Pro Experimental (Feb '25) | Free
63 | Gemini 2.5 Flash Preview (Non-reasoning) | Free
64 | Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Free
65 | Gemini 2.5 Pro Preview (Mar' 25) | Free
66 | Gemini 3 Deep Think | Free
67 | Gemma 3 1B Instruct | Free
68 | Gemma 3 270M | Free
69 | Gemma 3n E2B Instruct | Free
70 | Gemma 3n E4B Instruct Preview (May '25) | Free
71 | Gemma 4 E2B (Non-reasoning) | Free
72 | Gemma 4 E2B (Reasoning) | Free
73 | Gemma 4 E4B (Non-reasoning) | Free
74 | Gemma 4 E4B (Reasoning) | Free
75 | PALM-2 | Free
76 | Granite 4.0 1B | Free
77 | Granite 4.0 350M | Free
78 | Granite 4.0 H 1B | Free
79 | Granite 4.0 H 350M | Free
80 | Granite 4.0 Micro | Free
81 | Granite 4.1 30B | Free
82 | Granite 4.1 3B | Free
83 | Ling-1T | Free
84 | Ling-mini-2.0 | Free
85 | Ring-1T | Free
86 | Kimi Linear 48B A3B Instruct | Free
87 | Mi:dm K 2.5 Pro | Free
88 | Mi:dm K 2.5 Pro Preview | Free
89 | EXAONE 4.5 33B | Free
90 | K-EXAONE (Reasoning) | Free
91 | Exaone 4.0 1.2B (Non-reasoning) | Free
92 | EXAONE 4.0 32B (Non-reasoning) | Free
93 | EXAONE 4.0 32B (Reasoning) | Free
94 | LFM 40B | Free
95 | LFM2 1.2B | Free
96 | LFM2 2.6B | Free
97 | LFM2 8B A1B | Free
98 | LFM2.5-1.2B-Instruct | Free
99 | LFM2.5-1.2B-Thinking | Free
100 | LFM2.5-VL-1.6B | Free
101 | LongCat Flash Lite | Free
102 | K2 Think V2 | Free
103 | K2-V2 (high) | Free
104 | K2-V2 (medium) | Free
105 | Llama 2 Chat 13B | Free
106 | Llama 2 Chat 70B | Free
107 | Llama 65B | Free
108 | Muse Spark | Free
109 | Phi-3 Mini Instruct 3.8B | Free
110 | Phi-4 Mini Instruct | Free
111 | Phi-4 Multimodal Instruct | Free
112 | MiniMax M1 40k | Free
113 | Devstral 2 | Free
114 | Devstral Small (May '25) | Free
115 | Magistral Medium 1 | Free
116 | Magistral Small 1 | Free
117 | Magistral Small 1.2 | Free
118 | Mixtral 8x22B Instruct | Free
119 | Magistral Medium 1.2 | Free
120 | Mistral: Saba | Free
121 | Motif-2-12.7B-Reasoning | Free
122 | Nanbeige4.1-3B | Free
123 | HyperCLOVA X SEED Think (32B) | Free
124 | DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Free
125 | DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Free
126 | Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | Free
127 | Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | Free
128 | Llama 3.3 Nemotron Super 49B v1 (Reasoning) | Free
129 | Nemotron Cascade 2 30B A3B | Free
130 | NVIDIA Nemotron 3 Nano 4B | Free
131 | GPT-3.5 Turbo (0613) | Free
132 | GPT-4.5 (Preview) | Free
133 | GPT-4o (ChatGPT) | Free
134 | GPT-4o (March 2025, chatgpt-4o-latest) | Free
135 | GPT-4o mini Realtime (Dec '24) | Free
136 | GPT-4o Realtime (Dec '24) | Free
137 | GPT-5.5 Pro | Free
138 | o1-mini | Free
139 | MiniCPM-V 4.6 1.3B | Free
140 | Qwen3.5 0.8B (Non-reasoning) | $0.010
141 | Qwen3.5 0.8B (Reasoning) | $0.010
142 | Qwen3.5 2B (Reasoning) | $0.020
143 | Gemma 3n E4B Instruct | $0.020
144 | Mistral: Mistral Nemo | $0.020
145 | Qwen3.5 4B (Non-reasoning) | $0.030
146 | Qwen3.5 4B (Reasoning) | $0.030
147 | Granite 3.3 8B (Non-reasoning) | $0.030
148 | LFM2-24B-A2B | $0.030
149 | Amazon: Nova Micro 1.0 | $0.035
150 | Nova Micro | $0.035
151 | Cohere: Command R7B (12-2024) | $0.037
152 | Qwen: Qwen2.5 7B Instruct | $0.040
153 | Gemma 3 4B | $0.040
154 | NVIDIA Nemotron Nano 9B V2 (Reasoning) | $0.040
155 | Arcee AI: Trinity Mini | $0.045
156 | Llama 3 8B Instruct | $0.045
157 | Granite 4.1 8B | $0.050
158 | Llama 2 Chat 7B | $0.050
159 | Llama 3.2 1B Instruct | $0.050
160 | NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | $0.050
161 | NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | $0.050
162 | GPT-5 Nano | $0.050
163 | GPT-5 nano (minimal) | $0.050
164 | gpt-oss-20b | $0.050
165 | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | $0.055
166 | Amazon: Nova Lite 1.0 | $0.060
167 | Nova Lite | $0.060
168 | Gemma 3n 4B | $0.060
169 | Granite 4.0 H Small | $0.060
170 | MythoMax 13B | $0.060
171 | Baidu: ERNIE 4.5 21B A3B Thinking | $0.070
172 | ByteDance Seed: Seed 1.6 Flash | $0.075
173 | Gemini 2.0 Flash Lite | $0.075
174 | Mistral Small 3 | $0.075
175 | Mistral: Mistral Small 3.2 24B | $0.075
176 | Nemotron 3 Nano Omni 30B A3B Reasoning | $0.075
177 | gpt-oss-safeguard-20b | $0.075
178 | Qwen: Qwen3 30B A3B Instruct 2507 | $0.080
179 | Qwen: Qwen3 30B A3B Thinking 2507 | $0.080
180 | Mistral Small 3.2 | $0.087
181 | Qwen3 30B A3B (Reasoning) | $0.090
182 | Gemma 3 12B | $0.090
183 | Qwen3.5 Omni Flash | $0.100
184 | Olmo 3 7B Instruct | $0.100
185 | ByteDance: UI-TARS 7B | $0.100
186 | Gemini 2.5 Flash Lite | $0.100
187 | Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | $0.100
188 | Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | $0.100
189 | Ling 2.6 Flash | $0.100
190 | Llama 3.1 8B Instruct | $0.100
191 | Devstral Small (Jul '25) | $0.100
192 | Devstral Small 2 | $0.100
193 | Ministral 3 3B | $0.100
194 | Mistral: Devstral Small 1.1 | $0.100
195 | Mistral: Ministral 3 3B 2512 | $0.100
196 | Mistral: Mistral Small Creative | $0.100
197 | Mistral: Voxtral Small 24B 2507 | $0.100
198 | Llama Nemotron Super 49B v1.5 (Non-reasoning) | $0.100
199 | Llama Nemotron Super 49B v1.5 (Reasoning) | $0.100
200 | GPT-4.1 Nano | $0.100
201 | Mistral Small 3.1 | $0.105
202 | Qwen3 0.6B (Non-reasoning) | $0.110
203 | Qwen3 0.6B (Reasoning) | $0.110
204 | Qwen3 1.7B (Non-reasoning) | $0.110
205 | Qwen3 1.7B (Reasoning) | $0.110
206 | Qwen3 4B (Non-reasoning) | $0.110
207 | Qwen3 4B (Reasoning) | $0.110
208 | Qwen3 8B (Reasoning) | $0.110
209 | Gemma 3 27B | $0.110
210 | Mistral: Mistral 7B Instruct v0.1 | $0.110
211 | Microsoft: Phi 4 | $0.125
212 | Gemma 4 26B A4B | $0.130
213 | Nous: Hermes 4 70B | $0.130
214 | Hermes 4 - Llama-3.1 70B (Non-reasoning) | $0.130
215 | Hermes 4 - Llama-3.1 70B (Reasoning) | $0.130
216 | Nex AGI: DeepSeek V3.1 Nex N1 | $0.135
217 | Baidu: ERNIE 4.5 VL 28B A3B | $0.140
218 | DeepSeek V4 Flash | $0.140
219 | Gemma 4 31B | $0.140
220 | Ling-flash-2.0 | $0.140
221 | Ring-flash-2.0 | $0.140
222 | NousResearch: Hermes 2 Pro - Llama-3 8B | $0.140
223 | Qwen: Qwen3 235B A22B Thinking 2507 | $0.149
224 | Qwen3 30B A3B 2507 Instruct | $0.150
225 | Qwen3 32B (Non-reasoning) | $0.150
226 | Qwen3 32B (Reasoning) | $0.150
227 | EssentialAI: Rnj 1 Instruct | $0.150
228 | Gemini 2.0 Flash | $0.150
229 | Llama 3.2 3B Instruct | $0.150
230 | Ministral 3 8B | $0.150
231 | Mistral: Ministral 3 8B 2512 | $0.150
232 | GPT-4o-mini (2024-07-18) | $0.150
233 | GPT-4o-mini Search Preview | $0.150
234 | gpt-oss-120b | $0.150
235 | OpenAI: GPT-4o-mini | $0.150
236 | Llama 4 Scout | $0.170
237 | Qwen: Qwen3 VL 8B Instruct | $0.180
238 | Qwen3 8B (Non-reasoning) | $0.180
239 | Qwen3 VL 8B (Reasoning) | $0.180
240 | Arcee AI: Spotlight | $0.180
241 | Llama Guard 4 12B | $0.180
242 | Qwen: Qwen3 Coder 30B A3B Instruct | $0.190
243 | Jamba 1.5 Mini | $0.200
244 | Jamba 1.6 Mini | $0.200
245 | Qwen: Qwen3 VL 30B A3B Instruct | $0.200
246 | Qwen3 VL 30B A3B (Reasoning) | $0.200
247 | MiniMax: MiniMax-01 | $0.200
248 | Ministral 3 14B | $0.200
249 | Mistral 7B Instruct | $0.200
250 | Mistral Small (Sep '24) | $0.200
251 | Mistral: Ministral 3 14B 2512 | $0.200
252 | Mistral: Mistral Small 4 | $0.200
253 | NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | $0.200
254 | NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | $0.200
255 | GPT-5.4 Nano | $0.200
256 | Seed-OSS-36B-Instruct | $0.210
257 | Arcee AI: Trinity Large Thinking | $0.220
258 | DeepSeek V3 | $0.229
259 | Qwen3 14B (Non-reasoning) | $0.235
260 | Qwen3 14B (Reasoning) | $0.235
261 | Trinity Large Thinking | $0.235
262 | Llama 3.2 11B Vision Instruct | $0.245
263 | Qwen: Qwen2.5 VL 72B Instruct | $0.250
264 | Qwen3 Omni 30B A3B (Reasoning) | $0.250
265 | Qwen3 Omni 30B A3B Instruct | $0.250
266 | Anthropic: Claude 3 Haiku | $0.250
267 | ByteDance Seed: Seed-2.0-Lite | $0.250
268 | Gemini 3.1 Flash Lite | $0.250
269 | Gemini 3.1 Flash Lite Preview | $0.250
270 | Inception: Mercury 2 | $0.250
271 | GPT-5 Mini | $0.250
272 | GPT-5 mini (minimal) | $0.250
273 | GPT-5.1-Codex-Mini | $0.250
274 | DeepSeek V3.1 Terminus | $0.270
275 | DeepSeek V3.2 Exp | $0.270
276 | DeepSeek V3.2 Exp (Non-reasoning) | $0.275
277 | DeepSeek V3.2 Exp (Reasoning) | $0.275
278 | Qwen3 30B A3B 2507 (Reasoning) | $0.280
279 | Baidu: ERNIE 4.5 300B A47B | $0.280
280 | Qwen: Qwen3 VL 235B A22B Instruct | $0.300
281 | Qwen3 Coder 480B A35B Instruct | $0.300
282 | Amazon: Nova 2 Lite | $0.300
283 | Nova 2.0 Lite (high) | $0.300
284 | Nova 2.0 Omni (low) | $0.300
285 | Nova 2.0 Omni (medium) | $0.300
286 | Nova 2.0 Omni (Non-reasoning) | $0.300
287 | Gemini 2.5 Flash | $0.300
288 | Gemini 2.5 Flash Preview (Reasoning) | $0.300
289 | Nano Banana (Gemini 2.5 Flash Image) | $0.300
290 | Ling-2.6-1T | $0.300
291 | Ring-2.6-1T | $0.300
292 | KAT-Coder-Pro V1 | $0.300
293 | Kwaipilot: KAT-Coder-Pro V2 | $0.300
294 | MiniMax-M2 | $0.300
295 | MiniMax: MiniMax M2-her | $0.300
296 | MiniMax: MiniMax M2.1 | $0.300
297 | MiniMax: MiniMax M2.5 | $0.300
298 | MiniMax: MiniMax M2.7 | $0.300
299 | Mistral: Codestral 2508 | $0.300
300 | Nous: Hermes 3 70B Instruct | $0.300
301 | Hermes 3 - Llama-3.1 70B | $0.300
302 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | $0.300
303 | Llama 4 Maverick | $0.350
304 | Mistral: Mistral Small 3.1 24B | $0.350
305 | Qwen2.5 72B Instruct | $0.360
306 | Qwen: Qwen3 235B A22B Instruct 2507 | $0.400
307 | Qwen3.5 Omni Plus | $0.400
308 | MiniMax: MiniMax M1 | $0.400
309 | Mistral: Devstral 2 2512 | $0.400
310 | Mistral: Devstral Medium | $0.400
311 | Mistral: Mistral Medium 3 | $0.400
312 | Mistral: Mistral Medium 3.1 | $0.400
313 | GPT-4.1 Mini | $0.400
314 | Baidu: ERNIE 4.5 VL 424B A47B | $0.420
315 | DeepSeek V4 Pro | $0.435
316 | Mistral: Mixtral 8x7B Instruct | $0.450
317 | Llama Guard 3 8B | $0.480
318 | Qwen: Qwen3 Next 80B A3B Instruct | $0.500
319 | Qwen3 Next 80B A3B (Reasoning) | $0.500
320 | Arcee AI: Coder Large | $0.500
321 | Command-R (Mar '24) | $0.500
322 | DeepSeek V3.2 | $0.500
323 | Gemini 3 Flash Preview | $0.500
324 | Gemini 3 Flash Preview (Non-reasoning) | $0.500
325 | Gemini 3 Flash Preview (Reasoning) | $0.500
326 | Nano Banana 2 (Gemini 3.1 Flash Image Preview) | $0.500
327 | GPT-3.5 Turbo | $0.500
328 | GPT-3.5 Turbo | $0.500
329 | MiniMax M1 80k | $0.550
330 | Llama 3.1 70B Instruct | $0.560
331 | MoonshotAI: Kimi K2 0711 | $0.570
332 | Llama 3.3 70B Instruct | $0.580
333 | MoonshotAI: Kimi K2.5 | $0.580
334 | Kimi K2 | $0.585
335 | DeepSeek V3.1 | $0.590
336 | Kimi K2 Thinking | $0.600
337 | MoonshotAI: Kimi K2 0905 | $0.600
338 | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | $0.600
339 | GPT Audio Mini | $0.600
340 | WizardLM-2 8x22B | $0.620
341 | Gemma 2 27B | $0.650
342 | Llama 3 70B Instruct | $0.650
343 | QwQ 32B | $0.660
344 | Qwen: Qwen3 VL 32B Instruct | $0.700
345 | Qwen3 235B A22B (Reasoning) | $0.700
346 | Qwen3 VL 32B (Reasoning) | $0.700
347 | DeepSeek: R1 | $0.700
348 | R1 Distill Llama 70B | $0.700
349 | Arcee AI: Virtuoso Large | $0.750
350 | Mancer: Weaver (alpha) | $0.750
351 | GPT-5.4 Mini | $0.750
352 | AionLabs: Aion-2.0 | $0.800
353 | AionLabs: Aion-RP 1.0 (8B) | $0.800
354 | AlfredPros: CodeLLaMa 7B Instruct Solidity | $0.800
355 | Amazon: Nova Pro 1.0 | $0.800
356 | Nova Pro | $0.800
357 | Morph: Morph V3 Fast | $0.800
358 | Qwen3 VL 235B A22B (Reasoning) | $0.840
359 | Arcee AI: Maestro Reasoning | $0.900
360 | Morph: Morph V3 Large | $0.900
361 | MoonshotAI: Kimi K2.6 | $0.950
362 | Claude 3.5 Haiku | $1.00
363 | Mistral Small (Feb '24) | $1.00
364 | Nous: Hermes 3 405B Instruct | $1.00
365 | Nous: Hermes 4 405B | $1.00
366 | Hermes 4 - Llama-3.1 405B (Non-reasoning) | $1.00
367 | Hermes 4 - Llama-3.1 405B (Reasoning) | $1.00
368 | o3 Mini | $1.10
369 | o3 Mini High | $1.10
370 | o4 Mini | $1.10
371 | o4 Mini High | $1.10
372 | DeepSeek V3 0324 | $1.20
373 | Qwen3 Max (Preview) | $1.20
374 | Qwen3 Max Thinking (Preview) | $1.20
375 | Llama 3.1 Nemotron 70B Instruct | $1.20
376 | Nova 2.0 Pro Preview (medium) | $1.25
377 | Claude 4.5 Haiku (Reasoning) | $1.25
378 | Claude Haiku 4.5 | $1.25
379 | Cogito v2.1 (Reasoning) | $1.25
380 | Deep Cogito: Cogito v2.1 671B | $1.25
381 | Gemini 2.5 Pro | $1.25
382 | Gemini 2.5 Pro Preview (May' 25) | $1.25
383 | Gemini 2.5 Pro Preview 05-06 | $1.25
384 | Gemini 2.5 Pro Preview 06-05 | $1.25
385 | GPT-5 | $1.25
386 | GPT-5 (ChatGPT) | $1.25
387 | GPT-5 (minimal) | $1.25
388 | GPT-5 Chat | $1.25
389 | GPT-5 Codex | $1.25
390 | GPT-5.1 | $1.25
391 | GPT-5.1 Chat | $1.25
392 | GPT-5.1-Codex | $1.25
393 | GPT-5.1-Codex-Max | $1.25
394 | Qwen3.6 Max Preview | $1.30
395 | Llama 3.2 Instruct 90B (Vision) | $1.38
396 | Gemini 3.5 Flash (minimal) | $1.50
397 | Google: Gemini 3.5 Flash | $1.50
398 | Mistral: Mistral Medium 3.5 | $1.50
399 | Qwen2.5 Max | $1.60
400 | DeepSeek R1 (Jan '25) | $1.68
401 | GPT-5.2 | $1.75
402 | GPT-5.2 Chat | $1.75
403 | GPT-5.2-Codex | $1.75
404 | GPT-5.3 Chat | $1.75
405 | GPT-5.3-Codex | $1.75
406 | AI21: Jamba Large 1.7 | $2.00
407 | Jamba 1.5 Large | $2.00
408 | Jamba 1.6 Large | $2.00
409 | Gemini 3 Pro Preview (high) | $2.00
410 | Gemini 3 Pro Preview (low) | $2.00
411 | Gemini 3.1 Pro Preview | $2.00
412 | Gemini 3.1 Pro Preview Custom Tools | $2.00
413 | Nano Banana Pro (Gemini 3 Pro Image Preview) | $2.00
414 | Mistral Large 2 (Jul '24) | $2.00
415 | Mistral Large 2 (Nov '24) | $2.00
416 | Mistral Large | $2.00
417 | Mistral: Mixtral 8x22B Instruct | $2.00
418 | Mistral: Pixtral Large 2411 | $2.00
419 | GPT-4.1 | $2.00
420 | o3 | $2.00
421 | o4 Mini Deep Research | $2.00
422 | Qwen3.7 Max | $2.50
423 | Amazon: Nova Premier 1.0 | $2.50
424 | Cohere: Command R+ (08-2024) | $2.50
425 | Inflection: Inflection 3 Pi | $2.50
426 | Inflection: Inflection 3 Productivity | $2.50
427 | GPT Audio | $2.50
428 | GPT-4o (2024-08-06) | $2.50
429 | GPT-4o (2024-11-20) | $2.50
430 | GPT-4o Audio | $2.50
431 | GPT-4o Search Preview | $2.50
432 | GPT-5 Image Mini | $2.50
433 | GPT-5.4 | $2.50
434 | OpenAI: GPT-4o | $2.50
435 | Llama 3.1 Instruct 405B | $2.75
436 | Mistral Medium | $2.75
437 | Claude 3 Sonnet | $3.00
438 | Claude Sonnet 4.5 | $3.00
439 | Command-R+ (Apr '24) | $3.00
440 | Magnum v4 72B | $3.00
441 | OpenAI: GPT-3.5 Turbo 16k | $3.00
442 | Claude 3.5 Sonnet (June '24) | $3.75
443 | Claude 3.5 Sonnet (Oct '24) | $3.75
444 | Claude 3.7 Sonnet | $3.75
445 | Claude 4 Sonnet (Reasoning) | $3.75
446 | Claude 4.5 Sonnet (Non-reasoning) | $3.75
447 | Claude 4.5 Sonnet (Reasoning) | $3.75
448 | Claude Sonnet 4 | $3.75
449 | Claude Sonnet 4.6 | $3.75
450 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | $3.75
451 | Claude Sonnet 4.6 (Non-reasoning, Low Effort) | $3.75
452 | Goliath 120B | $3.75
453 | AionLabs: Aion-1.0 | $4.00
454 | Mistral Large 3 | $4.00
455 | GPT Chat Latest | $5.00
456 | GPT-5.5 | $5.00
457 | GPT-5.5 Instant (May 2026) | $5.00
458 | OpenAI: GPT-4o (2024-05-13) | $5.00
459 | Claude Opus 4.5 | $6.25
460 | Claude Opus 4.5 (Reasoning) | $6.25
461 | Claude Opus 4.6 | $6.25
462 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | $6.25
463 | Claude Opus 4.7 | $6.25
464 | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) | $6.25
465 | GPT-5.4 Image 2 | $8.00
466 | Anthropic: Claude Opus 4.8 (Fast) | $10.00
467 | GPT-4 Turbo | $10.00
468 | GPT-4 Turbo Preview | $10.00
469 | GPT-5 Image | $10.00
470 | o3 Deep Research | $10.00
471 | OpenAI: GPT-4 Turbo (older v1106) | $10.00
472 | Claude Opus 4.1 | $15.00
473 | GPT-5 Pro | $15.00
474 | o1 | $15.00
475 | o1-preview | $16.50
476 | Claude 3 Opus | $18.75
477 | Claude 4 Opus (Reasoning) | $18.75
478 | Claude 4.1 Opus (Non-reasoning) | $18.75
479 | Claude 4.1 Opus (Reasoning) | $18.75
480 | Claude Opus 4 | $18.75
481 | o3 Pro | $20.00
482 | GPT-5.2 Pro | $21.00
483 | Claude Opus 4.6 (Fast) | $30.00
484 | Claude Opus 4.7 (Fast) | $30.00
485 | GPT-5.4 Pro | $30.00
486 | OpenAI: GPT-4 | $30.00
487 | o1-pro | $150.00

APIs with Free Tier

These models offer free API access (with rate limits). Ideal for prototypes and low-volume projects.

Model | Provider | Price
Claude 2.0 | Anthropic | Free
Claude 2.1 | Anthropic | Free
Claude 3.7 Sonnet (thinking) | Anthropic | Free
Claude Instant | Anthropic | Free
Command A+ | Cohere | Free
DBRX Instruct | Databricks | Free
DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Nous Research | Free
DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Nous Research | Free
DeepSeek Coder V2 Lite Instruct | DeepSeek | Free
DeepSeek LLM 67B Chat (V1) | DeepSeek | Free
DeepSeek R1 0528 Qwen3 8B | DeepSeek | Free
DeepSeek R1 Distill Llama 8B | DeepSeek | Free
DeepSeek R1 Distill Qwen 1.5B | DeepSeek | Free
DeepSeek R1 Distill Qwen 14B | DeepSeek | Free
DeepSeek V3.2 Speciale | DeepSeek | Free
DeepSeek-Coder-V2 | DeepSeek | Free
DeepSeek-V2-Chat | DeepSeek | Free
DeepSeek-V2.5 | DeepSeek | Free
DeepSeek-V2.5 (Dec '24) | DeepSeek | Free
DeepSeek: R1 Distill Qwen 32B | DeepSeek | Free
Devstral 2 | Mistral | Free
Devstral Small (May '25) | Mistral | Free
Doubao Seed Code | ByteDance | Free
Doubao Seed Code | ByteDance Seed | Free
ERNIE 5.0 Thinking Preview | Baidu | Free
Exaone 4.0 1.2B (Non-reasoning) | LG AI Research | Free
EXAONE 4.0 32B (Non-reasoning) | LG AI Research | Free
EXAONE 4.0 32B (Reasoning) | LG AI Research | Free
EXAONE 4.5 33B | LG AI | Free
Gemini 1.0 Pro | Google | Free
Gemini 1.0 Ultra | Google | Free
Gemini 1.5 Flash (May '24) | Google | Free
Gemini 1.5 Flash (Sep '24) | Google | Free
Gemini 1.5 Flash-8B | Google | Free
Gemini 1.5 Pro (May '24) | Google | Free
Gemini 1.5 Pro (Sep '24) | Google | Free
Gemini 2.0 Flash (experimental) | Google | Free
Gemini 2.0 Flash Thinking Experimental (Dec '24) | Google | Free
Gemini 2.0 Flash Thinking Experimental (Jan '25) | Google | Free
Gemini 2.0 Flash-Lite (Feb '25) | Google | Free
Gemini 2.0 Flash-Lite (Preview) | Google | Free
Gemini 2.0 Pro Experimental (Feb '25) | Google | Free
Gemini 2.5 Flash Preview (Non-reasoning) | Google | Free
Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Google | Free
Gemini 2.5 Pro Preview (Mar' 25) | Google | Free
Gemini 3 Deep Think | Google | Free
Gemma 3 1B Instruct | Google | Free
Gemma 3 270M | Google | Free
Gemma 3n E2B Instruct | Google | Free
Gemma 3n E4B Instruct Preview (May '25) | Google | Free
Gemma 4 E2B (Non-reasoning) | Google | Free
Gemma 4 E2B (Reasoning) | Google | Free
Gemma 4 E4B (Non-reasoning) | Google | Free
Gemma 4 E4B (Reasoning) | Google | Free
GPT-3.5 Turbo (0613) | OpenAI | Free
GPT-4.5 (Preview) | OpenAI | Free
GPT-4o (ChatGPT) | OpenAI | Free
GPT-4o (March 2025, chatgpt-4o-latest) | OpenAI | Free
GPT-4o mini Realtime (Dec '24) | OpenAI | Free
GPT-4o Realtime (Dec '24) | OpenAI | Free
GPT-5.5 Pro | OpenAI | Free
Granite 4.0 1B | IBM | Free
Granite 4.0 350M | IBM | Free
Granite 4.0 H 1B | IBM | Free
Granite 4.0 H 350M | IBM | Free
Granite 4.0 Micro | IBM | Free
Granite 4.1 30B | IBM | Free
Granite 4.1 3B | IBM | Free
HyperCLOVA X SEED Think (32B) | Naver | Free
Jamba 1.7 Mini | AI21 Labs | Free
Jamba Reasoning 3B | AI21 Labs | Free
JT-35B-Flash | China Mobile | Free
JT-35B-Flash | China Mobile | Free
JT-MINI | China Mobile | Free
K-EXAONE (Reasoning) | LG AI | Free
K2 Think V2 | MBZUAI Institute of Foundation Models | Free
K2-V2 (high) | MBZUAI Institute of Foundation Models | Free
K2-V2 (medium) | MBZUAI Institute of Foundation Models | Free
Kimi Linear 48B A3B Instruct | Kimi | Free
LFM 40B | Liquid AI | Free
LFM2 1.2B | Liquid AI | Free
LFM2 2.6B | Liquid AI | Free
LFM2 8B A1B | Liquid AI | Free
LFM2.5-1.2B-Instruct | Liquid AI | Free
LFM2.5-1.2B-Thinking | Liquid AI | Free
LFM2.5-VL-1.6B | Liquid AI | Free
Ling-1T | InclusionAI | Free
Ling-mini-2.0 | InclusionAI | Free
Llama 2 Chat 13B | Meta | Free
Llama 2 Chat 70B | Meta | Free
Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | NVIDIA | Free
Llama 3.1 Tulu3 405B | Allen Institute for AI | Free
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | NVIDIA | Free
Llama 3.3 Nemotron Super 49B v1 (Reasoning) | NVIDIA | Free
Llama 65B | Meta | Free
LongCat Flash Lite | LongCat | Free
Magistral Medium 1 | Mistral | Free
Magistral Medium 1.2 | Mistral AI | Free
Magistral Small 1 | Mistral | Free
Magistral Small 1.2 | Mistral | Free
Mi:dm K 2.5 Pro | Korea Telecom | Free
Mi:dm K 2.5 Pro Preview | Korea Telecom | Free
MiniCPM-V 4.6 1.3B | OpenBMB | Free
MiniMax M1 40k | MiniMax | Free
Mistral: Saba | Mistral AI | Free
Mixtral 8x22B Instruct | Mistral | Free
Molmo 7B-D | Allen Institute for AI | Free
Molmo2-8B | Allen Institute for AI | Free
Motif-2-12.7B-Reasoning | Motif Technologies | Free
Muse Spark | Meta | Free
Nanbeige4.1-3B | Nanbeige | Free
Nemotron Cascade 2 30B A3B | NVIDIA | Free
NVIDIA Nemotron 3 Nano 4B | NVIDIA | Free
o1-mini | OpenAI | Free
OLMo 2 32B | Allen Institute for AI | Free
OLMo 2 7B | Allen Institute for AI | Free
Olmo 3 32B Think | AllenAI | Free
Olmo 3 7B Think | Allen Institute for AI | Free
Olmo 3.1 32B Instruct | AllenAI | Free
Olmo 3.1 32B Think | Allen Institute for AI | Free
PALM-2 | Google | Free
Phi-3 Mini Instruct 3.8B | Microsoft | Free
Phi-4 Mini Instruct | Microsoft | Free
Phi-4 Multimodal Instruct | Microsoft | Free
Qwen Chat 14B | Alibaba | Free
Qwen Chat 72B | Alibaba | Free
Qwen1.5 Chat 110B | Alibaba | Free
Qwen2 Instruct 72B | Alibaba | Free
Qwen2.5 Coder 32B Instruct | Alibaba | Free
Qwen2.5 Coder Instruct 7B | Alibaba | Free
Qwen2.5 Instruct 32B | Alibaba | Free
Qwen3 4B 2507 (Reasoning) | Alibaba | Free
Qwen3 4B 2507 Instruct | Alibaba | Free
Qwen3 VL 4B (Reasoning) | Alibaba | Free
Qwen3 VL 4B Instruct | Alibaba | Free
Qwen3.5 9B (Reasoning) | Alibaba | Free
QwQ 32B-Preview | Alibaba | Free
Ring-1T | InclusionAI | Free
Tiny Aya Global | Cohere | Free

AI API Pricing Guide

How Token-Based Pricing Works

Most LLM APIs charge per token processed, split into two categories: input tokens (the text you send: your prompt, context, and history) and output tokens (the model's generated response). Output pricing is typically 2–4× higher than input, as generation requires more compute.

In English, 1,000 tokens correspond to roughly 750 words. A full A4 page of text contains between 600 and 900 tokens.
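The rule of thumb above can be turned into a quick estimator. This is a minimal sketch that assumes the 1,000 tokens per 750 words ratio holds for your text; the function names are illustrative, and real tokenizers vary by model and language, so treat the result as a rough budget figure rather than an exact count.

```python
def estimate_tokens_from_words(word_count: int) -> int:
    """Rough English token estimate: 1,000 tokens per 750 words."""
    return round(word_count * 1000 / 750)


def estimate_tokens(text: str) -> int:
    """Estimate tokens for a text by counting whitespace-separated words."""
    return estimate_tokens_from_words(len(text.split()))


if __name__ == "__main__":
    sample = "How much does it cost to use an LLM API for this prompt?"
    print(estimate_tokens(sample))
```

For billing-grade numbers, the token counts reported in the provider's API response are the ones to rely on.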

Real-World Monthly Cost Example

Consider a company using the GPT-4o API to process 100 emails per day, with an average prompt of 800 tokens and a 300-token response. That's 110,000 tokens/day × 30 days = 3.3 million tokens/month. At $2.50/M input tokens and $10/M output:

  • Input: 2.4M tokens × $2.50/M = $6.00/mo
  • Output: 0.9M tokens × $10/M = $9.00/mo
  • Total: $15.00/mo

The same volume with Claude 3 Haiku ($0.25/M input, $1.25/M output) would cost about $1.73/mo ($0.60 input + $1.13 output), a significant saving when maximum quality isn't critical.
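The arithmetic in this example can be reproduced with a short script. This is a sketch under stated assumptions: the function name and the 30-day month are illustrative, the GPT-4o rates are the ones quoted above, and the $1.25/M output rate used for the Haiku comparison is an assumption, since the text only gives the input rate.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m, days=30):
    """Return (input_cost, output_cost, total) in USD for one month."""
    input_total = requests_per_day * input_tokens * days
    output_total = requests_per_day * output_tokens * days
    input_cost = input_total / 1_000_000 * input_price_per_m
    output_cost = output_total / 1_000_000 * output_price_per_m
    return input_cost, output_cost, input_cost + output_cost


if __name__ == "__main__":
    # 100 emails/day, 800-token prompt, 300-token response (from the example).
    gpt4o = monthly_cost(100, 800, 300, 2.50, 10.00)
    # Assumed output rate of $1.25/M for the Haiku comparison.
    haiku = monthly_cost(100, 800, 300, 0.25, 1.25)
    print("GPT-4o input/output/total:", gpt4o)
    print("Haiku input/output/total:", haiku)
```

Keeping input and output as separate terms makes it easy to see which side dominates the bill when prompt or response sizes change.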

Strategies to Reduce API Costs

1. Pick the right model for each task: simple text classification can use Gemini Flash or Claude Haiku; reserve GPT-4o or Claude Opus for tasks that truly need advanced reasoning.

2. Compress your prompts: avoid repeating unnecessary context. Well-implemented RAG systems send only the relevant passages, not the entire document.

3. Cache responses: if the same prompt is sent repeatedly (e.g., product categorization), store results and reuse them. Providers like Anthropic offer prompt caching at a discount.

4. Use open-source models via third-party APIs: Groq, Together AI, and Fireworks serve models like Llama and Qwen at $0.01–$0.20/M tokens — 10–100× cheaper than proprietary frontier models.
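Strategy 3 can be illustrated with an application-side cache. This is a minimal in-memory sketch, not a provider feature: `ResponseCache` and the `call_model` callable are hypothetical names, and `fake_model` stands in for a real API client. It is separate from the prompt caching discounts that providers offer on their side.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Reuse stored responses for identical (model, prompt) pairs."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model  # hypothetical function that calls the API
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together so different models never share entries.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._call_model(model, prompt)
        self._store[key] = response
        return response


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a paid API call, used only for this demonstration."""
    return f"[{model}] category for: {prompt}"


if __name__ == "__main__":
    cache = ResponseCache(fake_model)
    for _ in range(3):
        cache.complete("small-model", "Categorize: wireless mouse")
    print(cache.hits, cache.misses)
```

Only the first of the three identical requests reaches the model; the other two are served from memory. A production version would likely add expiry and persistent storage.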

Frequently Asked Questions about API Costs

How much does the GPT-4o API cost per token?

The GPT-4o API costs $2.50 per million input tokens and $10.00 per million output tokens (2026 pricing). For a business sending 1 million input tokens per day, the monthly cost would be approximately $75 for input alone. Output tokens are 4× more expensive, so limiting response length affects cost at least as much as trimming prompts.

What is the cheapest AI API available?

Open-source models like Qwen, Llama and Gemma can be accessed via third-party APIs (Groq, Together AI, Fireworks) for a few cents per million tokens, as low as $0.01–$0.10/M tokens. Among proprietary APIs, Gemini Flash and Claude Haiku are the most affordable at $0.08–$0.25/M input tokens.

What are tokens and how do I estimate my project cost?

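Tokens are the text units that LLMs process; in English, 1 token ≈ 4 characters. A standard A4 page has ~600–900 tokens. To estimate cost, price input and output separately: (input tokens × input price + output tokens × output price) ÷ 1,000,000, with prices quoted per 1M tokens. Example: a 500-token prompt and a 300-token response cost 500 × input price + 300 × output price, divided by 1,000,000.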
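A per-request estimate can be sketched from the 4-characters-per-token rule. Because input and output tokens are usually priced at different rates, this sketch applies each rate to its own token count. The function names are illustrative, and the example rates are the GPT-4o figures quoted elsewhere on this page.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text


def tokens_from_chars(text: str) -> int:
    """Rough token count from character length, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


if __name__ == "__main__":
    prompt = "x" * 2000  # roughly a 500-token prompt
    cost = request_cost(tokens_from_chars(prompt), 300, 2.50, 10.00)
    print(f"Estimated cost per request: ${cost:.5f}")
```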

Should I use the API or subscribe to ChatGPT Plus/Claude Pro?

For moderate personal use, a subscription ($20/month) is usually more economical. For heavy usage or product integration, the API is more flexible and scalable. The breakeven point typically occurs when your API token consumption exceeds the equivalent value of the monthly subscription.
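The breakeven point can be approximated with a blended token price. This is a sketch under assumptions: the $20/month subscription comes from the answer above, while the 70% input share, the GPT-4o rates, and the function names are illustrative choices. It ignores subscription usage caps and any features that differ between the two products.

```python
def blended_price_per_m(input_share: float, input_price_per_m: float,
                        output_price_per_m: float) -> float:
    """Average USD price per million tokens for a given input/output mix."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return (input_share * input_price_per_m
            + (1.0 - input_share) * output_price_per_m)


def breakeven_million_tokens(subscription_usd: float, input_share: float,
                             input_price_per_m: float,
                             output_price_per_m: float) -> float:
    """Monthly token volume (in millions) at which API spend equals the subscription."""
    price = blended_price_per_m(input_share, input_price_per_m, output_price_per_m)
    if price <= 0:
        raise ValueError("blended price must be positive")
    return subscription_usd / price


if __name__ == "__main__":
    volume = breakeven_million_tokens(20.0, 0.7, 2.50, 10.00)
    print(f"Breakeven at about {volume:.2f}M tokens per month")
```

Below that volume the API is cheaper on token cost alone; above it, the flat subscription wins if its limits cover your usage.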

How have AI API prices changed over time?

Prices have dropped dramatically: GPT-4 cost $30/M input tokens in 2023; equivalent models now cost $2–5/M. The trend is a continuous decline as competition increases. We update this table weekly; always verify official pricing before committing your budget.
