FW

Fireworks AI

online

Blazing fast OSS inference. $1 free credits.

LLM
91
/ 100 APIVault Score

// At a glance

Free Tier
$1 credits · no card
Category
LLM
Credit Card
Not required
Last Verified
2m ago

// Free tier details

Available Models

Llama 3.3MixtralQwenDeepSeek

Monthly Requests

$1 free credits

No credit card needed
No phone verification

// Quick start

300">"text-purple-400">from openai 300">"text-purple-400">import OpenAI

client = OpenAI(
    api_key=300">"YOUR_FIREWORKS_KEY",
    base_url=300">"https://api.fireworks.ai/inference/v1",
)

response = client.chat.completions.create(
    model=300">"accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{300">"role": 300">"user", 300">"content": 300">"Hello."}],
)

print(response.choices[0].message.content)

// Overview

Optimized inference for Llama, Mixtral, and OSS models with sub-100ms time-to-first-token. Function calling and fine-tuning support.

// Pros

  • Very fast TTFT
  • Fine-tuning support
  • Function calling

// Cons

  • $1 free credit is small
  • Smaller model catalog

// Score breakdown

Reliability (35%) (from 2m ago health check)100/100
Free Tier Generosity (30%) (computed from quota, no-CC, no-phone fields)85/100
Documentation (20%) (human rating)87/100
Popularity (15%) (GitHub stars (log-normalised), or manual baseline)86/100

Methodology: apivault.dev/methodology

// Best for

Real-time agentsFunction callingOSS inference