// Methodology

How the APIVault Trust Score is calculated. What's automated, what's manual, and what we don't claim.

⚠ We value honesty over the appearance of sophistication. This page describes exactly what signals we collect and which ones come from a human rather than a machine.

The Formula

score = reliability   × 0.35   (automated when health checks have run)
      + free_tier     × 0.30   (rule-based, from factual provider fields)
      + documentation × 0.20   (MANUAL — human rating 1–5)
      + popularity    × 0.15   (automated when GitHub stars are fetched)
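
The weighted sum above can be sketched as follows. This is a minimal illustration, not the code in lib/scoring.ts: the DimensionScores shape, the trustScore name, and the rounding to one decimal are our assumptions for the example, and it presumes each dimension has already been normalised to 0–100.

```typescript
// Weights copied from the formula above; they sum to 1.
const WEIGHTS = {
  reliability: 0.35,
  freeTier: 0.30,
  documentation: 0.20,
  popularity: 0.15,
} as const;

// Hypothetical shape: every dimension is assumed to be on a 0-100 scale already.
interface DimensionScores {
  reliability: number;
  freeTier: number;
  documentation: number;
  popularity: number;
}

function trustScore(d: DimensionScores): number {
  const raw =
    d.reliability * WEIGHTS.reliability +
    d.freeTier * WEIGHTS.freeTier +
    d.documentation * WEIGHTS.documentation +
    d.popularity * WEIGHTS.popularity;
  // Rounding to one decimal is an illustrative choice, not a documented rule.
  return Math.round(raw * 10) / 10;
}
```

For example, dimension scores of 80, 90, 60 and 50 contribute 28 + 27 + 12 + 7.5, for a total of 74.5.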

Dimension 1 — Reliability (35%)

Source: scripts/health-check.ts, run via GitHub Actions on a daily schedule.

For each provider we send a HEAD request (falling back to GET) to the public root of their docs site or API host — no API key required, so we never call a paid endpoint. We record:

  • HTTP status code (or null on timeout)
  • Round-trip response time in ms
  • Whether the host was reachable at all
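
The HEAD-then-GET probe can be sketched like this. It is a rough illustration rather than the contents of scripts/health-check.ts: the SendFn transport, the ProbeResult shape, and the rule for when to fall back (HEAD fails outright, or the server answers 405 or 501) are all assumptions made for the example.

```typescript
// Hypothetical transport: resolves to the HTTP status code,
// or null on timeout / network failure. Timeout handling lives inside it.
type SendFn = (method: "HEAD" | "GET", url: string) => Promise<number | null>;

interface ProbeResult {
  statusCode: number | null; // null on timeout
  responseMs: number;        // round-trip time of the request that was recorded
  reachable: boolean;
}

async function probeProvider(url: string, send: SendFn): Promise<ProbeResult> {
  let started = Date.now();
  let status = await send("HEAD", url);

  // Assumption: retry with GET when HEAD fails or the server rejects the method.
  if (status === null || status === 405 || status === 501) {
    started = Date.now();
    status = await send("GET", url);
  }

  return {
    statusCode: status,
    responseMs: Date.now() - started,
    reachable: status !== null,
  };
}
```

Passing the transport in as a function keeps the fallback logic testable without touching the network.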

Status classification: online if reachable and responding in under 1,500 ms; degraded if reachable but slower than that, or returning an unexpected 4xx; down if unreachable or returning a 5xx.
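
As a sketch, the classification rule could look like the function below. The CheckSample shape and the expected4xx parameter are hypothetical: this page does not define which 4xx codes count as "expected", so the example leaves that to the caller and treats every 4xx as unexpected by default.

```typescript
type ProviderStatus = "online" | "degraded" | "down";

// Hypothetical record of one health check.
interface CheckSample {
  statusCode: number | null; // null on timeout
  responseMs: number;
  reachable: boolean;
}

const ONLINE_MAX_MS = 1500; // threshold from the rule above

function classifyStatus(
  r: CheckSample,
  expected4xx: ReadonlySet<number> = new Set<number>(), // assumption: none expected by default
): ProviderStatus {
  if (!r.reachable || r.statusCode === null) return "down";
  if (r.statusCode >= 500) return "down";
  if (r.statusCode >= 400 && !expected4xx.has(r.statusCode)) return "degraded";
  if (r.responseMs >= ONLINE_MAX_MS) return "degraded";
  return "online";
}
```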

Score: percentage of successful checks over the last 30 days, with each check from the most recent 7 days given twice the weight of a check from the prior 23.
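
One way to express that weighting is sketched below. The DailyCheck shape and the reliabilityScore name are hypothetical, and what counts as a "successful" check is left to the caller via the ok flag. Returning null when no checks exist is our way of signalling that the fallback described next should apply.

```typescript
// Hypothetical record: one health check, aged in whole days.
interface DailyCheck {
  daysAgo: number; // 0 = today
  ok: boolean;     // whether the check counted as successful
}

function reliabilityScore(checks: DailyCheck[]): number | null {
  let weightedOk = 0;
  let weightTotal = 0;

  for (const c of checks) {
    if (c.daysAgo < 0 || c.daysAgo >= 30) continue; // outside the 30-day window
    const weight = c.daysAgo < 7 ? 2 : 1;           // most recent 7 days count double
    weightTotal += weight;
    if (c.ok) weightedOk += weight;
  }

  // No checks in the window: nothing to score.
  return weightTotal === 0 ? null : (weightedOk / weightTotal) * 100;
}
```

With one check per day, a provider that failed all of the last 7 days but passed the 23 before them scores 23 / (14 + 23) × 100 ≈ 62, whereas an unweighted percentage would give about 77.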

If health checks have never run for a provider, the score falls back to the manual baseline in lib/providers.ts and the UI shows "not yet verified".

Dimension 2 — Free Tier Generosity (30%)

Source: factual fields in lib/providers.ts, verified manually.

Scoring rubric (max 100 points):

Signal                                                       Points
No credit card required                                      +30
No phone required                                            +10
Monthly request or token quota documented                    +35
Rate limit documented                                        +10
Available models listed                                      +10
Both quota AND rate limit documented (completeness bonus)    +5

This is rule-based, not machine-learned. We update the rubric when we discover new meaningful signals.
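
The rubric translates directly into a handful of conditionals. In the sketch below the point values come from the table above, while the FreeTierFacts field names are hypothetical and may not match the actual fields in lib/providers.ts.

```typescript
// Hypothetical field names; the real ones live in lib/providers.ts.
interface FreeTierFacts {
  noCreditCard: boolean;
  noPhone: boolean;
  quotaDocumented: boolean;     // monthly request or token quota
  rateLimitDocumented: boolean;
  modelsListed: boolean;
}

function freeTierScore(f: FreeTierFacts): number {
  let points = 0;
  if (f.noCreditCard) points += 30;
  if (f.noPhone) points += 10;
  if (f.quotaDocumented) points += 35;
  if (f.rateLimitDocumented) points += 10;
  if (f.modelsListed) points += 10;
  // Completeness bonus: both quota and rate limit are documented.
  if (f.quotaDocumented && f.rateLimitDocumented) points += 5;
  return points; // maximum is 100
}
```

A provider that documents its quota and rate limit but meets none of the other criteria would score 35 + 10 + 5 = 50.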

Dimension 3 — Documentation (20%)

Source: human. Stored as docManualScore (1–5) in lib/providers.ts.

Rating guide:

  • 5 — Dedicated docs site, quickstart, multi-language examples, interactive playground or Postman collection
  • 4 — Good docs site, quickstart, 2+ language examples
  • 3 — Adequate docs, at least curl examples and endpoint reference
  • 2 — Sparse docs, hard to find rate limits or model names
  • 1 — Minimal or machine-translated docs

We don't automate this dimension. Docs quality requires reading, not fetching. If a provider doesn't have a docManualScore set yet, the field falls back to the historical baseline.

Dimension 4 — Popularity (15%)

Source: scripts/fetch-popularity.ts, pulling GitHub star counts for providers with a githubRepo field.

Stars are log-normalised to 0–100 (1 M stars = 100, 10 k stars ≈ 67, 100 stars ≈ 33). We use a log scale because star counts span several orders of magnitude.
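
The three reference points above are consistent with a base-10 logarithm divided by 6 (since log10 of 1 M is 6). The sketch below uses that inferred formula. The popularityScore name, the cap at 100 for repos above 1 M stars, and the score of 0 for repos with fewer than one star are our assumptions for the example.

```typescript
const LOG10_MAX_STARS = 6; // log10(1,000,000): the star count that maps to 100

function popularityScore(stars: number): number {
  // Assumption: zero, negative, or non-finite counts score 0.
  if (!Number.isFinite(stars) || stars < 1) return 0;
  const score = (Math.log10(stars) / LOG10_MAX_STARS) * 100;
  // Assumption: anything above 1 M stars is capped at 100.
  return Math.min(100, score);
}
```

On this scale, every tenfold increase in stars adds about 16.7 points.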

Providers without a clear official GitHub repo are scored using the manual baseline. We don't guess at repo names.

What We Don't Claim

  • We do not measure latency of the actual inference endpoint (that requires an API key)
  • We do not track npm/pip download counts — too easy to inflate
  • We do not monitor HN/Reddit/X sentiment
  • We do not verify free tier limits programmatically — those are manually checked during provider onboarding

Verification Cadence

Health checks run daily via GitHub Actions (.github/workflows/verify.yml). Results are committed to data/health/ and the site rebuilds with updated scores. Human review of free tier details and docManualScore happens on a best-effort basis — typically when a provider announces changes.

Integrity

  • We don't accept payment to boost scores
  • We don't remove providers for being competitors
  • Providers with low scores stay listed — users need to know
  • All scoring code is open-source in lib/scoring.ts