# Data Freshness
Model Graph isn't a static list. A background worker runs daily automated ingestion from multiple sources to keep the registry current. This page explains the ingestion tiers, update frequency, and how conflicts between sources are resolved.
## Ingestion Tiers
Data flows into the registry from four tiers of sources, each with different authority levels and refresh frequencies.
### Tier 1: Provider Model Listing APIs
The most authoritative source. Each major provider exposes a model listing API that the worker polls daily:
| Provider | Endpoint | Data Available |
|---|---|---|
| OpenAI | GET /v1/models | Model IDs, creation timestamps |
| Anthropic | GET /v1/models | Model IDs, display names, creation timestamps |
| Google | GET /v1beta/models | Model names, token limits, supported actions |
| Mistral | GET /v1/models | Model IDs, deprecation dates, replacement models, aliases, context lengths |
| Cohere | GET /v1/models | Model names, deprecation boolean, context lengths |
| xAI | GET /v1/models | Model IDs (OpenAI-compatible format) |
Mistral's API is the richest — it's the only major provider that exposes deprecation dates, replacement models, and aliases directly in its model listing API.
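Because each listing API returns a different shape, the ingester has to normalize raw entries into a common record before upserting. A minimal sketch of that step, assuming illustrative field names (the real ingester's record shape is not shown in this doc):

```python
from datetime import datetime, timezone

# Illustrative only: the output field names and the "deprecation" key on
# Mistral entries are assumptions, not the actual ingester schema.
def normalize_listing_item(provider: str, item: dict) -> dict:
    """Map one raw /v1/models entry to a common registry record."""
    record = {"provider": provider, "model_id": item["id"]}
    # OpenAI-style listings expose a Unix "created" timestamp.
    if "created" in item:
        record["created_at"] = datetime.fromtimestamp(
            item["created"], tz=timezone.utc
        ).isoformat()
    # Mistral is the only major provider whose listing carries deprecation data.
    if provider == "mistral" and "deprecation" in item:
        record["deprecation_date"] = item["deprecation"]
    return record
```

Keeping normalization as a pure per-item function makes each provider's quirks easy to test in isolation.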
### Tier 2: Deprecation Data
deprecations.info is a community-maintained, daily-scraped aggregation of deprecation announcements from all major providers. It provides:
- `announcement_date` → mapped to `deprecation_date`
- `shutdown_date` → mapped to `sunset_date`
- `replacement_models` → mapped to `successor_model_id`
This fills the deprecation gap for providers whose APIs don't include deprecation information (which is most of them).
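The Tier 2 mapping above is a straight field rename. A minimal sketch, assuming deprecations.info entries arrive as flat dicts (the exact source record shape is an assumption):

```python
# Source-to-registry field renames, taken from the mapping listed above.
FIELD_MAP = {
    "announcement_date": "deprecation_date",
    "shutdown_date": "sunset_date",
    "replacement_models": "successor_model_id",
}

def map_deprecation_entry(entry: dict) -> dict:
    """Translate a deprecations.info entry into registry field names,
    dropping any fields the registry doesn't track."""
    return {FIELD_MAP[key]: value for key, value in entry.items() if key in FIELD_MAP}
```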
### Tier 3: Hugging Face Hub
For open-source models, the worker syncs metadata from the Hugging Face Hub for well-known model organizations:
- Meta Llama (`meta-llama/*`)
- Mistral AI (`mistralai/*`)
- Cohere (`CohereForAI/*`)
- Google (`google/*`)
- Qwen (`Qwen/*`)
- DeepSeek (`deepseek-ai/*`)
Data extracted includes model IDs, creation dates, tags, and parameter counts (parsed from model names or config.json). Each model gets a canonical_url pointing to its Hugging Face page.
### Tier 4: Manual Curation
An initial seed migration populates the database with known providers, families, models, and aliases. Admin API endpoints allow manual corrections for edge cases — adding missing aliases, fixing dates, setting successor links, or adjusting statuses.
## Update Frequency
| Source | Frequency | Trigger |
|---|---|---|
| Provider APIs (Tier 1) | Daily at 02:00 UTC | Automated cron job |
| deprecations.info (Tier 2) | Daily at 02:00 UTC | Automated cron job |
| Hugging Face Hub (Tier 3) | Daily at 02:00 UTC | Automated cron job |
| Manual curation (Tier 4) | As needed | Admin API call |
All automated sources run in a single daily ingestion cycle. Each source runs independently — if one provider's API is down, the rest still complete successfully.
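The per-source isolation can be sketched as a loop that catches and records each ingester's failure instead of aborting the cycle. Ingester names and the result shape below are illustrative:

```python
# Sketch of the "each source runs independently" behavior: a failing
# ingester is logged per-source and the remaining sources still run.
def run_cycle(ingesters: dict) -> dict:
    """Run every ingester; collect per-source success or error."""
    results = {}
    for name, ingest in ingesters.items():
        try:
            ingest()
            results[name] = "ok"
        except Exception as exc:  # isolate failures to one source
            results[name] = f"error: {exc}"
    return results
```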
## Conflict Resolution
When multiple sources provide data about the same model, conflicts are resolved using a priority hierarchy:
- Manual curation (highest priority) — admin corrections override everything
- Provider APIs — authoritative for their own models (release dates, names, context windows)
- deprecations.info — supplements with deprecation/sunset dates; never overwrites provider-sourced data
- Hugging Face — authoritative for open-source metadata (canonical URLs, parameter counts)
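The hierarchy above amounts to a per-field lookup in priority order, with the extra rule that deprecations.info may only supply deprecation fields. A minimal sketch, assuming illustrative source labels and field names:

```python
# Highest priority first, per the hierarchy above. Labels are assumptions.
PRIORITY = ["manual", "provider_api", "deprecations_info", "huggingface"]

# Tier 2 contributes only deprecation fields and never other metadata.
DEPRECATION_ONLY = {"deprecation_date", "sunset_date", "successor_model_id"}

def resolve(field: str, values_by_source: dict) -> object:
    """Pick one field's value from the highest-priority source that has it."""
    for source in PRIORITY:
        if source == "deprecations_info" and field not in DEPRECATION_ONLY:
            continue  # Tier 2 never supplies non-deprecation metadata
        if source in values_by_source:
            return values_by_source[source]
    return None
```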
### Key Rules
- Deprecation data from Tier 2 never overwrites release dates or other metadata from Tier 1
- Every ingestion run is idempotent: upserts on natural keys (`slug` for models, `alias` for aliases)
- Every run is logged in the `ingestion_runs` table with counts and error details
- Individual ingester failures don't affect other ingesters; partial success is better than total failure
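The idempotent upsert on `slug` can be expressed with a standard `ON CONFLICT` clause. A minimal SQLite sketch (the real database and `models` schema are assumptions):

```python
import sqlite3

def upsert_model(conn: sqlite3.Connection, slug: str, name: str) -> None:
    """Insert a model row, or update it in place if the slug already exists."""
    conn.execute(
        """
        INSERT INTO models (slug, name) VALUES (?, ?)
        ON CONFLICT(slug) DO UPDATE SET name = excluded.name
        """,
        (slug, name),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (slug TEXT PRIMARY KEY, name TEXT)")
upsert_model(conn, "gpt-4o", "GPT-4o")
# Re-running the same ingestion updates the row instead of duplicating it.
upsert_model(conn, "gpt-4o", "GPT-4o (updated)")
```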
## Monitoring Ingestion
Admins can monitor ingestion health via the admin endpoints:
```bash
# View recent ingestion runs
curl -H "Authorization: Bearer $API_KEY" \
  https://api.modelgraph.ai/api/v1/admin/ingestion-runs

# Manually trigger a refresh for a specific source
curl -X POST -H "Authorization: Bearer $API_KEY" \
  https://api.modelgraph.ai/api/v1/admin/ingest/openai-api
```
Each run reports `models_added`, `models_updated`, and any `error_message` for debugging.