Dataset API Access
How to use the dataset APIs—rate limits, API keys, formats, and tiers.
Archade is the professional network for the built world. Structured data is exported from the knowledge graph for research and tools, under clear terms and with attribution requirements. This page explains how to access the dataset APIs, what rate limits apply, how API keys work, and how to use different formats.
Base URLs
All dataset APIs follow this pattern:
https://archade.com/api/datasets/{entity}.jsonld
Entities: projects, companies, people, software, institutions, products, jobs, media-intel, relationships, experience, education, company-reviews, entity-reviews, posts
Embeddings: embeddings — requires Commercial or Enterprise API key (returns 401 without)
Example:
https://archade.com/api/datasets/projects.jsonld
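As a minimal sketch of the URL pattern above, the helper below builds a dataset URL from an entity name and rejects names that are not in the documented list. The function name `dataset_url` and the constant names are illustrative assumptions, not part of the API.

```python
BASE_URL = "https://archade.com/api/datasets"

# Entity names copied from the list above, plus the restricted embeddings dataset.
ENTITIES = {
    "projects", "companies", "people", "software", "institutions",
    "products", "jobs", "media-intel", "relationships", "experience",
    "education", "company-reviews", "entity-reviews", "posts", "embeddings",
}


def dataset_url(entity: str) -> str:
    """Return the dataset URL for a documented entity name (hypothetical helper)."""
    if entity not in ENTITIES:
        raise ValueError(f"unknown dataset entity: {entity!r}")
    return f"{BASE_URL}/{entity}.jsonld"


print(dataset_url("projects"))
# prints https://archade.com/api/datasets/projects.jsonld
```

Validating the name locally avoids spending a rate-limited request on a typo.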
Query parameters
| Parameter | Description | Default |
|---|---|---|
| limit | Maximum number of entities to return | 100 (without API key), 1,000+ (with key) |
| format | Output format: jsonld, json, csv, parquet | jsonld |
| api_key | Your API key (alternative to the X-API-Key header) | none |
Examples:
/api/datasets/projects.jsonld?limit=50
/api/datasets/projects.jsonld?limit=50&format=csv
/api/datasets/companies.jsonld?format=json
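The example URLs above can also be assembled programmatically. This is a small sketch using the standard library's `urlencode`; the function name `dataset_query_url` and its argument names are assumptions made for illustration.

```python
from urllib.parse import urlencode


def dataset_query_url(entity, limit=None, fmt=None):
    """Build a dataset URL with optional limit and format parameters (hypothetical helper)."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if fmt is not None:
        params["format"] = fmt
    url = f"https://archade.com/api/datasets/{entity}.jsonld"
    # Only append "?" when there is at least one parameter.
    return f"{url}?{urlencode(params)}" if params else url


print(dataset_query_url("projects", limit=50, fmt="csv"))
# prints https://archade.com/api/datasets/projects.jsonld?limit=50&format=csv
```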
Authentication: API key
To raise your entity limit and rate limit, use an API key.
How to get an API key
- Log in to Archade
- Go to Dashboard → Profile → API Keys
- Click Create API Key
- Give it a name and choose a tier (Research, Commercial, or Enterprise)
- Copy the key immediately — you won't see it again
How to use your API key
Option 1: Header (recommended)
X-API-Key: archade_live_abc123...
Option 2: Query parameter
/api/datasets/projects.jsonld?api_key=archade_live_abc123...
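For Python clients, the header option might look like the sketch below, which uses only the standard library. The environment variable name `ARCHADE_API_KEY`, the helper names, and the default User-Agent string are assumptions. The block builds the request without sending it; `fetch` is the function that would perform the network call.

```python
import os
import urllib.request

DATASET_URL = "https://archade.com/api/datasets/projects.jsonld?limit=500"


def build_request(url, api_key=None, user_agent="MyApp/1.0 (contact@example.com)"):
    """Build a GET request that sends the key in the X-API-Key header."""
    headers = {"User-Agent": user_agent}
    if api_key:
        headers["X-API-Key"] = api_key
    # urllib normalizes header capitalization (e.g. "X-api-key"); this is
    # harmless because HTTP header names are case-insensitive.
    return urllib.request.Request(url, headers=headers)


def fetch(url, api_key=None):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=30) as resp:
        return resp.read()


# ARCHADE_API_KEY is a hypothetical environment variable name.
request = build_request(DATASET_URL, os.environ.get("ARCHADE_API_KEY"))
print(request.get_method(), request.full_url)
```

Reading the key from the environment keeps it out of source files, and sending it as a header keeps it out of URLs that may end up in logs.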
Example with curl:
curl -A "MyApp/1.0 (contact@example.com)" \
-H "X-API-Key: your_key_here" \
"https://archade.com/api/datasets/projects.jsonld?limit=500"
Rate limits
Rate limits depend on whether you use an API key and which tier it has:
| Tier | Requests/minute | Max entities/request | Who |
|---|---|---|---|
| Unauthenticated | 10 | 100 | No API key |
| Research | 30 | 1,000 | API key (Research tier) |
| Commercial | 100 | 10,000 | API key (Commercial tier) |
| Enterprise | 500 | 100,000 | API key (Enterprise tier) |
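A client can encode the table above to keep its own requests within the published limits. The sketch below is one way to do that; the names `Tier`, `TIERS`, `clamp_limit`, and `min_interval_seconds` are illustrative, and the numbers are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    requests_per_minute: int
    max_entities: int


# Values copied from the rate limit table.
TIERS = {
    "unauthenticated": Tier(10, 100),
    "research": Tier(30, 1_000),
    "commercial": Tier(100, 10_000),
    "enterprise": Tier(500, 100_000),
}


def clamp_limit(tier_name, requested):
    """Cap a requested limit at the tier's maximum entities per request."""
    return min(requested, TIERS[tier_name].max_entities)


def min_interval_seconds(tier_name):
    """Seconds between evenly spaced requests that stay within the per-minute limit."""
    return 60.0 / TIERS[tier_name].requests_per_minute


print(clamp_limit("research", 5000), min_interval_seconds("research"))
# prints 1000 2.0
```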
Response headers (when available):
- X-RateLimit-Limit: Your limit per minute
- X-RateLimit-Remaining: Requests left in the current window
- X-RateLimit-Reset: Unix timestamp when the window resets
- X-RateLimit-Tier: Your current tier
- Retry-After: Seconds to wait (when rate limited)
If you exceed the limit: You'll receive 429 Too Many Requests with a Retry-After header. Slow down and retry after the indicated time.
Per-endpoint tracking: Rate limits are tracked per endpoint, meaning requests to /api/datasets/projects.jsonld and /api/datasets/companies.jsonld are counted separately.
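The sketch below shows one way a client could act on these headers: it derives a wait time from a 429 response and records the remaining quota separately for each endpoint path. The function names, the plain-dict header mapping, and the 60-second fallback are assumptions; a real HTTP library usually exposes headers through a case-insensitive mapping.

```python
import time
from urllib.parse import urlsplit


def seconds_to_wait(status, headers, now=None):
    """Return how long to pause before retrying, or 0.0 if no wait is needed."""
    if status != 429:
        return 0.0
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Not a plain number of seconds; fall through to the reset header.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            now = time.time() if now is None else now
            return max(0.0, float(reset) - now)
        except ValueError:
            pass
    return 60.0  # Assumed fallback: wait one full one-minute window.


def record_remaining(state, url, headers):
    """Track remaining requests per endpoint path, since limits are counted per endpoint."""
    remaining = headers.get("X-RateLimit-Remaining")
    if remaining is not None and remaining.isdigit():
        state[urlsplit(url).path] = int(remaining)
    return state


print(seconds_to_wait(429, {"Retry-After": "12"}))
# prints 12.0
```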
Formats
| Format | Query | Content-Type | Use case |
|---|---|---|---|
| JSON-LD | ?format=jsonld (default) | application/ld+json | Schema.org, AI, semantic web |
| JSON | ?format=json | application/json | Same structure, no @context |
| CSV | ?format=csv | text/csv | Spreadsheets, analysis |
| Parquet | ?format=parquet | application/octet-stream | Columnar format for analytics |
CSV notes:
- Nested objects are flattened (e.g. address.city)
- Arrays are joined with ;
- Filename includes the date (e.g. archade-projects-2026-02-01.csv)
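To reverse the CSV flattening on the client side, something like the following sketch could work. The sample data and the column names (`name`, `address.city`, `tags`) are invented for illustration, and the caller has to say which columns hold joined arrays because the CSV itself does not mark them.

```python
import csv
import io


def unflatten_row(row, list_columns=()):
    """Rebuild nested objects from dotted column names and split ;-joined arrays."""
    nested = {}
    for key, value in row.items():
        if key in list_columns:
            value = [item.strip() for item in value.split(";")] if value else []
        target = nested
        *parents, leaf = key.split(".")
        for part in parents:
            target = target.setdefault(part, {})
        target[leaf] = value
    return nested


# Hypothetical sample; real exports will have different columns.
sample = "name,address.city,tags\nRiverside Library,Oslo,timber;public\n"
rows = [
    unflatten_row(row, list_columns={"tags"})
    for row in csv.DictReader(io.StringIO(sample))
]
print(rows)
# prints [{'name': 'Riverside Library', 'address': {'city': 'Oslo'}, 'tags': ['timber', 'public']}]
```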
Access tiers explained
Research
- Who: Academics, students, non-commercial use
- Limit: 1,000 entities per request, 30 req/min
- How: Create API key, select Research tier
- Cost: Free (contact us for formal research agreements)
Commercial
- Who: Companies building products, tools, or analytics
- Limit: 10,000 entities per request, 100 req/min
- How: Create API key, select Commercial tier
- Cost: Contact legal@archade.app for pricing
Enterprise
- Who: Large-scale use, AI training, custom SLAs
- Limit: 100,000 entities per request, 500 req/min
- How: Contact legal@archade.app for a custom agreement
- Cost: Custom pricing
Note: API keys are per user, not per company. Dataset API access is not included in company subscription tiers (e.g. COMPANY_UNLIMITED). Enterprise agreements may include company-wide keys, dedicated support, and custom terms. Contact us to discuss.
CRUD: Read-only
Dataset APIs are read-only: there are no Create, Update, or Delete operations. To contribute data, use the Archade platform (profiles, projects, companies, etc.). Dataset APIs exist for researchers and downstream consumers, not for data ingestion.
Pricing and payment
| Tier | Included in company plan? | Payment |
|---|---|---|
| Research | No | Free |
| Commercial | No | Contact legal@archade.app for pricing |
| Enterprise | No (custom agreements may differ) | Contact for quote |
Payment integration: Commercial and Enterprise tiers are handled via direct agreement. There is no self-serve checkout for dataset API access. Contact legal@archade.app.
Response structure
JSON-LD / JSON:
{
"@context": "https://schema.org",
"@type": "Dataset",
"name": "Archade Projects Dataset",
"url": "https://archade.com/api/datasets/projects.jsonld",
"numberOfItems": 50,
"dateModified": "2026-02-01T...",
"@graph": [ /* array of Schema.org entities */ ]
}
Each entity in @graph follows Schema.org types (e.g. CreativeWork for projects, Organization for companies).
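A consumer of this structure will typically iterate over `@graph`. The sketch below parses a small hand-written sample shaped like the response above and counts entities by `@type`; the entity names in the sample are invented, and the helper names are assumptions.

```python
import json
from collections import Counter

# Hand-written sample in the documented shape; the entities are invented.
sample = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Archade Projects Dataset",
  "numberOfItems": 2,
  "@graph": [
    {"@type": "CreativeWork", "name": "Example Project A"},
    {"@type": "CreativeWork", "name": "Example Project B"}
  ]
}
""")


def entities(dataset):
    """Return the list of entities in @graph (empty if the key is missing)."""
    graph = dataset.get("@graph", [])
    return graph if isinstance(graph, list) else [graph]


def count_types(dataset):
    """Count entities by Schema.org type; JSON-LD allows @type to be a string or a list."""
    counts = Counter()
    for entity in entities(dataset):
        types = entity.get("@type", "unknown")
        counts.update([types] if isinstance(types, str) else types)
    return counts


print(count_types(sample))
# prints Counter({'CreativeWork': 2})
```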
Errors
| Status | Meaning |
|---|---|
| 200 | Success |
| 401 | API key required (Embeddings dataset only; Commercial/Enterprise tier) |
| 403 | Access denied — automated scraping detected, or blocked User-Agent |
| 429 | Rate limit exceeded — slow down, check Retry-After |
| 500 | Server error — try again later |
401 — The Embeddings dataset requires a Commercial or Enterprise API key. Research-tier keys do not grant access.
403 — We block known scraping tools (Scrapy, Selenium, curl, wget, etc.) and requests without a proper User-Agent. For legitimate programmatic use, send a descriptive User-Agent (e.g. MyApp/1.0 (contact@example.com)). Contact help@archade.app if you believe this is in error.
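One way to turn the status table into client behavior is a small lookup that says what to do next. This is a sketch only: the `Action` names are invented, and treating any undocumented status as a reason to stop is an assumption.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "process the response"
    FIX_API_KEY = "use a Commercial or Enterprise API key"
    FIX_USER_AGENT = "send a descriptive User-Agent"
    WAIT_AND_RETRY = "wait for Retry-After, then retry"
    RETRY_LATER = "try again later"
    STOP = "undocumented status; do not retry automatically"


# Mapping derived from the error table above.
STATUS_ACTIONS = {
    200: Action.PROCEED,
    401: Action.FIX_API_KEY,
    403: Action.FIX_USER_AGENT,
    429: Action.WAIT_AND_RETRY,
    500: Action.RETRY_LATER,
}


def action_for(status):
    """Return the suggested next step for an HTTP status code."""
    return STATUS_ACTIONS.get(status, Action.STOP)


def is_retryable(status):
    """Only 429 and 500 are worth retrying without changing the request."""
    return action_for(status) in (Action.WAIT_AND_RETRY, Action.RETRY_LATER)


print(action_for(403).value)
# prints send a descriptive User-Agent
```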
Bot detection and User-Agent
We block requests that look like automated scraping. To avoid being blocked:
- Send a descriptive User-Agent: e.g. MyApp/1.0 (research@university.edu) or Archade-Dataset-Client/1.0
- Avoid the default User-Agents of Scrapy, Selenium, Puppeteer, curl, and wget, as well as generic or empty User-Agents
- Use an API key when making programmatic requests: keys are associated with your account and help identify legitimate use
- Respect rate limits: implement proper backoff when receiving 429 responses
If you receive 403, add a proper User-Agent header and retry.
Bot detection patterns: We automatically block requests with user agents matching patterns like Scrapy, Selenium, Puppeteer, curl, wget, headless browsers, or those with Cloudflare bot scores below 30.
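A client can run a rough pre-flight check on its own User-Agent before sending requests. The sketch below only approximates the names listed above: the exact server-side patterns are not published here, so the regular expression is an assumption, and Cloudflare bot scores cannot be checked from the client at all.

```python
import re

# Approximation of the tool names listed above; the real server-side
# patterns may differ.
BLOCKED_PATTERN = re.compile(
    r"scrapy|selenium|puppeteer|curl|wget|headless", re.IGNORECASE
)


def looks_blocked(user_agent):
    """Return True if the User-Agent is empty or contains a listed tool name."""
    if not user_agent or not user_agent.strip():
        return True
    return bool(BLOCKED_PATTERN.search(user_agent))


print(looks_blocked("curl/8.4.0"), looks_blocked("MyApp/1.0 (research@university.edu)"))
# prints True False
```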
Best practices
- Cache responses — Data is stable; avoid repeated identical requests.
- Use a descriptive User-Agent — Required for programmatic access; see above.
- Respect rate limits — Implement exponential backoff on 429.
- Store API keys securely — Never commit them to version control.
- Rotate keys periodically — Revoke old keys when creating new ones.
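The caching and backoff advice can be combined in a small wrapper, sketched below. The `fetch` callable, the `RateLimited` exception, and the delay parameters are all hypothetical; the wrapper takes `fetch` and `sleep` as arguments so it can be exercised without network access.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the fetch callable to signal a 429 response (hypothetical)."""


def backoff_delays(base=1.0, factor=2.0, attempts=5, cap=60.0):
    """Exponentially growing delays in seconds, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def cached_fetch(url, fetch, cache, sleep=time.sleep, attempts=5):
    """Return a cached response if present; otherwise fetch with backoff on 429."""
    if url in cache:
        return cache[url]
    delays = backoff_delays(attempts=attempts)
    for attempt, delay in enumerate(delays):
        try:
            cache[url] = fetch(url)
            return cache[url]
        except RateLimited:
            if attempt == len(delays) - 1:
                raise  # Out of attempts; let the caller decide what to do.
            # Up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


print(backoff_delays())
# prints [1.0, 2.0, 4.0, 8.0, 16.0]
```

When the server sends a Retry-After header, prefer that value over the computed delay.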
Related
- Datasets overview — What's in each dataset, contributor vs researcher
- Dataset License — Permitted uses, attribution
- API Keys dashboard — Create and manage keys (requires login)
