Datasets
Learn about Archade's structured AEC datasets—what's included, how to explore samples, and who they're for.
Archade licenses structured data derived from its knowledge graph. The datasets are meant to make work in the built environment easier to discover and attribute; licensing them is not a sale of data. Contributors control whether their data is included; see Dataset Opt-Out.
This page explains which datasets are available, how to explore them, and what you need to know as a contributor (someone who adds data) or a researcher (someone who uses the data). Archade acts as a steward of industry data.
What's in the datasets
We expose fifteen datasets, aligned with Schema.org where applicable:
Core entities (7)
| Dataset | What it contains | Who it's for |
|---|---|---|
| Projects | Architectural and construction projects with typology, location, credits, images | Researchers, AI labs, search engines |
| Companies | AEC firms, studios, brands with HQ, specializations, project counts | Market intelligence, lead enrichment |
| People | Professionals with verified project experience, skills, location | Talent mapping, research |
| Software | BIM, CAD, design tools with usage stats, platforms, ratings | Adoption analysis, tool research |
| Institutions | Architecture schools, engineering programs, alumni stats | Education research, alumni mapping |
| Products | Building product catalogs with specs, articles, project usage | Specification research, product intelligence |
| Jobs | AEC job postings with location, salary, requirements | Recruitment research, market analysis |
AI enrichment (2)
| Dataset | What it contains | Auth |
|---|---|---|
| Media Intelligence | Images with entity linkage and AI annotations (dense captions, objects, materials, style) | Sample free; API key for more |
| Embeddings | Pre-computed 1536-d vectors for projects (versioned by model) | Commercial or Enterprise API key required |
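
As a rough illustration of how pre-computed vectors like these are typically compared, here is a minimal sketch of cosine similarity in plain Python. The page does not say which similarity measure the vectors are intended for, so cosine is an assumption (a common default), and the function name is illustrative. Since the vectors are versioned by model, it is safest to compare only vectors from the same model version.

```python
import math
from typing import Sequence

EMBEDDING_DIM = 1536  # dimension stated in the table above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.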
Relationship graphs (1)
| Dataset | What it contains | Who it's for |
|---|---|---|
| Relationships | Graph edges: project→contributor, project→product, project→software | Attribution research, usage analysis |
Career and education (2)
| Dataset | What it contains | Who it's for |
|---|---|---|
| Experience | Employment records with company, role, dates, skills | Career trajectory, talent research |
| Education | Academic credentials, degrees, institutions | Education research, alumni mapping |
Reviews (2)
| Dataset | What it contains | Who it's for |
|---|---|---|
| Company Reviews | Employer reviews with culture, compensation, work-life balance | Employer intelligence |
| Entity Reviews | Reviews for products, software, institutions | Product/software ratings research |
Content (1)
| Dataset | What it contains | Who it's for |
|---|---|---|
| Posts | Public feed content with media, hashtags, engagement | Content analysis, social signals |
Each dataset includes rich data only—no stubs or placeholders. We filter for records that have the fields needed for meaningful use.
How to explore (no account needed)
You can explore sample data without signing up:
- Browse the overview: `/datasets` shows all datasets, entity counts, and links to each.
- View a specific dataset: for example, `/datasets/projects` shows the description, variables, a sample preview, and a link to the API.
- Download a sample: each dataset page links to a sample endpoint (e.g. `/api/datasets/projects.jsonld?limit=10`). Click to download JSON-LD or copy the URL.
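
For programmatic use, the sample endpoint can also be requested from a script. The sketch below uses only the Python standard library. The host in `BASE_URL`, the `USER_AGENT` string, and the function names are placeholders introduced for illustration; only the `/api/datasets/<name>.jsonld?limit=N` path pattern comes from this page, and the response shape is not documented here, so the parsed result is returned as-is.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: the host is not stated on this page; replace with the real one.
BASE_URL = "https://archade.example"
# Placeholder; identify your own project and a contact address.
USER_AGENT = "my-research-pipeline/0.1 (contact: you@example.org)"


def sample_url(dataset: str, limit: int = 10) -> str:
    """Build a JSON-LD sample URL following the pattern shown above."""
    if not 1 <= limit <= 100:
        raise ValueError("free samples allow up to 100 entities per request")
    query = urlencode({"limit": limit})
    return f"{BASE_URL}/api/datasets/{dataset}.jsonld?{query}"


def build_request(dataset: str, limit: int = 10) -> urllib.request.Request:
    """Prepare a GET request that carries a descriptive User-Agent."""
    return urllib.request.Request(
        sample_url(dataset, limit), headers={"User-Agent": USER_AGENT}
    )


def fetch_sample(dataset: str, limit: int = 10):
    """Download and parse a sample (performs a network request)."""
    with urllib.request.urlopen(build_request(dataset, limit), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_sample("projects")` would then request ten project records, once `BASE_URL` points at the real host.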
Formats:
- JSON-LD (default): Schema.org compliant, machine-readable
- JSON: same structure, no `@context`
- CSV: flattened for spreadsheets and analysis
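
To make the relationship between the three formats concrete, here is a small sketch that turns a JSON-LD record into its plain-JSON form and then into a flat row. The example record and its field names are hypothetical, and the dot-separated column naming is an assumed convention; the actual CSV export may name and order columns differently.

```python
from typing import Any, Dict


def strip_context(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a JSON-LD record without its top-level @context."""
    return {key: value for key, value in record.items() if key != "@context"}


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dot-separated column names (assumed convention)."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, column))
        else:
            flat[column] = value
    return flat


# Hypothetical record; field names are illustrative, not the real schema.
example = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Studio",
    "address": {"addressLocality": "Oslo"},
}

plain = strip_context(example)  # JSON: same structure, no @context
row = flatten(plain)            # CSV-style row: one value per column
```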
For contributors: Your data, your control
If you're an Archade contributor (profile, company, projects, products):
- Your public data may appear in our datasets. We only include what you've made visible.
- Attribution matters — Credits, team links, and verification flow through. Your work is cited.
- Opt-out — See Dataset Opt-Out if you want your data excluded.
Contributor interests are paramount. Dataset consumers must respect copyright and intellectual property, and may use the data only under our Terms and Dataset License. We enforce these terms when they are violated.
For researchers: Samples vs extended access
Free samples (no account, no API key):
- Up to 100 entities per request
- 10 requests per minute per IP
- Enough to test pipelines, evaluate structure, and see what's included
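
One way to stay inside the free-sample limit of 10 requests per minute is to space requests evenly on the client side. The sketch below is a minimal throttle; the class name and its parameters are illustrative, and how the server actually counts its one-minute window is not stated on this page, so even spacing is a conservative assumption.

```python
import time
from typing import Callable, Optional

FREE_REQUESTS_PER_MINUTE = 10  # free-sample limit stated above


class Throttle:
    """Spaces calls evenly so a per-minute request limit is not exceeded."""

    def __init__(
        self,
        per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / per_minute  # seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None  # no request made yet

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.interval
        return delay
```

Create one `Throttle(FREE_REQUESTS_PER_MINUTE)` per process and call `wait()` before each request; with the free limit this leaves six seconds between calls.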
Extended access (API key required):
- Higher limits (1,000–100,000 entities depending on tier)
- Higher rate limits (30–500 requests per minute)
- CSV and JSON export
- See API Access for how to get an API key and what tiers mean
Embeddings (Commercial or Enterprise API key required):
- Research-tier keys do not grant access to the Embeddings dataset
- Contact legal@archade.app for Commercial or Enterprise access
Rate limiting and security:
- All dataset endpoints are protected by rate limiting (10–500 requests per minute depending on tier)
- Bot detection blocks automated scraping tools
- Per-endpoint rate limiting ensures fair usage across all datasets
- Use a descriptive User-Agent when making programmatic requests
Enterprise (custom scope, SLAs, support):
- Custom agreements
- Higher limits, webhooks, dedicated support
- Contact us directly: legal@archade.app
What's never included
We never expose:
- Email addresses
- Private messages or DMs
- Payment or billing data
- Account credentials
- Any hidden or direct personal identifiers
See Dataset License and How We Use Public Data for the full picture.
For researchers (no account)
If you're evaluating this data for research or product use, see For Researchers — why this data exists and how it's different.
Related help
- API Access — Rate limits, API keys, formats, tiers, errors, User-Agent
- How the Graph Works — How credits and relationships flow
- Verification — What verification means for profiles and companies
Related legal
- Dataset License — Permitted uses, prohibited uses, attribution
- Content License — How contributors grant rights
- Privacy — How we handle your data
