Methodology
Transparency is a core value. This page explains how data is collected, processed, and presented. Data Race does not modify, estimate, or editorialize the data — the goal is to present official figures as faithfully as possible.
Data Collection
All data is collected programmatically from official public APIs and data providers. Each dataset is fetched using Python scripts that connect directly to provider endpoints, ensuring reproducibility and traceability. No manual data entry is involved.
- Fetch raw data from official APIs and data providers (World Bank Open Data, FAOSTAT, Yahoo Finance, etc.)
- Validate response integrity — check for expected fields, data types, and completeness
- Store validated data in structured tabular format for the processing pipeline
- Log the source URL, fetch timestamp, and record count for each dataset
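The validation and logging steps above can be sketched roughly as follows. This is a minimal illustration, not the actual Data Race scripts: the field names (`country_code`, `year`, `value`), the function names, and the example URL are all assumptions made for the sketch, and the network fetch itself is left out.

```python
from datetime import datetime, timezone

# Hypothetical schema: real providers each use their own field names and types.
EXPECTED_FIELDS = {
    "country_code": str,
    "year": int,
    "value": (int, float, type(None)),  # None marks a gap in the source
}

def validate_records(records):
    """Return the records unchanged if each has the expected fields and types."""
    for i, rec in enumerate(records):
        for field, types in EXPECTED_FIELDS.items():
            if field not in rec:
                raise ValueError(f"record {i}: missing field {field!r}")
            if not isinstance(rec[field], types):
                raise ValueError(f"record {i}: unexpected type for {field!r}")
    return records

def build_fetch_log(source_url, records):
    """Capture the source URL, fetch timestamp (UTC), and record count."""
    return {
        "source_url": source_url,
        "fetched_at": datetime.now(timezone.utc).isoformat(),
        "record_count": len(records),
    }

# Usage with already-fetched rows; the URL is a placeholder, not a real endpoint.
rows = validate_records([{"country_code": "USA", "year": 2020, "value": 1.5}])
log_entry = build_fetch_log("https://example.org/indicator", rows)
```

Raising on the first malformed record, rather than silently dropping it, keeps a bad response from reaching the processing pipeline unnoticed.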
Data Processing
Raw data undergoes a deterministic processing pipeline. Each step is automated and produces the same output given the same input. Interpolation and estimation are intentionally avoided — if a data point is missing from the source, it remains missing in the dataset.
- Entity identification — ISO 3166-1 alpha-3 for countries (e.g., USA, JPN, DEU), ticker symbols for companies (e.g., AAPL, MSFT)
- Missing value handling: gaps in the source remain gaps in the dataset and are never filled with estimates
- Regional classification using a fixed mapping of countries to 13 geographic regions
- Rank calculation for each time period — both global and within-region rankings
- Output to Apache Parquet format for efficient browser-based querying
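As an illustration of the missing-value and regional-classification steps, the sketch below processes a few rows in plain Python. The region names, the three-country mapping excerpt, and the row layout are assumptions for the example; the real mapping covers 13 regions, and the Parquet output step is omitted here because it needs a third-party library.

```python
# Hypothetical excerpt of the fixed country-to-region mapping (names assumed).
REGION_BY_COUNTRY = {
    "USA": "North America",
    "JPN": "East Asia",
    "DEU": "Western Europe",
}

def process(records):
    """Drop rows with no reported value and attach a region to the rest."""
    rows = []
    for rec in records:
        if rec["value"] is None:
            continue  # a gap in the source stays a gap: no row, no estimate
        # A KeyError here means the mapping is missing a country code.
        rows.append({**rec, "region": REGION_BY_COUNTRY[rec["country_code"]]})
    # Sort so that identical input always produces identical output order.
    return sorted(rows, key=lambda r: (r["year"], r["country_code"]))

raw = [
    {"country_code": "JPN", "year": 2020, "value": None},
    {"country_code": "USA", "year": 2020, "value": 2.0},
    {"country_code": "DEU", "year": 2019, "value": 1.0},
]
processed = process(raw)  # JPN 2020 is absent; DEU 2019 and USA 2020 remain
```

Because the function has no hidden state and sorts its output, running it twice on the same input gives the same result, which is the determinism property described above.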
Ranking Calculation
Rankings are recalculated independently for each time period. Only countries with reported data for that specific period are included in the ranking. This means a country's rank may change not only because its value changed, but also because other countries started or stopped reporting.
- Global Rank: Position among all countries with data for that specific period
- Regional Rank: Position within the country's assigned geographic region
- Rank Change: Previous period's rank minus the current period's rank, so a positive number means the country moved up (e.g., moving from 5th to 3rd is +2)
- Year-over-Year Change: Percentage change in the underlying value from the previous period
- Ranking Type: Each dataset is classified as 'best' (higher is better, e.g., GDP), 'worst' (higher is worse, e.g., CO2 emissions), or 'neutral' (no inherent direction, e.g., population)
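The per-period calculation can be sketched as below. Two points are assumptions rather than documented behavior: rank 1 is given to the largest value, and tied values share a rank (so the next rank is skipped). The function names are also illustrative.

```python
def rank_period(values):
    """Map country code -> rank for one period; rank 1 is the largest value."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    for position, (code, value) in enumerate(ordered, start=1):
        if position > 1 and value == ordered[position - 2][1]:
            ranks[code] = ranks[ordered[position - 2][0]]  # tie shares a rank
        else:
            ranks[code] = position
    return ranks

def rank_change(prev_ranks, curr_ranks, code):
    """Previous rank minus current rank; None if unranked in either period."""
    if code not in prev_ranks or code not in curr_ranks:
        return None
    return prev_ranks[code] - curr_ranks[code]

def yoy_change(prev_value, curr_value):
    """Percentage change from the previous period; None if it is undefined."""
    if prev_value is None or curr_value is None or prev_value == 0:
        return None
    return (curr_value - prev_value) / prev_value * 100

# JPN reports only in 2020, so DEU drops a place although its value is unchanged.
values_2019 = {"USA": 100, "DEU": 50}
values_2020 = {"USA": 100, "JPN": 80, "DEU": 50}
ranks_2019 = rank_period(values_2019)
ranks_2020 = rank_period(values_2020)
print(rank_change(ranks_2019, ranks_2020, "DEU"))  # -1
print(rank_change(ranks_2019, ranks_2020, "JPN"))  # None
print(yoy_change(values_2019["DEU"], values_2020["DEU"]))  # 0.0
```

A regional rank follows the same pattern: pass `rank_period` only the countries assigned to one region.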
Visualization
Visualizations run entirely in the browser, using DuckDB-Wasm to run SQL queries against Parquet files. Queries execute locally, so no user data is sent to a server. This architecture supports fast load times, offline capability, and data privacy.
- Bar Chart Race: Animated country rankings showing how positions change over time
- Line Chart: Historical time series with interactive hover tooltips for detailed values
- Pie Chart: Proportional share analysis showing how the global total is distributed
- World Map: Geographic heatmap with color-coded scales for spatial patterns
- Data Table: Sortable rankings with values, rank changes, and year-over-year comparisons
- All charts support regional filtering, country pinning, and period range selection
Data Quality
We prioritize accuracy and transparency over completeness. Rather than filling gaps with estimates, we show only what official sources report. Every visualization links back to its original data source so users can verify the underlying numbers.
- Only use data from established organizations and data providers with documented methodologies
- Preserve original values exactly as reported — no rounding, adjustment, or normalization
- Missing data is excluded rather than estimated or interpolated
- Each dataset page displays the source organization and a direct link to the original data
- Data coverage (number of countries and time range) is shown on every visualization
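The coverage figures shown with each visualization could be derived along these lines. This is a sketch under the assumption that rows carry `country_code` and `year` fields and that rows with missing values have already been dropped; the function name is illustrative.

```python
def coverage(rows):
    """Count distinct countries and find the first and last reported year."""
    if not rows:
        return {"countries": 0, "first_year": None, "last_year": None}
    years = [r["year"] for r in rows]
    return {
        "countries": len({r["country_code"] for r in rows}),
        "first_year": min(years),
        "last_year": max(years),
    }

rows = [
    {"country_code": "USA", "year": 1960, "value": 1.0},
    {"country_code": "USA", "year": 2020, "value": 2.0},
    {"country_code": "DEU", "year": 2005, "value": 3.0},
]
print(coverage(rows))  # {'countries': 2, 'first_year': 1960, 'last_year': 2020}
```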
Known Limitations
No dataset is perfect. Users should consider these limitations when interpreting the visualizations:
- Data availability varies significantly by country and time period: some nations have data from 1960, while others have data only from the 2000s
- Source organizations may revise historical data retroactively, which means past values can change between updates
- Methodological changes by source organizations (e.g., GDP calculation method changes) may affect year-over-year comparability
- Small countries, territories, and newly independent nations often have incomplete or missing data
- Rankings reflect only countries that reported data for a given period — absence from a ranking does not mean zero value