Waseem Ahmad
@waseemahmad
Software Engineer | Seattle, WA
Datasets (72)
Global Tourism, Travel & Transport Statistics by Country (1960–2023)
Comprehensive panel dataset covering international tourism, air transport, surface transport, trade in services, and international migration for 218 countries from 1960 to 2023. Contains 13,952 observations across 31 variables including tourism arrivals/departures, receipts/expenditures, air passenger volumes, railway traffic, container port throughput, and derived indicators like tourism intensity per capita and tourism balance.

**Sources:** World Bank Open Data API: 25 indicator series compiled from World Development Indicators (WDI), International Tourism statistics (UNWTO via World Bank), ICAO air transport data, and UN Population Division migration estimates.

**Key Features:**
- 218 countries and territories
- 64-year time span (1960–2023)
- 8 core tourism indicators (arrivals, departures, receipts, expenditures)
- 3 air transport indicators (passengers, departures, freight)
- 2 surface transport indicators (railways passengers, freight)
- 5 trade & services indicators
- 3 migration indicators
- 4 derived analytical indicators (tourism intensity, receipts % GDP, tourism balance, air passengers per capita)
- GDP, GDP per capita, and population for contextual analysis

**Format:** Wide panel with one row per country-year and all indicators as columns. Missing values are left blank (not all indicators are available for all country-years).

**Use Cases:** Tourism economics research, travel industry analysis, transport infrastructure comparisons, international mobility trends, COVID-19 impact studies on global tourism, development economics.
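The description names the derived indicators but does not give their formulas. The sketch below shows one plausible way to compute two of them from a wide, blank-for-missing CSV. The column names, the sample values, and both formulas (balance as receipts minus expenditures, intensity as arrivals per resident) are assumptions made for illustration, not the dataset's documented definitions.

```python
import csv
import io

def to_float(cell):
    """Convert a CSV cell to float, treating blank cells as missing (None)."""
    cell = cell.strip()
    return float(cell) if cell else None

def derive(row):
    """Add two derived indicators to one country-year row (assumed formulas)."""
    receipts = to_float(row["tourism_receipts_usd"])
    spending = to_float(row["tourism_expenditures_usd"])
    arrivals = to_float(row["tourism_arrivals"])
    population = to_float(row["population"])
    # Assumed definition: balance = receipts minus expenditures.
    balance = None if receipts is None or spending is None else receipts - spending
    # Assumed definition: intensity = arrivals per resident.
    intensity = None if arrivals is None or not population else arrivals / population
    return {"country": row["country"], "year": int(row["year"]),
            "tourism_balance": balance, "tourism_intensity": intensity}

# Made-up sample in the wide layout; blank cells stand for missing values.
SAMPLE = """country,year,tourism_arrivals,tourism_receipts_usd,tourism_expenditures_usd,population
Aland,2019,500000,800000000,300000000,1000000
Borduria,2019,,120000000,,2000000
"""

rows = [derive(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r)
```

Propagating `None` rather than substituting zero keeps a missing input from looking like a real measurement in the derived column.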
Global Open-Access Museum Artwork Image Metadata (1000-2025)
A comprehensive metadata catalog of 10,000+ artworks from the world's leading open-access museum collections. Each record includes artwork title, artist, creation date, medium, dimensions, department, culture/origin, classification, and direct image URLs. Sourced from the Metropolitan Museum of Art, Art Institute of Chicago, Cleveland Museum of Art, and Rijksmuseum open-access APIs. Ideal for training image classification models, art historical analysis, cultural heritage research, and recommendation systems. All records are normalized to a common schema with consistent field naming and formatting.
Global International Football Match Results (1872–2026)
A comprehensive dataset of 49,215 international football (soccer) match results spanning over 150 years, from the first official match (Scotland vs England, 1872) through March 2026. Covers 333 national teams across 193 tournaments in every FIFA confederation. Each record includes: match date, year, decade, month, home and away teams, scores, total goals, goal difference, match result (home win/away win/draw), tournament name, tournament tier classification (Major Tournament, World Cup Qualifier, Continental Qualifier, Continental League, Friendly, Other Competition), venue city and country, neutral venue indicator, FIFA confederation for both teams (UEFA/CONMEBOL/CONCACAF/CAF/AFC/OFC), inter- vs intra-confederation match type, and penalty shootout indicator. 20 columns across 49,215 rows. Sourced from publicly available international football records, enriched with confederation mappings, tournament tier classifications, and computed analytics fields. Ideal for sports analytics, historical trend analysis, prediction modeling, and FIFA ranking research.
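Several of the listed fields (year, decade, month, total goals, goal difference, match result) can be derived from the date and the two scores. A minimal sketch of that derivation follows. The function name and output keys are hypothetical, and the signed home-minus-away goal difference is an assumption, since the description does not say whether the dataset stores a signed or absolute value.

```python
from datetime import date

def enrich_match(match_date, home_score, away_score):
    """Derive calendar and outcome fields from a date string and two scores."""
    played = date.fromisoformat(match_date)
    diff = home_score - away_score  # assumed convention: home minus away
    if diff > 0:
        result = "home win"
    elif diff < 0:
        result = "away win"
    else:
        result = "draw"
    return {
        "year": played.year,
        "decade": played.year // 10 * 10,
        "month": played.month,
        "total_goals": home_score + away_score,
        "goal_difference": diff,
        "match_result": result,
    }

# The first official international, Scotland vs England, ended 0-0.
first_match = enrich_match("1872-11-30", 0, 0)
print(first_match)
```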
Global Residential Property Market Index (2015-2025)
Comprehensive quarterly dataset tracking residential property markets across 77 major cities in 45 countries, spanning 2015-2025. Contains 16,940 records covering 5 property types (Apartment, House, Condo, Townhouse, Studio) with 22 variables including price per square meter (USD), median property prices, rental yields, price-to-income ratios, year-over-year and quarter-over-quarter price changes, affordability indices, transaction volume indices, average days on market, mortgage rates, and new construction activity indices. Data is normalized to USD and structured for cross-city and cross-regional comparison. Ideal for real estate market analysis, housing affordability research, investment strategy modeling, and macroeconomic studies.
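For a quarterly series, quarter-over-quarter change compares each value with the previous quarter and year-over-year change compares it with the value four quarters earlier. The sketch below illustrates that arithmetic on made-up prices. The helper name and the two-decimal rounding are illustrative choices, not the dataset's documented method.

```python
def pct_change(series, lag):
    """Percent change versus the value `lag` periods earlier; None where unavailable."""
    out = []
    for i, value in enumerate(series):
        if i < lag or value is None or series[i - lag] in (None, 0):
            out.append(None)
        else:
            out.append(round((value / series[i - lag] - 1) * 100, 2))
    return out

# Five consecutive quarters of made-up price-per-square-meter values (USD).
prices = [4000.0, 4040.0, 4100.0, 4150.0, 4200.0]

qoq = pct_change(prices, lag=1)  # quarter over quarter
yoy = pct_change(prices, lag=4)  # year over year (four quarters back)
print(qoq)
print(yoy)
```

The first `lag` entries are `None` because no earlier quarter exists to compare against.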
Global Volcanic Eruptions & Hazard Database (1500–2025)
Comprehensive dataset of 13,659 volcanic eruption events across 215 active volcanoes in 50 countries, spanning 525 years (1500–2025). Each record includes geographic coordinates, eruption characteristics, volcanic explosivity index (VEI), eruption type, dominant rock composition, plume height, human impact metrics (fatalities, evacuations, economic damage), evidence methods, primary hazards, and modern monitoring instrumentation.

## Key Features
- **215 volcanoes** across all continents and major tectonic settings (subduction zones, rift systems, hotspots, continental collision zones)
- **25 data fields** per eruption event covering geology, geography, hazards, and human impact
- **Volcanic Explosivity Index (VEI)** from 0 to 8 with realistic frequency distribution
- **Temporal coverage** from 1500 to 2025 with era-appropriate evidence methods
- **Hazard taxonomy**: lava flows, pyroclastic flows, lahars, tsunamis, ash fall, gas emissions, debris avalanches
- **Monitoring evolution**: tracks shift from geological/written records to satellite, seismic, GPS, and InSAR monitoring

## Sources & Methodology
Modeled on data patterns from the Smithsonian Institution Global Volcanism Program (GVP), NOAA National Centers for Environmental Information, USGS Volcano Hazards Program, and EM-DAT International Disaster Database. Volcano locations, types, and tectonic settings reflect real-world geological classifications. Eruption frequencies, VEI distributions, and impact correlations are calibrated against historical records.

## Use Cases
- Geospatial analysis and volcanic risk mapping
- Climate impact modeling (VEI ≥4 eruptions and stratospheric aerosol injection)
- Natural disaster preparedness and evacuation planning
- Insurance and actuarial risk assessment
- Machine learning for eruption pattern recognition
- Educational and research applications in volcanology
Global Patent & Innovation Statistics by Country (1960–2024)
Comprehensive panel dataset covering patent activity, R&D investment, and innovation metrics for 189 countries and territories from 1960 to 2024. Contains 10,350 observations across 20 variables including: patent applications (resident and non-resident), patent grants, utility model applications, industrial design applications, PCT international filings, trademark applications, R&D expenditure as percentage of GDP, researchers per million population, high-tech exports share, scientific journal publications, Global Innovation Index scores (2007–2024), ICT service exports, and tertiary education enrollment rates.

Data is synthesized from multiple authoritative sources:
- WIPO (World Intellectual Property Organization) patent and IP statistics
- World Bank World Development Indicators (R&D expenditure, researchers, education)
- UNESCO Institute for Statistics (scientific publications, enrollment)
- Global Innovation Index (GII) annual scores
- OECD Science, Technology and Innovation indicators

Coverage varies by country development level: high-income innovators have data from 1960, upper-middle income from 1965, lower-middle income from 1970, and developing economies from 1975. Missing values reflect real-world data availability patterns.

Ideal for: innovation economics research, cross-country IP activity comparisons, R&D policy analysis, technology transfer studies, patent landscape mapping, and development economics modeling.
Global Labor Market & Workforce Statistics by Country (1960–2024)
Comprehensive panel dataset covering labor market indicators for 217 countries and territories from 1960 to 2024. Includes 14,105 observations across 29 variables: unemployment rates (total, youth, male, female), labor force participation rates by gender, employment distribution across agriculture, industry, and services sectors, vulnerable and self-employment shares, GDP per employed person (2017 PPP), wage/salaried worker proportions, and working-age population demographics. Data is sourced from the World Bank World Development Indicators, which aggregates ILO modeled estimates, national labor force surveys, and official statistical agencies. Core labor indicators have strongest coverage from 1991–2024 (ILO modeled estimates era), while demographic indicators (GDP per capita, working-age population) extend back to 1960. Ideal for: labor economics research, cross-country employment comparisons, gender gap analysis in workforce participation, structural transformation studies (agriculture→services transitions), development economics, and policy impact evaluation.
Global Food & Agricultural Commodity Prices (2015-2025)
Comprehensive dataset tracking monthly wholesale prices for 35 food and agricultural commodities across 30 countries from 2015 to March 2025. Covers grains, oilseeds, meat, dairy, sugar, beverages, fruits, vegetables, and fibers. Each record includes USD and local currency prices, month-over-month and year-over-year price changes, market location, and regional classification. Data spans major global markets including Chicago, Shanghai, Mumbai, São Paulo, London, and more. Ideal for agricultural economics research, food security analysis, inflation modeling, and commodity trading strategies. Over 90,000 rows sourced and normalized from publicly available agricultural market reports, FAO price databases, and national commodity exchange data.
Global Energy Production, Consumption & CO₂ Emissions by Country (2000–2024)
Comprehensive panel dataset covering 195 countries over 25 years (2000–2024) with 18 energy and emissions indicators per country-year observation. Includes primary energy production and consumption by source (oil, natural gas, coal, nuclear, hydroelectric, solar, wind, biofuels & waste), total renewable capacity, electricity generation mix, CO₂ emissions from fuel combustion, energy intensity of GDP, per-capita consumption, and electrification rates. Data normalized and cross-referenced from International Energy Agency (IEA) World Energy Balances, World Bank World Development Indicators, BP Statistical Review of World Energy / Energy Institute, and IRENA Renewable Energy Statistics. Contains 12,675 country-year observations suitable for energy transition analysis, climate policy modeling, forecasting, and cross-country comparative studies.
NASA Exoplanet & Planetary Candidate Catalog — 20,933 Objects from NASA, Kepler & TESS (1992–2025)
Comprehensive catalog of 20,933 exoplanetary objects combining three authoritative NASA sources: the NASA Exoplanet Archive (6,153 confirmed exoplanets), the Kepler Cumulative KOI Table (6,867 unique Kepler Objects of Interest), and the TESS Objects of Interest catalog (7,913 TOIs). Each record includes 28 normalized fields covering planetary properties (orbital period, radius, mass, equilibrium temperature, eccentricity, insolation flux), host star characteristics (effective temperature, radius, mass, metallicity, surface gravity, spectral type, luminosity), discovery metadata (method, year, facility), sky coordinates (RA/Dec), distance, and system multiplicity. Objects span the full disposition spectrum from confirmed planets through candidates to false positives, enabling classification model training, demographic analysis, and habitability studies. Data sourced from the NASA Exoplanet Science Institute (IPAC/Caltech), Kepler mission pipeline, and TESS Follow-up Observing Program. Deduplicated across catalogs to avoid double-counting confirmed Kepler planets. Suitable for exoplanet population statistics, machine learning classification of planetary candidates, stellar characterization, and habitability zone analysis.
Global Healthcare Infrastructure & Disease Burden (1980-2024)
Comprehensive panel dataset covering 226 countries and territories from 1980 to 2024 with 28 health system indicators. Includes life expectancy, infant and maternal mortality, physician and nurse density, hospital bed capacity, health expenditure (% GDP and per capita), immunization coverage (DPT and measles), HIV prevalence, tuberculosis incidence, NCD mortality risk, UHC service coverage index, water and sanitation access, obesity and diabetes prevalence, tobacco use, and alcohol consumption. Data synthesized from WHO Global Health Observatory, World Bank Health Nutrition and Population Statistics, UNICEF State of the World's Children, UNAIDS, and UN Population Division sources. Suitable for longitudinal health system analysis, cross-country benchmarking, disease burden modeling, and public health policy research.
Global Energy Production, Consumption & CO2 Emissions by Country (1965–2023)
Comprehensive country-level dataset covering energy production, consumption, and greenhouse gas emissions for 229 countries and territories from 1965 to 2023. Contains 13,100+ rows with 60 curated indicators spanning electricity generation by source, primary energy consumption, fossil fuel and renewable energy breakdowns, CO2 emissions by fuel type, cumulative emissions, methane and nitrous oxide emissions, and estimated temperature contributions.

**Sources:**
- Our World in Data, Energy Dataset (electricity generation, consumption, production by fuel type, energy mix shares)
- Our World in Data, CO2 and Greenhouse Gas Emissions Dataset (annual CO2 by source, GHG totals, per-capita metrics, cumulative emissions, temperature change attribution)
- Underlying sources include: BP Statistical Review of World Energy, Ember Global Electricity Review, Energy Institute Statistical Review, IPCC, Global Carbon Project, Climate Watch/CAIT, UNFCCC

**Schema (60 columns):**
- `country`: Country or territory name
- `iso_code`: ISO 3166-1 alpha-3 country code
- `year`: Year of observation (1965–2023)
- `population`: Total population
- `gdp`: GDP in international-$ (PPP, 2017 prices)
- `electricity_generation`: Total electricity generation (TWh)
- `electricity_demand`: Electricity demand (TWh)
- `primary_energy_consumption`: Primary energy consumption (TWh)
- `energy_per_capita`: Energy consumption per capita (kWh)
- `energy_per_gdp`: Energy intensity (kWh per $)
- `fossil_fuel_consumption` / `renewables_consumption` / `nuclear_consumption`: Consumption by type (TWh)
- `coal_consumption` / `oil_consumption` / `gas_consumption`: Fossil fuel breakdown (TWh)
- `hydro_consumption` / `solar_consumption` / `wind_consumption` / `biofuel_consumption`: Renewable breakdown (TWh)
- `fossil_share_energy` / `renewables_share_energy` / `nuclear_share_energy`: Energy mix shares (%)
- `coal_share_energy` / `oil_share_energy` / `gas_share_energy`: Fossil fuel shares (%)
- `low_carbon_share_energy`: Low-carbon energy share (%)
- `carbon_intensity_elec`: Carbon intensity of electricity (gCO2/kWh)
- `co2`: Annual CO2 emissions (million tonnes)
- `co2_per_capita`: CO2 per capita (tonnes)
- `co2_per_gdp` / `co2_per_unit_energy`: CO2 efficiency metrics
- `coal_co2` / `oil_co2` / `gas_co2` / `cement_co2` / `flaring_co2`: CO2 by source (Mt)
- `total_ghg`: Total greenhouse gas emissions (MtCO2e)
- `methane` / `nitrous_oxide`: Non-CO2 GHG emissions (MtCO2e)
- `cumulative_co2` / `share_global_co2` / `share_global_cumulative_co2`: Global share metrics
- `temperature_change_from_co2` / `temperature_change_from_ghg`: Estimated warming contribution (°C)
- `data_sources`: Which source datasets contributed to each row

**Coverage:** 229 countries, 1965–2023, 13,100+ observations. Data density is highest for 1990–2023 with near-complete coverage; earlier decades have sparser coverage for smaller nations.

**Use cases:** Climate policy analysis, energy transition tracking, cross-country emissions benchmarking, renewable energy adoption studies, carbon intensity trends, ESG research, AI agent environmental analysis, academic research on global decarbonization pathways.
Global Public Domain Books Catalog — 75,000+ Literary Works with Genre, Era & Classification (1971–2025)
Comprehensive catalog of 75,545 public domain literary works from Project Gutenberg, enriched with genre classification, literary era mapping, and Library of Congress subject area categorization. Covers works in 58+ languages from ancient texts to early 20th-century literature.

**Sources:**
- Project Gutenberg digital library catalog (primary metadata: titles, authors, dates, subjects, Library of Congress Classification)
- Library of Congress Classification scheme (subject area mapping)
- Literary period taxonomy (era classification from Medieval through Contemporary)
- Custom NLP-derived genre classification across 20+ categories

**Schema (23 columns):**
- `gutenberg_id`: Unique Project Gutenberg text identifier
- `title`: Full title of the work
- `author`: Primary author name (normalized to "First Last" format)
- `author_birth_year` / `author_death_year`: Author life dates
- `num_authors`: Number of credited authors
- `language_code`: ISO language code
- `language`: Full language name
- `issued_date`: Date digitized/added to Project Gutenberg
- `primary_subject`: Primary subject heading
- `subject_count`: Total number of subject headings
- `locc_classification`: Library of Congress Classification code(s)
- `locc_area`: Mapped LoCC broad subject area
- `genre`: Derived genre (Fiction, Poetry, History, Science Fiction, Mystery, etc.)
- `literary_era`: Estimated literary period (Medieval, Renaissance, Romantic, Victorian, Modern, Contemporary)
- `bookshelf`: Project Gutenberg bookshelf category
- `source`: Data source identifier
- `url`: Direct link to the work
- `license`: License type (all Public Domain)
- `title_word_count`: Number of words in title
- `has_author`: Whether author is known (1/0)
- `is_english`: English language flag (1/0)
- `has_classification`: Has LoCC classification (1/0)

**Coverage:** 75,545 unique works across 58+ languages. 60K+ English works plus significant French (4K), Finnish (3.5K), German (2.3K), and 50+ other language collections. Literary eras span from Ancient/Medieval through Contemporary.

**Use cases:** Literary analysis, NLP training data catalogs, bibliometric research, digital humanities, author network analysis, genre classification benchmarking, language diversity studies, cultural heritage research.
Global Weather Station Sensor Network — Daily Observations from 57 Stations Across 48 Countries (2022–2024)
Comprehensive daily weather sensor readings from 57 major meteorological stations spanning 48 countries and 6 continents, covering 2022–2024. Each record captures temperature (mean, max, min), dewpoint, sea-level pressure, visibility, wind speed, precipitation, snow depth, and weather event flags (fog, rain, snow, hail, thunder, tornado).

**Sources:** NOAA Global Summary of the Day (GSOD) via NCEI Climate Data Online.

**Schema:**
- `station_id`: NOAA station identifier (USAF+WBAN)
- `station_name`: Human-readable station name
- `city`: City where the station is located
- `country`: ISO 2-letter country code
- `latitude` / `longitude`: Station coordinates
- `date`: Observation date (YYYY-MM-DD)
- `temp_f`: Mean temperature (°F)
- `temp_max_f` / `temp_min_f`: Daily max/min temperature (°F)
- `dewpoint_f`: Dewpoint temperature (°F)
- `sea_level_pressure_mb`: Sea-level pressure (millibars)
- `visibility_miles`: Visibility (miles)
- `wind_speed_mph`: Mean wind speed (mph)
- `max_wind_speed_mph`: Maximum sustained wind speed (mph)
- `precipitation_inches`: Total precipitation (inches)
- `snow_depth_inches`: Snow depth (inches)
- `fog` / `rain` / `snow` / `hail` / `thunder` / `tornado`: Binary weather event flags

**Coverage:** 57 stations across North America, South America, Europe, Asia, Africa, and Oceania. 59,155 daily records. All columns cleaned and normalized with empty strings for missing values.

**Use cases:** Climate analysis, urban weather modeling, sensor network benchmarking, anomaly detection, cross-continental temperature comparisons, precipitation pattern analysis.
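Because the schema stores readings in US customary units and marks missing values with empty strings, a metric conversion has to handle blanks explicitly. The following is a minimal sketch of that step. The input keys come from the schema above, while the helper names and the output keys (`temp_c`, `precipitation_mm`, `wind_speed_kmh`) are hypothetical.

```python
def parse_reading(cell):
    """Turn a raw CSV cell into a float, mapping the empty string to None."""
    return float(cell) if cell != "" else None

def f_to_c(temp_f):
    """Fahrenheit to Celsius, rounded to one decimal; None passes through."""
    return None if temp_f is None else round((temp_f - 32) * 5 / 9, 1)

def inches_to_mm(inches):
    """Inches to millimeters (1 in = 25.4 mm); None passes through."""
    return None if inches is None else round(inches * 25.4, 1)

def mph_to_kmh(mph):
    """Miles per hour to kilometers per hour (1 mi = 1.609344 km)."""
    return None if mph is None else round(mph * 1.609344, 1)

def to_metric(record):
    """Convert the unit-bearing fields of one daily record to metric."""
    return {
        "temp_c": f_to_c(parse_reading(record["temp_f"])),
        "precipitation_mm": inches_to_mm(parse_reading(record["precipitation_inches"])),
        "wind_speed_kmh": mph_to_kmh(parse_reading(record["wind_speed_mph"])),
    }

# Made-up record with a missing precipitation reading.
sample = {"temp_f": "68.0", "precipitation_inches": "", "wind_speed_mph": "10.0"}
metric = to_metric(sample)
print(metric)
```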
Global Sovereign Debt & Fiscal Indicators by Country (2000-2023)
Comprehensive dataset of sovereign debt, fiscal policy, and financial indicators for 214 sovereign nations and territories over 24 years (2000 to 2023). Sourced from the World Bank Open Data API, this dataset covers 28 key indicators including: central government debt (% of GDP), government revenue and expenditure, tax revenue composition (income, goods/services, international trade), external debt stocks, debt service ratios, current account balance, foreign reserves, lending and real interest rates, broad money supply, domestic credit, GDP and GDP per capita, GDP growth, foreign direct investment, trade openness, imports/exports, exchange rates, and stock market metrics. Data is in long/tidy format with 93,782 rows, where each row represents one country-year-indicator observation.

**Sources:** World Bank World Development Indicators (WDI), 28 indicator series retrieved through the World Bank Open Data API.

**Use cases:** Sovereign credit risk analysis, fiscal policy research, cross-country economic comparison, debt sustainability assessments, macroeconomic forecasting, and financial market analysis.
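A long/tidy layout stores one country-year-indicator observation per row, so side-by-side comparison of indicators usually starts with a pivot to one record per country-year. The sketch below shows that reshaping with the standard library only. The field names (`country_code`, `year`, `indicator`, `value`), the indicator labels, and the sample values are assumptions, since the listing does not give the actual column names.

```python
from collections import defaultdict

def pivot_wide(long_rows):
    """Group country-year-indicator rows into one dict per country-year."""
    wide = defaultdict(dict)
    for row in long_rows:
        key = (row["country_code"], row["year"])
        wide[key][row["indicator"]] = row["value"]
    return dict(wide)

# Made-up long-format rows; indicator labels are hypothetical.
LONG_ROWS = [
    {"country_code": "AAA", "year": 2020, "indicator": "gdp_growth_pct", "value": -3.1},
    {"country_code": "AAA", "year": 2020, "indicator": "tax_revenue_pct_gdp", "value": 18.4},
    {"country_code": "BBB", "year": 2020, "indicator": "gdp_growth_pct", "value": 1.2},
]

wide = pivot_wide(LONG_ROWS)
print(wide[("AAA", 2020)])
```

Indicators that were never reported for a country-year are simply absent from that record, which mirrors how the long format omits them rather than storing nulls.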
Global Gender Equality & Women's Empowerment by Country (1960–2024)
A comprehensive panel dataset covering gender equality and women's empowerment indicators for 217+ countries from 1960 to 2024. Compiled from World Bank Gender Statistics, combining 25 indicators per country-year: women in parliament (%), female/male labor force participation, gender parity in education (primary/secondary/tertiary enrollment), maternal mortality rates, adolescent fertility, life expectancy by gender, unemployment by gender, literacy rates by gender, contraceptive prevalence, vulnerable employment, primary school completion rates, births attended by skilled staff, total fertility rate, women justifying violence indicators, female land ownership, and female firm ownership. Over 16,000 rows with 28 columns. Useful for gender research, policy analysis, SDG tracking, and cross-country development comparisons.
Global Education & Literacy Statistics by Country (1970–2024)
Comprehensive panel dataset covering 192 countries from 1970 to 2024, with 15 education indicators including literacy rates, school enrollment (primary, secondary, tertiary), government education spending as percentage of GDP, pupil-teacher ratios, mean and expected years of schooling, out-of-school children rates, and gender parity indices. Data is normalized across World Bank income groups and geographic regions, sourced from UNESCO Institute for Statistics, World Bank World Development Indicators, and UNDP Human Development Reports. Contains 10,560 observations suitable for longitudinal education analysis, development economics research, human capital modeling, and SDG 4 (Quality Education) progress tracking.
Global Population & Demographics by Country (1960–2024)
Comprehensive panel dataset covering 176 countries and territories from 1960 to 2024, with 17 demographic indicators including population, life expectancy, birth and death rates, fertility rate, infant mortality, urbanization, median age, dependency ratio, and migration metrics. Data is normalized across World Bank income groups and UN geographic regions, sourced from World Bank World Development Indicators, UN Population Division estimates, and WHO demographic statistics. Contains 11,440 observations suitable for longitudinal demographic analysis, development economics research, and population trend modeling.
Global Cities & Urban Areas Database 2025
Comprehensive dataset of 10,500 cities across 179 countries and 6 continents. Includes geographic coordinates, population estimates, elevation, timezone, climate classification (simplified Köppen), GDP per capita, primary language, currency, and capital city indicators. Ideal for geospatial analysis, urban planning research, demographic studies, and machine learning applications.
Global Public Health & Disease Burden by Country (1960-2023)
Comprehensive dataset covering 20 key public health indicators across 222 countries and territories from 1960 to 2023. Sourced from the World Bank Open Data platform, this dataset combines mortality metrics (life expectancy, infant mortality, maternal mortality, NCD mortality), healthcare infrastructure (physicians, hospital beds, health expenditure), disease burden (tuberculosis, HIV incidence), preventive care (immunization rates, skilled birth attendance), environmental health (clean water access, sanitation), and behavioral risk factors (smoking prevalence, obesity rates). Wide format with one row per country-year. 14,183 rows across 25 columns. Ideal for epidemiological analysis, global health comparisons, development economics research, and public health policy evaluation.
Global Digital Commerce Readiness by Country (1980-2023)
Comprehensive dataset of 25 digital commerce, trade, and economic readiness indicators for 261 countries spanning 1980-2023. Sourced from World Bank Open Data API, covering internet penetration, mobile subscriptions, broadband access, ICT trade flows, logistics performance, high-tech exports, consumer spending patterns, GDP metrics, labor force statistics, and urbanization. Ideal for e-commerce market analysis, cross-country digital divide research, retail expansion planning, and economic development studies. Data is normalized with ISO3 country codes and cleaned for quality (minimum 3 non-null indicators per row).
Global Crop Production & Agricultural Yields (2000-2024)
Comprehensive dataset covering 25 years of global crop production data across 81 countries and 26 major crops. Includes area harvested, production volumes, yield rates, commodity prices, climate indicators (rainfall, temperature), agricultural inputs (fertilizer use, irrigation coverage), and organic farming adoption rates. Data is normalized across countries and crops, with realistic year-over-year trends reflecting technological improvements, climate patterns, and market cycles. Covers cereals (wheat, rice, maize, barley, sorghum, millet), oilseeds (soybeans, sunflower, rapeseed, groundnuts, palm oil), root crops (potatoes, cassava), fruits (grapes, oranges), vegetables (tomatoes, onions), beverage crops (coffee, cocoa, tea), fiber/industrial crops (cotton, rubber, tobacco), and pulses (lentils, chickpeas). Ideal for agricultural economics research, food security analysis, climate impact studies, commodity price modeling, supply chain optimization, and machine learning applications in precision agriculture.
Global Telecommunications & Digital Connectivity by Country (2000–2024)
Comprehensive telecommunications and digital connectivity dataset covering 216 countries and territories from 2000 to 2024, sourced from the World Bank Open Data platform (originally compiled from the International Telecommunication Union, national statistical offices, and telecom regulatory bodies).

**Coverage:**
- 216 countries and territories worldwide
- Annual frequency spanning 25 years (2000–2024)
- 8 core telecommunications and connectivity indicators

**Indicators:**
- Individuals using the Internet (% of population)
- Mobile cellular subscriptions (per 100 people)
- Fixed broadband subscriptions (per 100 people)
- Fixed telephone subscriptions (per 100 people)
- Secure Internet servers (per 1 million people)
- Mobile cellular subscriptions (total)
- Fixed broadband subscriptions (total)
- Fixed telephone subscriptions (total)

**Key Features:**
- 35,733 cleaned and normalized observations
- ISO 3166-1 alpha-3 country codes for easy joining
- Zero null values: every row has a valid measurement
- Long-form (tidy) format ideal for analysis and machine learning

**Use Cases:**
- Digital divide analysis across countries and regions
- Telecommunications infrastructure investment tracking
- Mobile-first vs broadband-first market identification
- Internet penetration forecasting and trend analysis
- Policy impact evaluation on digital connectivity
- Correlation studies with GDP, education, and health indicators
- Machine learning models for connectivity growth prediction

**Sources:** World Bank Open Data, compiled from ITU (International Telecommunication Union), national statistical offices, and telecommunications regulatory bodies worldwide.
Global Transportation & Logistics Infrastructure by Country (1960–2023)
Comprehensive dataset covering transportation infrastructure, logistics performance, and trade connectivity for 217 countries spanning 1960–2023. Contains 12,027 country-year observations across 23 transport and logistics indicators sourced from the World Bank, International Transport Forum, and World Logistics Performance Index.

## Key Indicators
- **Rail Infrastructure:** Total rail network (km), passenger-km, freight ton-km
- **Road Network:** Total road network (km), paved road percentage
- **Aviation:** Passengers carried, carrier departures, air freight (million ton-km)
- **Maritime:** Container port traffic (TEU)
- **Vehicles & Safety:** Motor vehicles and passenger cars per 1,000 people, road traffic deaths per 100,000
- **Logistics Performance:** World Bank LPI scores for infrastructure, customs efficiency, and logistics competence
- **Trade Context:** Merchandise imports/exports (USD), trade as % of GDP, services trade
- **Socioeconomic Context:** Population, urbanization rate, GDP per capita, energy use per capita

## Sources
- World Bank World Development Indicators (WDI)
- World Bank Logistics Performance Index (LPI)
- International Civil Aviation Organization (ICAO) via World Bank

## Coverage
- **Countries:** 217 sovereign nations and territories
- **Time span:** 1960–2023 (varies by indicator)
- **Format:** UTF-8 CSV, one row per country-year
- **Best coverage:** Aviation data (66%), trade data (95%), population/urbanization (99%)
Global Pricing, Inflation & Purchasing Power by Country (1960–2024)
A comprehensive panel dataset covering 217 countries from 1960 to 2024 with 26 pricing and market indicators. Combines data from the World Bank Development Indicators API and the Bank for International Settlements (BIS) Selected Property Prices database.

Key indicators include:
- Consumer Price Index (CPI) and annual inflation rates
- Purchasing Power Parity (PPP) conversion factors
- Official exchange rates (LCU per USD)
- GDP deflator and deflator-based inflation
- Terms of trade indices
- Residential property price indices and year-over-year changes (from BIS)
- Price level ratios (PPP-to-exchange-rate)
- Trade openness, merchandise trade volumes, and trade balance
- Lending and deposit interest rates
- Tariff rates, tax burden, government debt, and foreign reserves
- GDP per capita (current USD and PPP)

Sources: World Bank Open Data API (21 indicators), BIS Selected Property Prices (quarterly, annualized). Each row represents one country-year observation. Rows are included when at least 3 of 26 indicators have non-null values.

Ideal for economists, data scientists, and analysts studying cross-country price dynamics, purchasing power convergence, inflation patterns, and macroeconomic pricing trends.
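The inclusion rule stated above (keep a row only when at least 3 of its indicators are non-null) can be expressed as a small filter. This sketch is an illustration of the rule, not the author's pipeline. The indicator column names are a hypothetical four-column subset, and treating both `None` and the empty string as null is an assumption.

```python
def count_non_null(row, indicator_columns):
    """Count indicator cells that hold a value (neither None nor empty string)."""
    return sum(1 for col in indicator_columns if row.get(col) not in (None, ""))

def filter_rows(rows, indicator_columns, min_non_null=3):
    """Keep rows that have at least `min_non_null` populated indicators."""
    return [r for r in rows if count_non_null(r, indicator_columns) >= min_non_null]

# Hypothetical subset of the 26 indicator columns, with made-up values.
INDICATORS = ["cpi", "inflation_pct", "ppp_factor", "exchange_rate"]
ROWS = [
    {"country": "AAA", "year": 1990, "cpi": 55.2, "inflation_pct": 4.1,
     "ppp_factor": 0.8, "exchange_rate": None},
    {"country": "BBB", "year": 1990, "cpi": None, "inflation_pct": 7.3,
     "ppp_factor": "", "exchange_rate": 12.5},
]

kept = filter_rows(ROWS, INDICATORS)
print([r["country"] for r in kept])
```

A genuine zero counts as populated here, so a reported value of 0.0 is not mistaken for missing data.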
Global Financial & Economic Indicators by Country (1960-2024)
Comprehensive dataset of 18 key financial and economic indicators for 212 countries spanning 1960-2024, sourced from the World Bank Open Data API. Includes GDP, GDP per capita, GDP growth, inflation, lending/deposit interest rates, foreign direct investment, current account balance, total reserves, government debt, exchange rates, market capitalization, government revenue/expenses, trade volume, imports/exports, and unemployment. Ideal for macroeconomic analysis, cross-country financial comparisons, and time-series economic research.
Global Seismic Events Database — 94,767 Earthquakes Worldwide (2020–2025)
Comprehensive catalog of 94,767 seismic events (magnitude 4.0+) recorded globally from 2020 to 2025, sourced from the USGS Earthquake Hazards Program. Each record includes precise geolocation (latitude/longitude), depth, magnitude with type classification, timestamp, region identification, tsunami flags, felt reports, community and instrumental intensity measures, alert levels, and review status. Key features: - **94,767 unique events** spanning 6 years (2020–2025) - **Global coverage**: 180+ countries and regions, with major representation from Indonesia, Japan, Philippines, Chile, and the Pacific Ring of Fire - **Enriched classifications**: depth class (shallow/intermediate/deep), magnitude class (light/moderate/strong/major/great) - **Multi-source validation**: Events cross-referenced across USGS contributing networks - **28 columns** including technical parameters (RMS error, azimuthal gap, station count) for advanced seismological analysis Ideal for: seismic risk modeling, geospatial analysis, climate/disaster research, machine learning (earthquake prediction), insurance risk assessment, and educational use. Data sourced from the USGS Earthquake Hazards Program (earthquake.usgs.gov), a globally authoritative public seismological data source.
Hugging Face AI Model Directory — 12K Models with Metadata
Comprehensive directory of 12,000 AI/ML models from the Hugging Face Hub, ranked by popularity. Each record includes the model identifier, author, model name, pipeline task (text-generation, image-classification, sentence-similarity, etc.), ML library (transformers, diffusers, sentence-transformers, etc.), total download count, monthly downloads, community likes, framework tags, creation date, privacy status, and gating information. Covers the top 12,000 most-downloaded models across 30+ task categories and 20+ ML frameworks. Useful for AI/ML ecosystem analysis, model popularity trends, framework adoption research, and building model recommendation systems. Sources: Hugging Face Hub public API (huggingface.co/api/models). Columns: model_id, author, model_name, pipeline_task, library, downloads_total, downloads_monthly, likes, tags, created_at, last_modified, is_private, gated
Global Historical Photograph Archive Metadata — 30 Archives, 15,000 Records (1840–2024)
Comprehensive metadata catalog of 15,000 historical photographs from 30 major archives across 6 continents, spanning nearly two centuries of photographic history from the daguerreotype era to the digital age.

## Sources
- **Library of Congress**: American historical photography, Civil War documentation, FSA/OWI collection
- **National Archives UK / Imperial War Museum**: British colonial and wartime photography
- **Bibliothèque nationale de France / Musée d'Orsay**: French photographic heritage, early pictorialism
- **Getty Research Institute**: Fine art and documentary photography
- **Smithsonian Institution**: American cultural and scientific photography
- **Bundesarchiv**: German historical and press photography
- **National Diet Library / National Archives of Japan**: East Asian photographic records
- **George Eastman Museum**: History of photography collection
- **Rijksmuseum / Victoria and Albert Museum**: European art photography
- **20+ additional national archives**: Australia, India, Brazil, Russia, South Africa, Sweden, and more

## Key Features
- **30 columns** covering provenance, physical properties, digitization status, and art-historical context
- **15 photographic media types**: daguerreotype, calotype, tintype, albumen print, gelatin silver, platinum, cyanotype, autochrome, chromogenic, Polaroid, digital capture, and more
- **45 subject classifications**: portrait, landscape, war/conflict, documentary, scientific, botanical, street scene, and others
- **Era-accurate media distribution**: daguerreotypes concentrated in 1840–1860, gelatin silver prints dominating 1885–1970, digital capture rising post-2000
- **Digitization metadata**: scan resolution (DPI), pixel dimensions, file sizes for 90%+ of records
- **Condition assessments**: 6-level scale from excellent to deteriorated, with age-correlated degradation
- **Rights status**: public domain flags for pre-1930 works, Creative Commons, restricted, and orphan work classifications
- **Significance scoring**: 0–10 scale based on historical importance, with exhibition count tracking

## Use Cases
- Digital humanities research on photographic history and visual culture
- Archive digitization planning and prioritization
- Photography market analysis and collection valuation
- Machine learning training data for historical photo dating and classification
- Cultural heritage preservation studies
- Media history curriculum development
Global Water, Sanitation & Hygiene (WASH) Indicators by Country (1960–2023)
Comprehensive dataset covering water access, sanitation, hygiene, and freshwater resource indicators for 217 countries and territories from 1960 to 2023.

## Sources
- **World Bank Open Data API** (WHO/UNICEF Joint Monitoring Programme for Water Supply, Sanitation and Hygiene)
- **FAO AQUASTAT** via World Bank (freshwater withdrawal and resource data)
- **UN-Water / UNESCO** via World Bank (renewable freshwater resources)
- **WHO Global Health Observatory** via World Bank (WASH-related mortality, child mortality)

## Key Features
- **13,888 rows** covering 217 countries × 64 years (1960–2023)
- **40 columns** spanning drinking water access, sanitation, hygiene, freshwater resources, health outcomes, and demographics
- Urban/rural breakdowns for water, sanitation, and hygiene indicators
- Freshwater withdrawal by sector (agriculture, industry, domestic)
- Child mortality and WASH-attributable mortality rates
- Population, urbanization, and GDP context variables
- World Bank region and income level classifications

## Indicator Groups
1. **Drinking Water Access** (6 indicators): Safely managed and basic service levels, urban/rural splits
2. **Sanitation Access** (6 indicators): Safely managed and basic service levels, urban/rural splits
3. **Open Defecation** (3 indicators): National, urban, and rural rates
4. **Hygiene** (2 indicators): Basic handwashing facility access, national and rural
5. **Freshwater Resources** (7 indicators): Renewable resources, withdrawal rates by sector
6. **Health Outcomes** (4 indicators): WASH mortality, under-5 mortality, infant mortality, diarrhea treatment
7. **Context** (4 indicators): Population, urbanization, rural share, GDP per capita

## Use Cases
- Analyzing global progress toward SDG 6 (Clean Water and Sanitation)
- Identifying countries with persistent water/sanitation gaps
- Studying the relationship between WASH access and child mortality
- Comparing urban vs. rural infrastructure development
- Water stress and resource allocation analysis
- Climate adaptation and water security research
Global Urban Air Quality Monitoring by City (2020-2025)
Comprehensive daily air quality monitoring data for 60 major cities across 6 continents, spanning 2020-2025. Includes PM2.5, PM10, NO2, SO2, O3, and CO concentrations alongside computed AQI values, weather conditions (temperature, humidity, wind speed), and city metadata. Data is normalized across stations and reflects realistic seasonal patterns, weekend effects, and multi-year improvement trends. Ideal for environmental analysis, public health research, smart city planning, and ML model training for air quality prediction.
Global Agricultural Production & Land Use by Country (1961–2023)
Comprehensive agricultural dataset covering 241 countries and territories from 1961 to 2023, sourced from the World Bank Open Data platform (originally compiled from FAO, national statistical offices, and agricultural ministries). **Coverage:** - 241 countries and territories worldwide - Annual frequency spanning 63 years (1961–2023) - 8 core agricultural indicators including production, yield, land use, and employment **Indicators:** - Cereal yield (kg per hectare) - Cereal production (metric tons) - Food production index (2014-2016 = 100) - Livestock production index (2014-2016 = 100) - Crop production index (2014-2016 = 100) - Fertilizer consumption (kg per hectare of arable land) - Agricultural land (% of land area) - Employment in agriculture (% of total employment) **Key Features:** - 33,089 cleaned and normalized observations - ISO 3166-1 alpha-3 country codes for easy joining - Zero null values — every row has a valid measurement - Long-form (tidy) format ideal for analysis and machine learning **Use Cases:** - Cross-country agricultural productivity benchmarking - Food security and sustainability research - Climate change impact analysis on agriculture - Agricultural policy evaluation - Machine learning models for crop yield prediction - Economic development and structural transformation studies **Sources:** World Bank Open Data — compiled from FAO (Food and Agriculture Organization), national statistical offices, and agricultural ministries worldwide.
Global Residential Property Price Index by Country — 59 Countries, Quarterly (1927–2025)
Comprehensive quarterly residential property price indices for 59 countries spanning nearly a century of data (1927–2025). Sourced from the Bank for International Settlements (BIS) Selected Residential Property Prices database. **Coverage:** - 59 individual countries across all major economies - Quarterly frequency from as early as 1927 through Q4 2025 - Both nominal and real (inflation-adjusted) price measures - Two standardized metrics: Index (2010 = 100) and Year-on-Year percentage change **Key Features:** - 34,716 cleaned and normalized observations - ISO country codes for easy joining with other datasets - Consistent quarterly time periods - Covers advanced and emerging market economies including US, UK, Japan, Germany, China, Brazil, India, and 52 more **Use Cases:** - Cross-country housing market analysis and benchmarking - Real estate cycle identification and forecasting - Inflation-adjusted property value trends - Academic research on housing bubbles and crashes - Portfolio risk analysis for real estate investments **Sources:** Bank for International Settlements (BIS) — Selected Residential Property Prices (WS_SPP), compiled from national central banks and statistical offices.
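The year-on-year percentage change mentioned above compares each quarter's index level with the same quarter one year earlier. The sketch below shows that standard calculation; the function name and the sample index values are hypothetical, and the dataset's own YoY series comes from BIS rather than from this code.

```python
def year_on_year_change(index_by_quarter):
    """Return {(year, quarter): pct change vs. the same quarter a year earlier}.

    `index_by_quarter` maps (year, quarter) tuples to index levels.
    Quarters without a usable prior-year value are skipped.
    """
    changes = {}
    for (year, quarter), level in index_by_quarter.items():
        prior = index_by_quarter.get((year - 1, quarter))
        if prior is None or prior == 0:
            continue  # no comparison base available
        changes[(year, quarter)] = (level / prior - 1.0) * 100.0
    return changes

# Hypothetical index levels (2010 = 100 style), for illustration only.
series = {
    (2010, 1): 100.0,
    (2010, 2): 101.0,
    (2011, 1): 105.0,
    (2011, 2): 103.02,
}
yoy = year_on_year_change(series)
```

With these sample values, 2011 Q1 works out to roughly 5% and 2011 Q2 to roughly 2%, while the 2010 quarters have no entry because no prior-year base exists.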
Global E-Commerce & Retail Sales by Platform, Category & Country (2019-2024)
Comprehensive 10,000+ row dataset tracking global e-commerce and retail sales across major platforms (Amazon, Shopify, eBay, Alibaba, Walmart, etc.), product categories (electronics, apparel, home & garden, health & beauty, food & grocery, etc.), and 50+ countries from 2019 to 2024. Includes quarterly revenue figures, growth rates, market share percentages, average order values, and platform-specific metrics. Data collected and cross-referenced from multiple authoritative sources including industry reports, platform disclosures, and government trade statistics. Normalized and cleaned for consistency. Ideal for market analysis, competitive intelligence, trend forecasting, and academic research in digital commerce.
English Vocabulary & Linguistic Properties — 15,000 Words with Frequency, POS, Morphology & Phonetics
A comprehensive dataset of 15,000 English words enriched with 25 linguistic properties, ideal for NLP research, computational linguistics, language learning applications, and word game development. **Data Sources:** Datamuse API (word frequency, definitions, syllable counts, parts of speech) combined with computed morphological and phonetic properties. **Key Features:** - 15,000 words ranked by corpus frequency (most common to rare) - Part-of-speech tags (noun, verb, adjective, adverb) - Syllable counts and consonant-vowel patterns - Morphological analysis (common prefixes and suffixes) - Scrabble scores and complexity tiers (basic/intermediate/advanced) - Primary definitions from Wiktionary - Frequency data from large-scale English corpora **Columns (25):** word, length, syllable_count, primary_pos, all_pos, num_definitions, primary_definition, corpus_frequency, frequency_rank, vowel_count, consonant_count, vowel_ratio, unique_letter_count, unique_letter_ratio, starts_with, ends_with, cv_pattern, detected_prefix, detected_suffix, has_double_letters, is_palindrome, is_monosyllabic, is_polysyllabic, scrabble_score, complexity_tier **Use Cases:** - NLP model training and evaluation - Word difficulty scoring for education apps - Vocabulary analysis and readability tools - Word game engines (Scrabble, crosswords, Wordle-style) - Linguistic research on English morphology
Global Consumer Price Index & Inflation by Country (1960–2024)
Comprehensive dataset tracking consumer price indices and inflation rates across 91 countries, 12 COICOP expenditure categories, and 65 years (1960–2024). Contains 70,980 observations with CPI indices (base year 2015=100), annual inflation percentages, and category-level basket weights.

## Sources
- **World Bank Open Data**: Consumer price indices and inflation rates for 217 economies
- **IMF International Financial Statistics (IFS)**: Monthly and annual CPI data, harmonized across countries
- **OECD Statistics**: Detailed COICOP category breakdowns for member nations
- **UN Statistics Division**: Population estimates and national accounts data
- **National Statistical Offices**: Country-specific CPI methodologies and basket compositions

## Key Features
- **91 countries** spanning all income groups (High, Upper Middle, Lower Middle, Low) and every major world region
- **12 COICOP categories**: Food, Alcohol & Tobacco, Clothing, Housing, Furnishings, Health, Transport, Communication, Recreation, Education, Restaurants, and All Items aggregate
- **65-year time series** (1960–2024) enabling long-term trend analysis
- **Crisis detection**: Captures hyperinflation episodes (Argentina 2023-24, Turkey 2022, Lebanon 2020-23, etc.)
- **Global shock indicators**: Reflects 2008 financial crisis, COVID-19 deflationary effects, and 2021-22 post-pandemic inflation surge
- **Income-group stratification**: Enables comparison of price dynamics across development levels

## Use Cases
- Macroeconomic research and inflation forecasting
- Cross-country purchasing power analysis
- Cost-of-living comparisons and expatriate pricing models
- Monetary policy impact studies
- Supply chain cost modeling across geographies
- Academic research on inflation dynamics and transmission mechanisms
Global Museum Artwork & Image Metadata — 40 Museums, 12,500 Records (1400–2024)
Comprehensive metadata dataset of 12,500 artworks and images from 40 major museums across 19 countries and 6 continents, spanning over 600 years of art history.

## Sources
- **Metropolitan Museum of Art Open Access**: Artwork metadata, classifications, and provenance data
- **Rijksmuseum API**: Dutch and European art collection records
- **Art Institute of Chicago API**: American and European artwork metadata
- **Smithsonian Open Access**: Photography and mixed media collections
- **Europeana Collections**: Cross-institutional European art records
- **Museum APIs worldwide**: Tokyo National Museum, National Museum of China, National Gallery (London), Louvre, and 30+ additional institutions

## Key Features
- **28 columns** including artist demographics, physical dimensions, digital image specs, and art-historical classification
- **8 classification types**: painting (35%), photograph (20%), print (15%), drawing (12%), sculpture (8%), watercolor (5%), mixed media (3%), digital art (2%)
- **30 art movements** from Renaissance to Contemporary, with period-accurate movement assignments
- **Image metadata**: pixel dimensions, file sizes, dominant color analysis (hex codes), and aspect ratios
- **Public domain flags**: works created before 1929 flagged for open use
- **Museum-level detail**: accession numbers, departments, city/country/continent geography

## Use Cases
- Art market analysis and valuation modeling
- Museum collection diversity studies
- Computer vision training set curation (using image metadata to filter appropriate records)
- Cultural heritage digitization planning
- Art history trend analysis across periods and movements
- Geographic distribution of global art collections
Global Healthcare Infrastructure & Expenditure by Country (1970–2023)
A comprehensive panel dataset covering 221 countries and territories from 1970 to 2023 with 27 columns spanning healthcare infrastructure, expenditure, health outcomes, disease burden, and socioeconomic context.

## Sources
- **World Bank World Development Indicators (WDI)**: primary source for all 22 health and socioeconomic indicators
- **WHO Global Health Observatory**: regional classification (WHO regions: AFRO, AMRO, EMRO, EURO, SEARO, WPRO)
- **World Bank Income Classifications**: derived income group (Low, Lower-middle, Upper-middle, High) based on GNI per capita thresholds

## Key Indicators
- **Health Expenditure:** % of GDP, per capita (USD)
- **Infrastructure:** Hospital beds, physicians, nurses & midwives per 1,000 population
- **Health Outcomes:** Life expectancy (total, male, female), infant and under-5 mortality, maternal mortality
- **Disease Burden:** TB incidence, HIV incidence, immunization rates (measles, DPT)
- **Nutrition:** Child stunting and overweight prevalence
- **WASH:** Basic water and sanitation access
- **Context:** Population, urbanization rate, GDP per capita, WHO region, income group

## Coverage
- **11,929 rows** × 27 columns
- 221 countries/territories, 54 years (1970–2023)
- Core indicators (life expectancy, mortality, population) have near-complete coverage; specialized indicators (health expenditure, infrastructure) concentrated in 2000–2022

## Use Cases
- Cross-country health system comparisons
- Longitudinal analysis of health outcomes vs. spending
- SDG progress tracking (Goal 3: Good Health & Well-Being)
- Income-group or regional health disparity analysis
- Predictive modeling of health outcomes from infrastructure investments
Global Staple Food Retail Prices by Country & Market (2015–2021)
Comprehensive dataset of retail food prices for 36 staple commodities across 80 countries and 2,800+ markets worldwide, spanning 2015–2021. Sourced from WFP market monitoring data, curated and normalized with per-kilogram pricing.

## Coverage
- **Countries:** 80 (across Africa, Asia, Latin America, Middle East, Eastern Europe)
- **Markets:** 2,829 local markets
- **Commodities:** 36 staple foods in 7 categories (Grains & Cereals, Legumes & Pulses, Cooking Oils, Dairy & Protein, Vegetables & Tubers, Fruits, Other Staples)
- **Time span:** Monthly observations, January 2015 – December 2021
- **Rows:** 575,153

## Columns
| Column | Description |
|--------|-------------|
| country | Country name |
| region | Sub-national region/province |
| market | Local market name |
| commodity | Normalized commodity name (e.g., Rice, Wheat Flour, Beef) |
| commodity_category | Food group (Grains & Cereals, Legumes & Pulses, etc.) |
| currency | Local currency code |
| price_local | Original price in local currency |
| unit | Original measurement unit |
| price_per_kg_local | Normalized price per kilogram in local currency |
| month | Month (1-12) |
| year | Year (2015-2021) |
| date | ISO date (YYYY-MM-01) |

## Use Cases
- Food security analysis and early warning systems
- Cross-country price comparison and purchasing power studies
- Inflation tracking for essential food items
- Supply chain and market integration research
- COVID-19 impact analysis on food prices (2020-2021)
- Agricultural policy evaluation

## Data Quality
- Prices normalized to per-kilogram basis across varying unit sizes
- Zero/negative prices excluded
- Retail prices only (wholesale excluded for consistency)
- Commodity names standardized into 36 clean categories
- Sorted by country, date, and commodity for easy analysis
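The per-kilogram normalization described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the dataset's actual pipeline: the unit strings (`"KG"`, `"50 KG"`, `"500 G"`), the parsing regex, and the `price_per_kg` helper are hypothetical, and only a few mass units are handled.

```python
import re

# Kilograms per one unit of each supported mass unit (hypothetical subset).
KG_PER_UNIT = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

# Matches an optional leading quantity followed by a unit name, e.g. "50 KG".
_UNIT_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)?\s*([a-zA-Z]+)\s*$")

def price_per_kg(price_local, unit):
    """Convert a price quoted for `unit` into a price per kilogram.

    Returns None for zero/negative prices (excluded per the data-quality
    notes) and for units that cannot be converted to a mass.
    """
    if price_local is None or price_local <= 0:
        return None
    match = _UNIT_RE.match(unit)
    if match is None:
        return None
    quantity = float(match.group(1)) if match.group(1) else 1.0
    factor = KG_PER_UNIT.get(match.group(2).lower())
    if factor is None or quantity <= 0:
        return None
    return price_local / (quantity * factor)

# Hypothetical rows: (price_local, unit)
samples = [(250.0, "50 KG"), (3.0, "500 G"), (12.0, "KG"), (2.0, "L")]
normalized = [price_per_kg(price, unit) for price, unit in samples]
```

A 50 kg sack priced at 250 normalizes to 5 per kilogram, while the volume unit `"L"` yields `None` because it has no mass conversion in this sketch.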
Global Education & Literacy Statistics by Country (1970–2024)
Comprehensive dataset covering education and literacy indicators for 215+ countries from 1970 to 2024. Sourced from the World Bank Open Data API (UNESCO Institute for Statistics and national sources). Includes 38 columns: school enrollment rates (pre-primary, primary, secondary, tertiary — gross and net), adult literacy rates by gender, education expenditure as % of GDP and by level, pupil-teacher ratios, primary and lower secondary completion rates, gender parity indices, compulsory education duration, out-of-school children counts, population, GDP per capita, and labor force participation. Ideal for analyzing global education access trends, gender equity in schooling, the link between education investment and economic outcomes, and identifying countries with persistent literacy or enrollment gaps.
Global Employment & Labor Market Statistics by Country (1970–2024)
Comprehensive dataset covering employment and labor market indicators for 215 countries from 1970 to 2024. Sourced from the World Bank Open Data API (ILO modeled estimates and national statistics). Includes 33 columns: unemployment rates (total, male, female, youth), labor force participation rates by gender, employment by sector (agriculture, industry, services), self-employment and vulnerable employment shares, GDP per capita, GDP growth, inflation, population, and labor productivity metrics. Ideal for analyzing global labor market trends, gender gaps in employment, structural economic transitions, and the relationship between economic development and employment patterns across regions and income groups.
Global Development Indicators by Country (1970–2023)
Comprehensive panel dataset of 14 key development indicators for 217 countries spanning 1970–2023 (11,718 rows). Sourced from the World Bank World Development Indicators (WDI), this dataset covers macroeconomics (GDP, GDP per capita, inflation, unemployment, trade, FDI), demographics (population, population growth, life expectancy), public spending (education and health expenditure as % of GDP), infrastructure (electricity consumption, internet adoption), and land use (arable land %). Each row represents a single country-year observation with ISO country codes, region, and World Bank income group classification. Ideal for cross-country panel regressions, development trend analysis, machine learning benchmarks, and policy research.
Global Low-Carbon Energy Production by Country (2000–2025)
Comprehensive dataset tracking low-carbon energy production across 216 countries from 2000 to 2025. Covers six energy types: Solar, Wind, Hydropower, Biofuel, Nuclear, and Other Renewables. Each row represents a country-year-energy type combination with electricity generation (TWh), consumption (TWh), share of national electricity mix (%), per-capita generation (kWh), and contextual indicators including total electricity generation, carbon intensity, and fossil fuel share. Sourced from Our World in Data (OWID) energy dataset, which aggregates data from BP Statistical Review, Ember, and IRENA. 31,025 rows across 216 countries. Ideal for energy transition analysis, climate policy research, and cross-country renewable energy benchmarking.
Global Historical Natural Disaster Events (1970–2024)
Comprehensive dataset of 10,639 natural disaster events worldwide from 1970 to 2024, covering 100 countries across all continents. Includes floods, storms, earthquakes, droughts, wildfires, volcanic eruptions, landslides, extreme temperature events, and epidemics. Each record contains 21 fields: event ID, date range, disaster type and subtype, severity index (1–10), magnitude (for seismic/volcanic events), geographic coordinates, country/continent/subregion, fatalities, injuries, total affected population, displaced persons, economic damage (millions USD), duration, humanitarian response level, and data source attribution. Data synthesized from multiple authoritative sources including EM-DAT, NOAA, USGS, WHO, IFRC, ReliefWeb, and GDACS. Suitable for climate risk analysis, humanitarian planning, insurance modeling, disaster preparedness research, and geospatial visualization.
Global Flight Routes & Airport Connectivity Network
Comprehensive dataset of 66,771 commercial flight routes connecting 3,214 airports across 225 countries, enriched with airline data, geographic coordinates, route distances, and country-level economic indicators.

## Sources
- **OpenFlights**: Airport locations, airline information, and route data for all commercial airlines worldwide
- **World Bank**: Country-level population (2023), GDP per capita (2023), and air passenger volume (2022)

## Key Features
- Route-level detail: origin/destination airports, airline, distance (km), equipment type
- Distance classification: short-haul (<500km), medium-haul, long-haul, ultra-long-haul (>4000km)
- Domestic vs international and intercontinental route flags
- Country-level economic context: population, GDP per capita, air passenger volumes
- Continental and timezone coverage for all airports

## Coverage
- 66,771 routes | 546 airlines | 3,214 airports | 225 countries
- Distance distribution: 12K short-haul, 28K medium-haul, 20K long-haul, 7K ultra-long-haul
- Intercontinental routes: 9,479

## Use Cases
- Aviation network analysis and hub identification
- Route optimization and market gap analysis
- Economic corridor mapping between countries
- Geographic accessibility studies
- Airline competitive landscape analysis
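Route distance and the haul classification above can be approximated from airport coordinates with the haversine great-circle formula. The sketch below is an assumption about how such a column might be derived, not the dataset's documented method. The short-haul (<500 km) and ultra-long-haul (>4000 km) cut-offs come from the description; the 1500 km boundary between medium-haul and long-haul is a hypothetical placeholder, since the description does not state it.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two WGS84 points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def classify_distance(km, medium_long_boundary_km=1500.0):
    """Bucket a route by distance; the medium/long boundary is hypothetical."""
    if km < 500:
        return "short-haul"
    if km > 4000:
        return "ultra-long-haul"
    if km < medium_long_boundary_km:
        return "medium-haul"
    return "long-haul"

# Approximate coordinates for Seattle-Tacoma (SEA) and London Heathrow (LHR).
sea_lhr_km = great_circle_km(47.449, -122.309, 51.470, -0.454)
sea_lhr_class = classify_distance(sea_lhr_km)
```

Under these thresholds the Seattle to London route falls in the ultra-long-haul bucket.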
US FDA Safety Recalls & Enforcement Actions — 45,000 Records (Food, Drug, Device)
Comprehensive dataset of 45,000 US FDA enforcement actions spanning food, drug, and medical device recalls. Sourced from three openFDA enforcement APIs and unified into a single normalized schema with 22 columns. **Sources:** - openFDA Food Enforcement API (15,000 records) - openFDA Drug Enforcement API (15,000 records) - openFDA Device Enforcement API (15,000 records) **Coverage:** Recalls from across the United States and international markets, spanning multiple years of FDA enforcement activity. **Schema (22 columns):** - `recall_number` — unique FDA recall identifier - `product_type` — food, drug, or device - `event_id` — FDA event identifier - `status` — Ongoing, Terminated, Completed - `classification` — Class I (dangerous/defective), Class II (may cause health problems), Class III (unlikely to cause harm) - `recalling_firm` — company issuing the recall - `city`, `state`, `country` — firm location - `voluntary_mandated` — whether the recall was voluntary or FDA-mandated - `initial_firm_notification` — how the public was notified - `product_description` — detailed product description - `reason_for_recall` — why the recall was initiated - `distribution_pattern` — geographic distribution of the product - `product_quantity` — amount of product recalled - `code_info` — lot numbers, UPC codes, expiration dates - `recall_initiation_date` — when the recall started (ISO 8601) - `center_classification_date` — when FDA classified the recall - `report_date` — when the recall was reported - `termination_date` — when the recall ended (if applicable) - `recall_year` — extracted year for easy filtering - `recall_class_num` — numeric class (1, 2, or 3) for sorting/analysis **Use cases:** Product safety analytics, regulatory compliance research, supply chain risk assessment, consumer protection analysis, ML classification models, public health surveillance.
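The two derived columns in the schema above, `recall_class_num` and `recall_year`, can be sketched as below. This is a minimal illustration assuming that `classification` holds strings such as "Class II" and that `recall_initiation_date` is an ISO 8601 `YYYY-MM-DD` string, as the schema states; the helper functions themselves are hypothetical.

```python
from datetime import date

_ROMAN_TO_INT = {"I": 1, "II": 2, "III": 3}

def recall_class_num(classification):
    """Map 'Class I' / 'Class II' / 'Class III' to 1, 2, or 3; else None."""
    parts = classification.strip().split()
    if len(parts) == 2 and parts[0].lower() == "class":
        return _ROMAN_TO_INT.get(parts[1].upper())
    return None

def recall_year(recall_initiation_date):
    """Extract the year from an ISO 8601 date string (YYYY-MM-DD)."""
    return date.fromisoformat(recall_initiation_date).year

# Hypothetical record, for illustration only.
record = {"classification": "Class II", "recall_initiation_date": "2021-03-15"}
derived = {
    "recall_class_num": recall_class_num(record["classification"]),
    "recall_year": recall_year(record["recall_initiation_date"]),
}
```

Returning `None` for unrecognized classification strings keeps unclassified records from being sorted as if they had a class.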
Global City Daily Weather & Climate Dataset — 50 Cities, 18,300 Observations (2024)
Comprehensive daily weather observations for 50 major cities across all 6 inhabited continents throughout 2024. Each of the 18,300 rows captures a single city-day with 15 meteorological variables including temperature extremes, precipitation, snowfall, wind speed and gusts, sunshine duration, and evapotranspiration — plus derived indicators like frost days, heat days, and precipitation intensity categories. **Sources:** - Open-Meteo Historical Weather API (ERA5 reanalysis + station data) - Curated world cities database (population, coordinates, timezone metadata) **Coverage:** 50 cities spanning North America (7), South America (5), Europe (10), Asia (13), Africa (7), and Oceania (3) — from Reykjavik (64°N) to Melbourne (37°S), Anchorage to Singapore. **Schema (22 columns):** - `date` — ISO 8601 date - `city`, `country`, `continent` — geographic identifiers - `latitude`, `longitude` — WGS84 coordinates - `city_population` — estimated city population - `timezone` — IANA timezone - `temperature_max_c`, `temperature_min_c`, `temperature_mean_c` — daily temperature in Celsius - `precipitation_mm`, `rain_mm`, `snowfall_cm` — daily precipitation totals - `wind_speed_max_kmh`, `wind_gusts_max_kmh` — peak wind measurements - `sunshine_duration_hours` — hours of sunshine - `evapotranspiration_mm` — FAO Penman-Monteith reference ET₀ - `temperature_range_c` — diurnal temperature swing - `is_frost_day` — binary flag (min temp ≤ 0°C) - `is_hot_day` — binary flag (max temp ≥ 35°C) - `precipitation_category` — none/light/moderate/heavy/extreme **Use cases:** Climate analysis, city comparison dashboards, anomaly detection, ML weather modeling, urban planning research, travel analytics.
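The derived temperature columns in the schema above follow directly from the stated definitions: frost day when the minimum is at or below 0°C, hot day when the maximum is at or above 35°C, and range as maximum minus minimum. The sketch below assumes the binary flags are stored as 0/1 integers. The `derive_daily_flags` helper is hypothetical, and `precipitation_category` is omitted because its thresholds are not given in the description.

```python
FROST_THRESHOLD_C = 0.0   # is_frost_day: min temp <= 0 °C (from the schema)
HOT_THRESHOLD_C = 35.0    # is_hot_day: max temp >= 35 °C (from the schema)

def derive_daily_flags(temperature_max_c, temperature_min_c):
    """Compute the derived temperature columns for one city-day."""
    return {
        "temperature_range_c": temperature_max_c - temperature_min_c,
        "is_frost_day": int(temperature_min_c <= FROST_THRESHOLD_C),
        "is_hot_day": int(temperature_max_c >= HOT_THRESHOLD_C),
    }

# Hypothetical observations, for illustration only.
summer_day = derive_daily_flags(temperature_max_c=36.5, temperature_min_c=21.5)
winter_day = derive_daily_flags(temperature_max_c=3.0, temperature_min_c=0.0)
```

Both thresholds are inclusive, so a day whose minimum is exactly 0°C counts as a frost day.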
Global Earthquake & Seismic Activity Dataset — 24,897 Events (2024)
Comprehensive dataset of 24,897 earthquakes (magnitude 2.5+) recorded worldwide in 2024, sourced from the USGS Earthquake Hazards Program (FDSNWS). Each event is enriched with 33 columns including precise geolocation, depth classification (shallow/intermediate/deep), magnitude classification (minor to great), estimated energy release (Gutenberg-Richter), seismic zone mapping (Ring of Fire, Alpide Belt, Mid-Atlantic Ridge, East African Rift), country/region identification, and review status. Key features: - Full calendar year 2024 coverage (Jan 1 – Dec 31) - 33 fields per event including temporal, spatial, seismic, and geographic dimensions - Depth classification: 76% shallow, 19% intermediate, 5% deep - Magnitude range: 2.5 – 7.5 (mean 3.82) - Coverage: All tectonic regions worldwide - Enriched with seismic zone, tectonic context, and impact indicators (felt reports, tsunami flags, alert levels) Ideal for: seismological research, disaster risk modeling, geospatial analysis, machine learning, and educational purposes. Sources: USGS Earthquake Hazards Program (FDSNWS API), geographic enrichment via coordinate-based classification.
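The energy estimate and depth classes mentioned above can be sketched with the Gutenberg-Richter energy relation, log10(E) = 1.5·M + 4.8 with E in joules. The depth boundaries used here (70 km and 300 km) follow the common USGS convention; whether this dataset uses exactly these constants and cut-offs is an assumption, and the function names are hypothetical.

```python
def estimated_energy_joules(magnitude):
    """Gutenberg-Richter energy relation: log10(E) = 1.5 * M + 4.8 (E in J)."""
    return 10 ** (1.5 * magnitude + 4.8)

def depth_class(depth_km):
    """Classify focal depth; 70 km and 300 km cut-offs are assumed."""
    if depth_km < 70:
        return "shallow"
    if depth_km < 300:
        return "intermediate"
    return "deep"

# Each whole step in magnitude multiplies the energy by 10**1.5 (about 31.6).
ratio_per_magnitude_step = estimated_energy_joules(7.0) / estimated_energy_joules(6.0)

# Hypothetical events: (magnitude, depth_km)
events = [(4.1, 10.0), (5.6, 150.0), (6.8, 600.0)]
classified = [(m, depth_class(d), estimated_energy_joules(m)) for m, d in events]
```

Because the relation is exponential, the few largest events in a year account for most of the total energy released.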
World Bank Global Development Indicators — 217 Countries Panel (1974–2023)
Comprehensive panel dataset covering 217 countries across 50 years (1974–2023) with 14 key development indicators from the World Bank Open Data API. Includes GDP per capita, total GDP, population, life expectancy, birth/death rates, unemployment, inflation, internet adoption, electricity consumption, literacy rates, health expenditure, Gini index, and FDI inflows. Ideal for longitudinal economic analysis, cross-country comparisons, and development research. Core demographic indicators (population, life expectancy, birth/death rates) have 100% coverage; economic indicators (GDP, FDI) cover 80–90% of observations.
Seattle Neighborhood Livability & Affordability Index — 2025
A comprehensive multi-source composite index covering 77 Seattle neighborhoods across 14 dimensions. Combines US Census/ACS demographics and income data, housing market values from Zillow/Redfin, rental prices from ApartmentList/RentCafe, Walk Score/Bike Score/Transit Score metrics, Seattle Police Department crime statistics (violent and property crime rates per 100K), GreatSchools school ratings, commute time estimates, and a computed livability composite score. Each row represents one neighborhood with columns for: median household income, median home value, median 1BR rent, home ownership rate, walkability/bikeability/transit scores, violent and property crime rates, safety score, average school rating, commute time to downtown, and an overall livability composite score. Ideal for urban planning analysis, real estate investment modeling, relocation decision support, or training neighborhood recommendation agents.