JHU CSSE COVID-19 Dataset

Daily reports (csse_covid_19_daily_reports)

This folder contains daily case reports. All timestamps are in UTC (GMT+0).

File naming convention

MM-DD-YYYY.csv in UTC.
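
As an illustration of the naming convention, the sketch below builds and parses a report file name from a UTC date. The function names are hypothetical and are not part of the repository.

```python
from datetime import date, datetime, timezone


def daily_report_filename(day: date) -> str:
    """Return the report file name (MM-DD-YYYY.csv) for a given UTC date."""
    return day.strftime("%m-%d-%Y") + ".csv"


def report_date_from_filename(name: str) -> date:
    """Parse a MM-DD-YYYY.csv file name back into a date."""
    return datetime.strptime(name, "%m-%d-%Y.csv").date()


print(daily_report_filename(date(2020, 3, 5)))  # 03-05-2020.csv

# Use the current date in UTC, not local time, when looking for the latest file.
today_utc = datetime.now(timezone.utc).date()
print(daily_report_filename(today_utc))
```

Taking the date in UTC matters near midnight, when a local date can differ from the UTC date used in the file name.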

Field description

Update frequency

Data sources

Refer to the main page.

Why create this new folder?

  1. Unifying all timestamps to UTC, including the file name and the "Last Update" field.
  2. Pushing only one file every day.
  3. Archiving all historic data in archived_data.

USA daily state reports (csse_covid_19_daily_reports_us)

This folder contains data aggregated at the US state level.

File naming convention

MM-DD-YYYY.csv in UTC.

Field description

Update frequency

Data sources

Refer to the main page.


Time series summary (csse_covid_19_time_series)

See here.


Data modification records

This section records any modifications to our datasets, along with the reason for each change. If an error results from an issue in our collection of the data, it will be listed in errata.csv in the csse_covid_19_time_series folder. If an error results from a change at the source, the change and the reasoning will be listed below.

Generalized format: Date: Location | Change | Files affected | Reason/Other notes | Source
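
For readers who want to process these records programmatically, here is a minimal parsing sketch for the generalized format above. It assumes each record sits on one line, that the date is separated from the rest by the first ": ", and that no field contains a "|" character. The class name, function name, and sample line are made up for illustration.

```python
from typing import NamedTuple


class ModificationRecord(NamedTuple):
    date: str
    location: str
    change: str
    files_affected: str
    notes: str
    source: str


def parse_record(line: str) -> ModificationRecord:
    """Split 'Date: Location | Change | Files affected | Notes | Source'."""
    # Everything before the first ": " is treated as the date.
    date_part, _, rest = line.partition(": ")
    fields = [field.strip() for field in rest.split("|")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 pipe-separated fields, got {len(fields)}")
    return ModificationRecord(date_part.strip(), *fields)


# Placeholder line, not a real record from the dataset.
sample = "January 1: Exampleland | Placeholder change | example.csv | Placeholder reason | Placeholder source"
record = parse_record(sample)
print(record.location, "->", record.files_affected)
```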

Retrospective reporting of (probable) cases and deaths

This section reports instances where large numbers of historical cases or deaths were reported on a single day. These reports cause anomalous spikes in our time series curves. Where possible, we liaise with the appropriate health department and distribute the cases or deaths back over the time series. Successful redistributions are reported in the section below titled "Large-scale back distributions". A large proportion of these spikes are due to the release of probable cases or deaths.

Generalized Format: Date: Location | Change | Reason/Other notes | Source

Large-scale back distributions

This section notifies developers when we have successfully back-distributed any of the large instances of retrospective reporting.

Generalized format: Date: Location | File | Change | Data source for change

Irregular Update Schedules

As the pandemic has progressed, several locations have altered their reporting schedules to no longer provide daily updates. As these locations are identified, we will list them in this section of the README. We anticipate that these irregular updates will cause cyclical spikes in the data and smoothing algorithms should be applied if the data is to be used for modeling.
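
One simple smoothing option is a trailing 7-day mean over daily new counts. The sketch below is only an example of such an algorithm, not a method prescribed by this repository. It assumes the input is a cumulative series, and the function names are hypothetical.

```python
def daily_increments(cumulative):
    """Convert a cumulative series into day-over-day new counts."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def trailing_mean(values, window=7):
    """Average each value with up to (window - 1) preceding values."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# A location that reports once a week shows one spike every seventh day.
weekly_reporting = [0, 0, 0, 0, 0, 0, 70] * 2
print(trailing_mean(weekly_reporting)[6:])  # every value is 10.0
```

A 7-day window suits weekly reporting because every full window then contains exactly one reporting day.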

United States

International

For all international locations, we composite national reporting with that of the World Health Organization, so we may update more frequently than the national sources.

Upcoming Irregular Update Schedules

United Kingdom (England, Scotland, Wales, Northern Ireland): Providing data once weekly (Wednesdays) as of July 1, 2022.


UID Lookup Table Logic

  1. All countries without dependencies (entries with only Admin0).
    • Non-cruise-ship Admin0: UID = code3. (e.g., Afghanistan, UID = code3 = 4)
    • Cruise ships in Admin0: Diamond Princess UID = 9999, MS Zaandam UID = 8888.
  2. All countries with only state-level dependencies (entries with Admin0 and Admin1).
    • Denmark, France, Netherlands: mother countries and their dependencies have different code3 values, therefore UID = code3. (e.g., Faroe Islands, Denmark, UID = code3 = 234; Denmark UID = 208)
    • United Kingdom: the mother country and its dependencies have different code3 values, therefore UID = code3. One exception: Channel Islands uses the same code3 as the mother country (826), so it has an artificial UID of 8261.
    • Australia: alphabetically ordered all states, and their UIDs are from 3601 to 3608. Australia itself is 36.
    • Canada: alphabetically ordered all provinces (including cruise ships and recovered entry), and their UIDs are from 12401 to 12415. Canada itself is 124.
    • China: alphabetically ordered all provinces, and their UIDs are from 15601 to 15631. China itself is 156. Hong Kong, Macau and Taiwan have their own code3.
    • Germany: alphabetically ordered all admin1 regions (including Unknown), and their UIDs are from 27601 to 27617. Germany itself is 276.
    • Italy: UIDs are combined country code (380) with codice_regione, which is from Dati COVID-19 Italia. Exceptions: P.A. Bolzano is 38041 and P.A. Trento is 38042.
  3. The US (most entries with Admin0, Admin1 and Admin2).
    • US by itself is 840 (UID = code3).
    • US dependencies, American Samoa, Guam, Northern Mariana Islands, Virgin Islands and Puerto Rico, UID = code3. Their Admin0 FIPS codes are different from code3.
    • US states: UID = 840 (country code3) + 000XX (state FIPS code). Ranging from 84000001 to 84000056.
    • Out of [State], US: UID = 840 (country code3) + 800XX (state FIPS code). Ranging from 84080001 to 84080056.
    • Unassigned, US: UID = 840 (country code3) + 900XX (state FIPS code). Ranging from 84090001 to 84090056.
    • US counties: UID = 840 (country code3) + XXXXX (5-digit FIPS code).
    • Exception type 1, such as recovered and Kansas City, ranging from 8407001 to 8407999.
    • Exception type 2, combined counties. Bristol Bay plus Lake and Peninsula replaces Bristol Bay and its FIPS code; its population is 836 (Bristol Bay) + 1,592 (Lake and Peninsula) = 2,428. Yakutat plus Hoonah-Angoon has UID 84002282, the same as Yakutat; its population is 2,148 (Hoonah-Angoon) + 579 (Yakutat) = 2,727. New York City replaces New York County and its FIPS code; its population is calculated as Bronx (1,418,207) + Kings (2,559,903) + New York (1,628,706) + Queens (2,253,858) + Richmond (476,143) = NYC (8,336,817). (updated on Aug 31)
    • Exception type 3, Diamond Princess, US: 84088888; Grand Princess, US: 84099999.
    • Exception type 4, municipalities in Puerto Rico are regarded as counties with FIPS codes. The FIPS code for the unassigned category is defined as 72999.
  4. Population data sources.
    • United Nations, Department of Economic and Social Affairs, Population Division (2019). World Population Prospects 2019, Online Edition. Rev. 1. https://population.un.org/wpp/Download/Standard/Population/
    • Eurostat: https://ec.europa.eu/eurostat/web/products-datasets/product?code=tgs00096
    • The U.S. Census Bureau: https://www.census.gov/data/datasets/time-series/demo/popest/2010s-counties-total.html
    • Mexico population 2020 projection: Proyecciones de población/demanda/poblacion_proyecciones.aspx?AspxAutoDetectCookieSupport=1)
    • Brazil 2019 projection: ftp://ftp.ibge.gov.br/Estimativas_de_Populacao/Estimativas_2019/
    • Peru 2020 projection: https://www.citypopulation.de/en/peru/cities/
    • India 2019 population: http://statisticstimes.com/demographics/population-of-indian-states.php
    • Belgium (Population on 1st January 2020): https://statbel.fgov.be/en/themes/population/structure-population
    • Canada (Q3 2021): https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1710000901
    • China (mainland) (May 11, 2021): http://www.stats.gov.cn/english/PressRelease/202105/t20210510_1817188.html
    • Denmark (mainland) (2021Q3): https://www.dst.dk/en/Statistik/emner/borgere/befolkning
    • France (European) (January 1st 2021): https://www.ined.fr/en/everything_about_population/data/france/population-structure/regions_departments/
    • Germany (as of 31.12.2020): https://www.destatis.de/EN/Themes/Society-Environment/Population/Current-Population/Tables/population-by-laender.html
    • The Admin0 level population could be different from the sum of Admin1 level population since they may be from different sources.
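
The US rules in item 3 above amount to writing the country code3 (840) in front of a five-digit suffix. The sketch below shows that composition. The function names are hypothetical, and the exception types (cruise ships, "recovered" entries, combined counties) are not handled.

```python
US_CODE3 = 840  # country code3 for the US, as stated in item 3


def _us_uid(suffix: int) -> int:
    """Join the country code3 with a zero-padded five-digit suffix."""
    return int(f"{US_CODE3}{suffix:05d}")


def us_state_uid(state_fips: int) -> int:
    return _us_uid(state_fips)          # 840 + 000XX


def us_out_of_state_uid(state_fips: int) -> int:
    return _us_uid(80000 + state_fips)  # 840 + 800XX


def us_unassigned_uid(state_fips: int) -> int:
    return _us_uid(90000 + state_fips)  # 840 + 900XX


def us_county_uid(county_fips: int) -> int:
    return _us_uid(county_fips)         # 840 + 5-digit county FIPS


print(us_state_uid(56))     # 84000056
print(us_county_uid(2282))  # 84002282 (Yakutat, FIPS 02282)
```

Zero-padding the suffix keeps every composed UID at eight digits, which is why a county FIPS code with a leading zero, such as 02282, still yields 84002282.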

Disclaimer: The names of locations included on the Website correspond with the official designations used by the U.S. Department of State. The presentation of material therein does not imply the expression of any opinion whatsoever on the part of JHU concerning the legal status of any country, area or territory or of its authorities. The depiction and use of boundaries, geographic names and related data shown on maps and included in lists, tables, documents, and databases on this website are not warranted to be error free nor do they necessarily imply official endorsement or acceptance by JHU.