COVID-19 dataset clearinghouse: Difference between revisions

Revision as of 12:42, 27 March 2020

This is a repository for public data sets relating to the COVID-19 pandemic. It was also initially envisioned as a clearinghouse for matching requests for data cleaning of such datasets with volunteers willing to perform this cleaning. However, the existing clearinghouse at United Against COVID-19 is already up and running for this purpose, so we are redirecting such requests to that site to avoid fragmenting the pools of requests and volunteers.

For discussion of this project, see this blog post.

Data sets

Literature

LitCovid - a curated literature hub for tracking up-to-date scientific information about the 2019 novel Coronavirus
COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv

Medical imagery

COVID-19 Detection X-Ray Dataset, Kaggle
COVID-19: casistica radiologica Italiana, Società Italiana di Radiologia Medica e Interventistica

Other data

Aggregated foot traffic data, Safegraph
- Access requires executing a non-commercial agreement.
- Sample visualization of Safegraph data
COVID Care Map
- Open geospatial work to support health systems' capacity (providers, supplies, ventilators, beds, meds) to effectively care for rapidly growing COVID19 patient needs
- Open map data on US health system capacity to care for COVID-19 patients
Covid-19 Twitter chatter dataset for scientific use, Panacea Lab, Georgia State University

Data scrapers and aggregators

Visualizations and summaries

COVID-19 Coronavirus Pandemic, Worldometer
Tracking coronavirus: Map, data and timeline, BNO News
Coronavirus COVID-19 Global Cases, JHU CSSE
Infection2020
covy.app
COVID-19 Global Pandemic Real-Time report, dxy.cn (English version)
Coronavirus tracked: the latest figures as the pandemic spreads, Financial Times
COVID-19 - official Indian government site
COVID-19 - Analysis, Visualization & Comparisons, Kaggle

Other lists

COVID-19 data sets, Kaggle
Reddit thread collecting coronavirus datasets
Review of COVID-19 APIs, Wendell Santos
NPGEO Corona Hub 2020, Nationale Plattform für geografische Daten (NPGEO)
Data sets for COVID, Wolfram Data Repository
COVID-19 Data Hub, Tableau

Data or Data cleaning requests

As mentioned at the top of this page, future requests for data or data cleaning should be directed to this data discourse page at United Against COVID-19. Below are the legacy requests of this project prior to this redirect.

From Chris Strohmeier (UCLA), Mar 25

The biorxiv_medrxiv folder at https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge contains another folder, also titled biorxiv_medrxiv, which in turn contains hundreds of JSON files. Each file corresponds to a research article at least tangentially related to COVID-19.

We are requesting:

A tf-idf matrix for the subset of the above collection that consists of full-text articles (some files appear to contain only abstracts).
The rows should correspond to the most commonly used words (e.g. the top 5000).
The columns should correspond to the individual JSON files.
The clean data should be stored as an npy or mat file (or both).
Finally, there should be a csv or text document (or both) explaining the meaning of the rows and columns of the matrix (which word each row corresponds to, and which file each column corresponds to).
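The request above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration and not an agreed implementation: it assumes each JSON file stores its full text in a "body_text" list of {"text": ...} entries, it uses the plain tf × log(N/df) weighting (one of several common tf-idf variants), and the function names and the output file names rows.csv and columns.csv are hypothetical.

```python
import csv
import json
import math
import re
from collections import Counter
from pathlib import Path

TOKEN = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN.findall(text.lower())


def load_full_text(folder):
    """Return {file name: token list} for files that have body text.

    Assumption: each file has a "body_text" list of {"text": ...} entries.
    Files with no body text (abstract only) are skipped.
    """
    docs = {}
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            paper = json.load(fh)
        body = " ".join(p.get("text", "") for p in paper.get("body_text", []))
        tokens = tokenize(body)
        if tokens:
            docs[path.name] = tokens
    return docs


def tfidf_matrix(docs, vocab_size=5000):
    """Build a words-by-documents tf-idf matrix as a list of rows."""
    names = list(docs)
    counts = [Counter(docs[name]) for name in names]
    lengths = [sum(c.values()) for c in counts]
    totals = Counter()
    for c in counts:
        totals.update(c)
    # Rows: the vocab_size most frequent words across the whole collection.
    vocab = [word for word, _ in totals.most_common(vocab_size)]
    matrix = []
    for word in vocab:
        df = sum(1 for c in counts if word in c)  # document frequency
        idf = math.log(len(names) / df)
        matrix.append([c[word] / n * idf for c, n in zip(counts, lengths)])
    return vocab, names, matrix


def write_labels(vocab, names, out_dir):
    """Write rows.csv (row index, word) and columns.csv (column index, file)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for filename, header, labels in (
        ("rows.csv", ["row", "word"], vocab),
        ("columns.csv", ["column", "file"], names),
    ):
        with open(out / filename, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(enumerate(labels))
```

A typical run would call load_full_text on the inner biorxiv_medrxiv folder, pass the result to tfidf_matrix, and then call write_labels. The returned list of rows could be converted to an array and saved in npy or mat format with NumPy or SciPy, which this sketch deliberately leaves out to stay dependency-free.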

Contact: c.strohmeier@math.ucla.edu

From Juan José Piñero de Armas (U. Católica de Murcia), Mar 27

We request per-person (individual-level) data in order to perform survival analyses, regressions with random effects, etc. Some data exists, for instance, at

https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset/data
https://www.kaggle.com/kimjihoo/coronavirusdataset
https://www.kaggle.com/imdevskp/covid-19-analysis-visualization-comparisons/data
https://www.sirm.org/category/senza-categoria/covid-19/

but we need much more detail (the date each person was diagnosed, the date of infection for the same person, discharge date, date of death, gender, age, treatments, temperatures...), not just summaries or country-aggregated data.
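To illustrate why per-person dates matter for survival analysis, here is a minimal sketch of a Kaplan-Meier estimate computed from individual records. The PatientRecord fields and function names are hypothetical and do not describe any existing dataset. Treating discharge as censoring is a simplifying assumption; a real analysis might model it as a competing risk.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical per-person record; field names are illustrative only."""
    diagnosis_date: date
    death_date: Optional[date] = None
    discharge_date: Optional[date] = None
    age: Optional[int] = None
    gender: Optional[str] = None


def duration_and_event(rec, study_end):
    """Return (days since diagnosis, died?) for one patient.

    Patients who were discharged, or who are still in care at study_end,
    are treated as censored (event = False).
    """
    if rec.death_date is not None:
        return (rec.death_date - rec.diagnosis_date).days, True
    end = rec.discharge_date or study_end
    return (end - rec.diagnosis_date).days, False


def kaplan_meier(observations):
    """Kaplan-Meier survival curve from a list of (duration, event) pairs.

    Returns [(time, estimated survival probability)] at each death time.
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted({d for d, _ in observations}):
        deaths = sum(1 for d, e in observations if d == t and e)
        leaving = sum(1 for d, _ in observations if d == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed up to time t (dead or censored) leaves the risk set.
        at_risk -= leaving
    return curve
```

Country-level totals cannot feed an estimator like this, because it needs one duration and one outcome per person. That is why the request asks for individual diagnosis, discharge and death dates.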

Contact: jjpinero@ucam.edu

@@ Line 108: / Line 108: @@
 * [https://www.programmableweb.com/news/apis-to-track-coronavirus-covid-19/review/2020/03/18 Review of COVID-19 APIs], Wendell Santos
 * [https://npgeo-corona-npgeo-de.hub.arcgis.com/ NPGEO Corona Hub 2020], Nationale Plattform für geografische Daten (NPGEO)
+* [https://datarepository.wolframcloud.com/search?i=COVID Data sets for COVID], Wolfram Data Repository
+* [https://www.tableau.com/covid-19-coronavirus-data-resources COVID-19 Data Hub], Tableau
 == Data or Data cleaning requests ==

Contents

Data sets

Epidemiology

North America

Europe

Asia

Other regional data

Genomics and homology

Literature

Medical imagery

Other data

Data scrapers and aggregators

Visualizations and summaries

Other lists

Data or Data cleaning requests

From Chris Strohmeier (UCLA), Mar 25

From Juan José Piñero de Armas (U. Católica de Murcia), Mar 27
