Data Science in the Real World

How to create an hourly updating Covid-19 dataset in under 5 minutes

Scraper.AI
4 min read · Jun 25, 2020

Dataset creation using no-code tools

There are so many projects you could create from having all Covid-19 data available.

For example, I could create a website that keeps a history of infections, deaths and recoveries all over the world, with a fancy graph and a lot of extra information to show visitors.

Any such website needs one common part, one that’s crucial for every project I want to get started with… data. Yes, data is the new gold and it’s all around us. It’s up to us to display it well.

In order to get access to data, we could use a tool. There are a lot out there, but in this example we’ll use https://scraper.ai

Creating the Dataset

For any project, creating a reliable dataset is the most time-consuming part: not only do we have to get the data, we also have to maintain it. In the case of Covid, statistics are updated daily. Because countries sit in different time zones and don’t all publish their statistics at the same time, it’s worth scraping at least every hour! Luckily, scraper.ai takes care of all this, so we only have to worry about selecting some fields.

There are many websites that publish corona statistics; we could take several governmental ones and aggregate them. But to keep it simple, we’ll use https://www.worldometers.info/coronavirus/, which provides an overview of corona activity worldwide.

The scraping

To get started, head over to the website and open the extension. Next, click “Select Element” to start selecting what we’re interested in.

Give each column a name, and keep going until you have highlighted all the data points you want.

In our case this will be Country, Total Cases, New Cases, Total Deaths, New Deaths, Total Recovered, Active Cases, Serious, Critical, Total Tests and Population.

Your view should look something like the image below, a colorful table of selections :) Proceed by clicking next.

On the next screen, we want to make sure to set the interval to “Hourly” to have statistics rolling in each hour! Otherwise we might miss out on updates.

After clicking “Finish”, the data will be scraped and made available through the scraper.ai website. Congratulations, you’ve now created your dataset!

When you head to “Overview”, you can see a section called “API URL”. This is another way of working with the data: when you open that link, you’ll be able to download it as a .json file.
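If you would rather pull that file from a script than from the browser, a minimal sketch in Python could look like the one below. Note the assumptions: the API_URL value is a hypothetical placeholder (paste in the link from your own “Overview” page), the helper name fetch_dataset is made up for this example, and we assume the link returns UTF-8 encoded JSON without any extra login step.

```python
import json
import urllib.request

# Hypothetical placeholder: paste the "API URL" from your own Overview page here.
API_URL = "https://example.com/your-api-url"


def fetch_dataset(url: str, timeout: float = 30.0):
    """Download the JSON document behind `url` and return it as Python objects."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
    # Assumption: the endpoint answers with UTF-8 encoded JSON.
    return json.loads(raw.decode("utf-8"))
```

Calling fetch_dataset(API_URL) with your real link should then give you the scraped data as ordinary Python lists and dictionaries, ready for whatever project you have in mind.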

To visualize it in, for example, Excel, we can convert this file into CSV or Excel format using free tools like https://www.convertcsv.com/json-to-csv.htm
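The same conversion can also be done locally with a few lines of Python. This is only a sketch under one big assumption: that the downloaded .json file is a list of flat records, one per country, keyed by the column names we chose earlier. The function name records_to_csv and the sample values are invented for illustration and are not real statistics.

```python
import csv
import io
import json


def records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of objects")

    # Collect column names in first-seen order, since rows may not share all keys.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval="" leaves a cell empty when a record lacks that column.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample in the assumed shape, not real Covid-19 figures.
sample = (
    '[{"Country": "A", "Total Cases": "10"},'
    ' {"Country": "B", "Total Cases": "20", "New Cases": "+1"}]'
)
csv_text = records_to_csv(sample)
```

Writing csv_text to a file ending in .csv gives you something Excel can open directly. If your file turns out to be nested rather than a flat list, the records would need to be flattened first.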

Summary

We’ve shown how to easily create a dataset that is updated hourly. Creating datasets with a tool like https://scraper.ai is simpler and more cost-efficient than, for example, programming and hosting a scraper yourself.

In the next part we’ll show how you can use this data in Tableau or Power BI to get the most out of it.

What would you like to see next? Let us know in the comments below.

Hey 👋, my name is Maxim from https://scraper.ai.

