A quick guide on web scraping: Why and how

Web scraping, the collection and cleaning of online data, is often the first step in a data-driven project. Here’s a short video that explains what scraping is and how to create automated scraping jobs with a scraping tool.

This is a 15-minute video created by an instructor at Ohio State University. In the first six minutes, the instructor talks about why we need web scraping; he then shows how to use a scraping tool, OutWit Hub, to collect data scattered in a large database.
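For readers who prefer code to a point-and-click tool, the same collect-and-clean idea can be sketched in a few lines of Python. This is not how OutWit Hub works internally and it is not taken from the video; it is a minimal illustration that uses only the standard library's `html.parser`. The sample HTML, the `TableScraper` class, and the `scrape_table` function are all hypothetical names made up for this example, and a real job would first download the page instead of using an inline string.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraping job would download this HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>City</th></tr>
  <tr><td> Alice </td><td>Columbus</td></tr>
  <tr><td>Bob</td><td> Newark </td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # "Cleaning" step: join the fragments and trim stray whitespace.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    header, *records = parser.rows
    return [dict(zip(header, record)) for record in records]


records = scrape_table(SAMPLE_HTML)
for record in records:
    print(record)
```

The collection step is the parser walking the page and picking out table cells; the cleaning step is the whitespace trimming and the conversion into one record per row, ready for a spreadsheet or a chart.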

FYI: read reviews by Reporters’ Lab of OutWit Hub and other web scraping tools.

About Mu Lin

Dr. Mu Lin is a digital journalism professional and educator in New Jersey, United States. Dr. Lin manages an online marketing company. He also manages MulinBlog Online J-School (www.mulinblog.com/mooc), a free online journalism training program, which offers courses such as Audio Slideshow Storytelling; Introduction to Social Media Marketing; Writing for the Web; Google Mapping for Communicators; Introduction to Data Visualization; Introduction to Web Metrics and Google Analytics.
This entry was posted in Data journalism.
