A quick guide on web scraping: Why and how

Web scraping, the collection and cleaning of online data, is often the first step in a data-driven project. Here’s a short video that explains what scraping is and how to set up automated scraping jobs with a scraping tool.

This 15-minute video was created by an instructor at Ohio State University. In the first six minutes, the instructor explains why we need web scraping; he then shows how to use a scraping tool, OutWit Hub, to collect data scattered across a large database.
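OutWit Hub is a point-and-click tool, but the same collect-then-clean idea can also be sketched in a few lines of code. The example below is only an illustration, not what the video demonstrates: it uses Python's standard-library `html.parser`, and the sample page, the `TableScraper` class, and the `scrape_table` function are all made up for this sketch. It pulls the cells out of an HTML table and trims stray whitespace along the way.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Hypothetical helper: collect the text of each table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Cleaning step: collapse runs of whitespace and trim the ends.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table rows found in an HTML string."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page; a real job would download the HTML first.
SAMPLE_PAGE = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>  Springfield </td><td>1,200</td></tr>
  <tr><td>Shelbyville</td><td> 3,450 </td></tr>
</table>
"""

rows = scrape_table(SAMPLE_PAGE)
header, records = rows[0], rows[1:]

# Further cleaning: turn "1,200" into the integer 1200.
cleaned = [(city, int(count.replace(",", ""))) for city, count in records]
print(header)
print(cleaned)
```

A real scraping job would repeat this over many pages and would need to respect the site's terms of use; tools such as OutWit Hub wrap those steps in a graphical interface.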

FYI: Reporters’ Lab has published reviews of OutWit Hub and other web scraping tools.

About Mu Lin

Dr. Mu Lin is a digital journalism professional and educator in New Jersey, United States. Dr. Lin manages an online marketing company. He also manages MulinBlog Online J-School (www.mulinblog.com/mooc), a free online journalism training program, which offers courses such as Audio Slideshow Storytelling; Introduction to Social Media Marketing; Writing for the Web; Google Mapping for Communicators; Introduction to Data Visualization; Introduction to Web Metrics and Google Analytics.
This entry was posted in Data journalism.

5 Responses to A quick guide on web scraping: Why and how

  1. joshtboswell says:

    Hi, Do you have any plans for a web scraping MOOC? I (and I imagine many other people) would be really interested to take part and learn more.

    Really enjoy your blog, thank you for all the work you do!

  2. Nice post about web scraping. Thanks for sharing.

  3. iwebscraping says:

    Awesome post on Web Scraping, it clears all the doubt why we need web scraping and the explanation how to use scraping tool is also very useful.

  4. Again, Very Nice post on Scraping. Thanks for sharing such useful information.
