[QCL Workshop] Web Scraping with Python (Level 2 Data + Coding)

# Web Scraping with Python (Level 2 Data + Coding)

By Jeho Park

Date and time

Friday, October 2, 2020 · 10am - 12pm PDT

Location

Online

About this event

## Summary

In this 2-hour workshop, you will learn how to collect data from web pages such as Wikipedia using Python's web scraping functions and data manipulation packages.

Learning objectives of the workshop:

  • Understanding robots.txt and HTTP requests.
  • Understanding the basic components of a webpage and HTML.
  • Getting familiar with the pandas module.
  • Parsing an HTML string into pandas.
  • Parsing a URL into pandas.
  • Parsing tables from Wikipedia into pandas.
  • Parsing non-Wikipedia tables into pandas.
  • Parsing Wikipedia infoboxes.
  • Writing parsed HTML tables to flat CSV files.
  • Gaining a more advanced understanding of HTML parsing using tags and CSS selectors.
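
To give a feel for several of these objectives, here is a minimal sketch, not workshop material. It checks a robots.txt policy, parses an HTML table from a string, and writes the rows as CSV. The workshop itself targets pandas; this sketch deliberately swaps in Python's standard library so it runs without extra installs. The robots.txt content, the HTML snippet, the `WorkshopBot` user agent, the `example.org` URLs, and the helper names `is_allowed`, `TableParser`, and `table_to_csv` are all hypothetical and made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical HTML snippet containing one simple table (illustration only).
HTML = """
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network
    return parser.can_fetch(user_agent, url)


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Parse the table rows in an HTML string and return them as CSV text."""
    parser = TableParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "WorkshopBot", "https://example.org/public/table.html"))
    print(table_to_csv(HTML))
```

With pandas installed (plus an HTML parser backend such as lxml), `pandas.read_html` returns a list of DataFrames for the tables it finds, and `DataFrame.to_csv` writes one of them to a flat file, which is closer to the approach the objectives describe.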

## Date and Time

Friday, October 2, 2020, from 10am to 12pm PDT

## Location

Online

## Prerequisites

Internet Use: Introductory level (search, log-in, navigation of websites, etc.)

Programming: Basic Python programming skills (functions, packages, etc.)

## Participants

CMC students, faculty, and staff

Organized by

Director, Murty Sunak Quantitative and Computing Lab
