Collecting Web Data
Web scraping is a technique for extracting information from websites. It can be done manually, but automating it is usually faster, more efficient and less error-prone.
Web scraping allows you to convert non-tabular or poorly structured data into a usable, structured format, such as a .csv file or spreadsheet. But scraping is about more than just acquiring data: it can also help you track changes to online data and archive it. In short, it's a skill worth learning.
Join us for this workshop to learn web scraping, using the researcher-focused training modules from the highly regarded Software Carpentry Foundation.
You'll learn:
- The concept of structured data
- The use of XPath queries on HTML documents
- How to scrape data using browser extensions
- How to scrape using Python and Scrapy
- How to automate the scraping of multiple web pages
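To give a flavour of the topics above, here is a minimal sketch of an XPath-style query in Python that turns a small HTML fragment into CSV. It is not taken from the course materials. The page fragment, the names in it and the helper functions `scrape_people` and `to_csv` are hypothetical, and the standard library's `xml.etree.ElementTree` is assumed here only because it needs no installation: it supports a limited subset of XPath and requires well-formed markup, so real pages usually call for a dedicated tool such as Scrapy.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used purely for illustration.
PAGE = """
<html>
  <body>
    <ul id="staff">
      <li class="person"><span class="name">Ada Example</span><a href="mailto:ada@example.org">email</a></li>
      <li class="person"><span class="name">Bob Sample</span><a href="mailto:bob@example.org">email</a></li>
    </ul>
  </body>
</html>
"""


def scrape_people(markup):
    """Return one dict per <li class="person"> element in the markup."""
    root = ET.fromstring(markup.strip())
    rows = []
    # ElementTree understands a subset of XPath, including attribute predicates.
    for item in root.findall(".//li[@class='person']"):
        name = item.find("./span[@class='name']").text
        href = item.find("./a").get("href")
        email = href.split(":", 1)[1]  # drop the "mailto:" scheme
        rows.append({"name": name, "email": email})
    return rows


def to_csv(rows):
    """Serialise the scraped rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "email"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


people = scrape_people(PAGE)
csv_text = to_csv(people)
print(csv_text)
```

The same pattern (select elements with a query, pull out the fields, write rows) is what gets repeated across pages when scraping is automated.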
Prerequisites:
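You will need a good knowledge of the basic concepts and techniques in Python. Consider taking our [Learn to Program: Python](https://intersect.org.au/training/course/python101/) and [Python for Research](https://intersect.org.au/training/course/python110/) courses to get up to speed beforehand.
For more information, see https://intersect.org.au/training/course/webdata201.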
Licence: Creative Commons Attribution 4.0
Contact: training@intersect.org.au
Keywords: Data Management, Python
Additional information
Status: Active
https://intersect.org.au/training/course/webdata201
https://dresa.org.au/materials/collecting-web-data