Cleaning Data with Open Refine
https://intersect.org.au/training/course/refine101
https://dresa.org.au/materials/cleaning-data-with-open-refine
Do you have messy data from multiple inconsistent sources, or open-responses to questionnaires? Do you want to improve the quality of your data by refining it and using the power of the internet?
Open Refine is the perfect partner to Excel. It is a powerful, free tool for exploring, normalising and cleaning datasets, and extending data by accessing the internet through APIs. In this course we’ll work through the various features of Refine, including importing data, faceting, clustering, and calling remote APIs, by working on a fictional but plausible humanities research project.
Download, install and run Open Refine
Import data from CSV, text or online sources and create projects
Navigate data using the Open Refine interface
Explore data by using facets
Clean data using clustering
Parse data using GREL syntax
Extend data using Application Programming Interfaces (APIs)
Export project for use in other applications
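As a rough illustration of the clustering idea listed above (not taken from the course materials), the sketch below groups near-duplicate strings by a normalised key, in the spirit of key-collision clustering. It is a simplified Python approximation and not Open Refine's actual implementation; the function names and the sample values are hypothetical.

```python
import string
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalised key so near-duplicates collide."""
    # Trim, lowercase, and drop ASCII punctuation.
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, de-duplicate and sort the tokens, then rejoin.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a key; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Hypothetical sample data with inconsistent spellings of one institution.
names = [
    "University of Sydney",
    "university of sydney ",
    "Sydney, University of",
    "UNSW",
]
clusters = cluster(names)
print(clusters)
```

The three spellings of the same institution end up in one cluster, while "UNSW" stays on its own and is not reported.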
The course has no prerequisites.
training@intersect.org.au
Intersect Australia
Open Refine
Mastering text with Regular Expressions
Keywords: Regular Expressions
https://intersect.org.au/training/course/regex101
https://dresa.org.au/materials/mastering-text-with-regular-expressions
Have you ever wanted to extract phone numbers out of a block of unstructured text? Or email addresses. Or find all the words that start with “e” and end with “ed”, no matter their length? Or search through DNA sequences for a pattern? Or extract coordinates from GPS data?
Regular Expressions (regexes) are a powerful way to handle a multitude of different types of data. They can be used to find patterns in text and make sophisticated replacements. Think of them as find and replace on steroids. Come along to this workshop to learn what they can do and how to apply them to your research.
Comprehend and apply the syntax of regular expressions
Use the http://regexr.com tool to test a regular expression against some text
Construct simple regular expressions to find capitalised words; all numbers; all words that start with a specific set of letters, etc. in a block of text
Craft and test a progressively more complex regular expression
Find helpful resources covering regular expressions on the web
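To give a flavour of the patterns mentioned above, here is a minimal sketch using Python's built-in `re` module. It is an illustration only, not course material: the sample sentence is made up, and the email pattern is deliberately simplistic.

```python
import re

# Hypothetical sample text.
text = "Alice emailed bob@example.org in 2021 and Carol elected to reply by 2022."

# Capitalised words: an uppercase letter followed by lowercase letters.
capitalised = re.findall(r"\b[A-Z][a-z]+\b", text)

# All numbers: one or more digits.
numbers = re.findall(r"\b\d+\b", text)

# Words that start with "e" and end with "ed", whatever their length.
e_ed_words = re.findall(r"\be\w*ed\b", text)

# A progressively more complex pattern: a simplified email matcher.
# Real-world addresses are more varied than this pattern allows.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)

print(capitalised)  # ['Alice', 'Carol']
print(numbers)      # ['2021', '2022']
print(e_ed_words)   # ['emailed', 'elected']
print(emails)       # ['bob@example.org']
```

The same patterns can be pasted into an online tester such as the one named above to see each match highlighted.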
training@intersect.org.au
Intersect Australia
Regular Expressions
Regular Expressions on the Command Line
Keywords: Regular Expressions
https://intersect.org.au/training/course/regex201
https://dresa.org.au/materials/regular-expressions-on-the-command-line
Would you like to use regular expressions with the classic command line utilities find, grep, sed and awk? These venerable Unix utilities allow you to search, filter and transform large amounts of text (including many common data formats) efficiently and repeatably.
find to locate files and directories matching regexes.
grep to filter lines in files based on pattern matches.
sed to find and replace using regular expressions and captures.
awk to work with row- and column-oriented data.
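The course itself uses the Unix utilities, but the ideas behind three of them (filtering lines, replacing with captures, and working with columns) can be sketched in Python for readers who want to see them side by side. This is only an analogy under that assumption; the sample lines and variable names are hypothetical.

```python
import re

# Hypothetical records: date, user, size in bytes.
lines = [
    "2023-01-15 alice 120",
    "2023-02-03 bob 300",
    "not a record",
    "2023-02-20 alice 80",
]

record = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) (\w+) (\d+)$")

# grep-like: keep only the lines that match the pattern.
matching = [line for line in lines if record.search(line)]

# sed-like: rewrite the date as DD/MM/YYYY using captured groups.
rewritten = [
    re.sub(r"^(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", line) for line in matching
]

# awk-like: split each line into columns and total the third column per user.
totals = {}
for line in matching:
    _date, user, size = line.split()
    totals[user] = totals.get(user, 0) + int(size)

print(rewritten[0])  # 15/01/2023 alice 120
print(totals)        # {'alice': 200, 'bob': 300}
```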
This course assumes prior knowledge of the basic syntax of regular expressions. If you’re new to regular expressions or would like a refresher, take our Mastering text with Regular Expressions course first.
This course also assumes basic familiarity with the Bash command line environment found on GNU/Linux and other Unix-like environments. Take our Unix Shell and Command Line Basics course to get up to speed quickly.
training@intersect.org.au
Intersect Australia
Regular Expressions
From PC to Cloud or High Performance Computing
https://intersect.org.au/training/course/compute001
https://dresa.org.au/materials/from-pc-to-cloud-or-high-performance-computing
Most of you will have heard of cloud computing and High Performance Computing (HPC), and you may already be using one of them. HPC is not the same as cloud computing: the two technologies differ in a number of ways, although they also share some similarities.
We may refer to both types as “large scale computing” – but what is the difference? Both systems target scalability of computing, but in different ways.
This webinar gives an overview for researchers who are thinking of moving from their local computer to the Cloud or a High Performance Computing cluster.
Introduction
HPC vs Cloud computing
When to use HPC
When to use the Cloud
The Cloud – Pros and Cons
HPC – Pros and Cons
The webinar has no prerequisites.
training@intersect.org.au
Intersect Australia
HPC
Thinking like a computer: The Fundamentals of Programming
https://intersect.org.au/training/course/coding003
https://dresa.org.au/materials/thinking-like-a-computer-the-fundamentals-of-programming
Human brains are extremely good at evaluating a small amount of information simultaneously, ignoring anomalies and coming up with an answer to a problem without much in the way of conscious thought. Computers on the other hand are extremely good at performing individual calculations, one at a time, and can keep the results in a large bank of short-term memory for quick recall. These two approaches are fundamentally different.
Humans can only reasonably retain seven plus or minus two pieces of information in short-term memory, and new items push older items out, whereas a computer is hopeless when given multiple pieces of information simultaneously.
Understanding this fact is key to being able to write instructions for computers – also known as programs – in a way that takes advantage of their strengths, and overcomes their drawbacks.
Suitable for the programming novice, this webinar is good preparation for researchers wanting to learn how to program.
How a human solves tasks
How a computer solves tasks
Overview of programming concepts:
Variables
Loops
Conditionals
Functions
Data types
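As a small illustration of the concepts listed above (not taken from the webinar), the following Python sketch uses a function, a conditional, a loop, variables, and a few data types together. The temperature thresholds and readings are made-up values.

```python
def classify(temperature: float) -> str:
    """Function: wraps a reusable piece of logic behind a name."""
    # Conditional: choose a label depending on the value.
    if temperature < 15.0:
        return "cold"
    elif temperature < 25.0:
        return "mild"
    return "hot"


# Variables and data types: a list of floats and a dictionary of counts.
readings = [12.5, 18.0, 31.2]
counts = {}

# Loop: handle the readings one at a time, as a computer does.
for reading in readings:
    label = classify(reading)
    counts[label] = counts.get(label, 0) + 1

print(counts)  # {'cold': 1, 'mild': 1, 'hot': 1}
```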
The webinar has no prerequisites.
training@intersect.org.au
Intersect Australia
Python
Parallel Programming for HPC
https://intersect.org.au/training/course/hpc301
https://dresa.org.au/materials/parallel-programming-for-hpc
You have written, compiled and run functioning programs in C and/or Fortran. You know how HPC works and you’ve submitted batch jobs.
Now you want to move from writing single-threaded programs into the parallel programming paradigm, so you can truly harness the full power of High Performance Computing.
OpenMP (Open Multi-Processing): a widespread method for shared memory programming
MPI (Message Passing Interface): a leading distributed memory programming model
To do this course you need to have:
A good working knowledge of HPC. Consider taking our Getting Started with HPC using PBS Pro course to come up to speed beforehand.
Prior experience of writing programs in either C or Fortran.
training@intersect.org.au
Intersect Australia
HPC
Start Coding without Hesitation: Programming Languages Showdown
Keywords: Python, R, Matlab, Julia
https://intersect.org.au/training/course/coding001
https://dresa.org.au/materials/start-coding-without-hesitation-programming-languages-showdown
Programming is becoming more and more popular, with many researchers using it to perform data cleaning, data manipulation and data analytics, as well as to create publication-quality plots. Programming can be really beneficial for automating processes and workflows. In this webinar, we explore four of the most popular programming languages that are widely used in academia, namely Python, R, MATLAB, and Julia.
Why use Programming
An overview of Python, R, MATLAB, and Julia
Code comparison of the four programming languages
Popularity and job opportunities
Intersect’s comparison
General guidelines on how to choose the best programming language for your research
The webinar has no prerequisites.
training@intersect.org.au
Intersect Australia
Python, R, Matlab, Julia
Getting Started with Excel
https://intersect.org.au/training/course/excel001
https://dresa.org.au/materials/getting-started-with-excel
We rarely receive research data in an appropriate form. Often data is messy. Sometimes it is incomplete. And sometimes there’s too much of it. Frequently, it has errors.
This webinar targets beginners and presents a quick demonstration of using the most widespread data wrangling tool, Microsoft Excel, to sort, filter, copy, protect, transform, aggregate, summarise, and visualise research data.
Introduction to Microsoft Excel user interface
Interpret data using sorting, filtering, and conditional formatting
Summarise data using functions
Analyse data using pivot tables
Manipulate and visualise data
Handy tips to speed up your work
The webinar has no prerequisites.
training@intersect.org.au
Intersect Australia
Excel
Survey Tools in Research: REDCap and Qualtrics
Keywords: REDCap, Qualtrics
https://intersect.org.au/training/course/surveys001
https://dresa.org.au/materials/survey-tools-in-research-redcap-and-qualtrics
Now more than ever, researchers need to embrace electronic data capture methods to keep their research moving in the midst of social distancing restrictions and decreased access to survey participants. Using a research-specific survey tool can not only solve this problem, but also set your research up for success through intuitive data collection and validation, scheduling and reporting.
This webinar will introduce and compare two of the most popular research tools for the collection of survey data and patient records: REDCap and Qualtrics.
Electronic Data Capture: Surveys vs Forms
Confidential vs Anonymous data collection
Strengths and weaknesses of Qualtrics and REDCap
Real-life use cases for each tool
Using survey tools for longitudinal studies
The webinar has no prerequisites.
training@intersect.org.au
Intersect Australia
REDCap, Qualtrics
A showcase of Data Analysis in Python and R: A case study using COVID-19 data
https://intersect.org.au/training/course/coding002
https://dresa.org.au/materials/a-showcase-of-data-analysis-in-python-and-r-a-case-study-using-covid-19-data
In all fields of research we are being confronted with a deluge of data; data that needs cleaning and transformation to be used in further analysis. This webinar demonstrates the effective use of programming tools for an initial analysis of COVID-19 datasets, with examples using both R and Python.
Cleaning up a dataset for analysis
Using JupyterLab for interactive analysis
Making the most of the tidyverse (R) and pandas (Python)
Simple data visualisation using ggplot (R) and seaborn (Python)
Best practices for readable code
The webinar has no prerequisites.
training@intersect.org.au
Intersect Australia
Python, R
Learn to Program: Julia
https://intersect.org.au/training/course/julia101
https://dresa.org.au/materials/learn-to-program-julia
Julia is a high-level, high-performance dynamic programming language with more than 4,000 external libraries available. Julia allows you to range from tight low-level loops and conditionals, up to a high-level programming style, with its performance approaching and often matching the performance of the fastest programming languages!
This workshop expects that you are coming to Julia with some experience in the basic concepts of programming in another language. It is designed to help you migrate the basic concepts of programming that you already know to the Julia context.
Join us for this live coding workshop where we write programs that produce results, using Jupyter notebooks, which allow program code, results, visualisations and documentation to be blended seamlessly.
Introduction to the JupyterLab interface for programming
Basic syntax and data types in Julia
How to load external data into Julia
Creating functions (FUNCTIONS)
Repeating actions and analysing multiple data sets (LOOPS)
Making choices (IF STATEMENTS – CONDITIONALS)
Ways to visualise data using the Plots library in Julia
Some experience with the basic concepts of programming in another language is needed to attend this course. It is an intensive course designed to help you migrate the basic concepts of programming that you already know to the Julia context in half a day instead of a full day. If you don’t have any prior experience in programming, please consider attending one of the Learn to Program: Python, Learn to Program: R or Learn to Program: MATLAB courses prior to this course.
We also strongly recommend attending the Start Coding without Hesitation: Programming Languages Showdown and Thinking like a computer: The Fundamentals of Programming webinars. Recordings of previously delivered webinars are also available.
training@intersect.org.au
Intersect Australia
Julia
Beyond the Basics: Julia
https://intersect.org.au/training/course/julia201
https://dresa.org.au/materials/beyond-the-basics-julia
Julia is a high-level, high-performance dynamic programming language with more than 4,000 external libraries available. Julia allows you to range from tight low-level loops and conditionals, up to a high-level programming style, with its performance approaching and often matching the performance of the fastest programming languages!
This workshop explores the more advanced features of functions in Julia, introduces widely used tools within Julia, and demonstrates the speed of Julia by benchmarking functions and different styles of scripting.
Join us for this live coding workshop where we write programs that produce results, using Jupyter notebooks, which allow program code, results, visualisations and documentation to be blended seamlessly.
Understand the role of Types within Julia
Create functions with complex arguments
Demonstrate the programming patterns of list comprehensions, pipes, and anonymous functions
Benchmark Julia code and understand how to make it fast
If you already have experience with programming, please check the topics covered in the Learn to Program: Julia course to ensure that you are familiar with the knowledge needed for this course.
training@intersect.org.au
Intersect Australia
Julia