WEBINAR: Pro tips for scaling bioinformatics workflows to HPC
https://zenodo.org/records/8008227
https://dresa.org.au/materials/webinar-pro-tips-for-scaling-bioinformatics-workflows-to-hpc-9f2a8b90-88da-433b-83b2-b1ab262dd9df
This record includes training materials associated with the Australian BioCommons webinar ‘Pro tips for scaling bioinformatics workflows to HPC’. This webinar took place on 31 May 2023.
Event description
High Performance Computing (HPC) infrastructures offer the computational scale and efficiency that life scientists need to handle complex biological datasets and multi-step computational workflows. But scaling workflows to HPC from smaller, more familiar computational infrastructures brings with it new jargon, expectations, and processes to learn. To make the most of HPC resources, bioinformatics workflows need to be designed for distributed computing environments and must carefully manage varying resource requirements and the scale of biological data.
In this webinar, Dr Georgina Samaha from the Sydney Informatics Hub, Dr Matthew Downton from the National Computational Infrastructure (NCI) and Dr Sarah Beecroft from the Pawsey Supercomputing Research Centre help you navigate the world of HPC for running and developing bioinformatics workflows. They explain when you should take your workflows to HPC and highlight the architectural features you should make the most of to scale your analyses once you’re there. You’ll hear pro-tips for dealing with common pain points like software installation, optimising for parallel computing and resource management, and will find out how to get access to Australia’s National HPC infrastructures at NCI and Pawsey.
Materials
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
- Event metadata (PDF): Information about the event, including description, event URL, learning objectives, prerequisites, technical requirements, etc.
- Index of training materials (PDF): List and description of all materials associated with this event, including the name, format, location and a brief description of each file.
- Pro-tips_HPC_Slides: A PDF copy of the slides presented during the webinar.
Materials shared elsewhere:
A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/YKJDRXCmGMo
Melissa Burke (melissa@biocommons.org.au)
Samaha, Georgina (orcid: 0000-0003-0419-1476)
Beecroft, Sarah (orcid: 0000-0002-3935-2279)
Downton, Matthew (orcid: 0000-0002-4693-1965)
Keywords: Bioinformatics, Workflows, HPC, High Performance Computing
WORKSHOP: R: fundamental skills for biologists
https://zenodo.org/records/6766951
https://dresa.org.au/materials/workshop-r-fundamental-skills-for-biologists-81aa00db-63ad-4962-a7ac-b885bf9f676b
This record includes training materials associated with the Australian BioCommons workshop ‘R: fundamental skills for biologists’. This workshop took place over four three-hour sessions on 1, 8, 15 and 22 June 2022.
Event description
Biologists need data analysis skills to be able to interpret, visualise and communicate their research results. While Excel can cover some data analysis needs, there is a better choice, particularly for large and complex datasets.
R is free, open-source software and a programming language that enables data exploration, statistical analysis, visualisation and more. The large variety of R packages available for analysing biological data makes it a robust and flexible option for data of all shapes and sizes.
Getting started can be a little daunting for those without a background in statistics and programming. In this workshop we will equip you with the foundations for getting the most out of R and RStudio, an interactive way of structuring and keeping track of your work in R. Using biological data from a model of influenza infection, you will learn how to efficiently and reproducibly organise, read, wrangle, analyse, visualise and generate reports from your data in R.
Topics covered in this workshop include:
- Spreadsheets, organising data and first steps with R
- Manipulating and analysing data with dplyr
- Data visualisation
- Summarized experiments and getting started with Bioconductor
This workshop is presented by the Australian BioCommons and Saskia Freytag from WEHI with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
- Event metadata (PDF): Information about the event, including description, event URL, learning objectives, prerequisites, technical requirements, etc.
- Index of training materials (PDF): List and description of all materials associated with this event, including the name, format, location and a brief description of each file.
- Schedule (PDF): A breakdown of the topics and timings for the workshop.
- Recommended resources (PDF): A list of resources recommended by trainers and participants.
- Q_and_A (PDF): Archive of questions and their answers from the workshop Slack channel.
Materials shared elsewhere:
This workshop follows the tutorial ‘Introduction to data analysis with R and Bioconductor’ which is publicly available.
https://saskiafreytag.github.io/biocommons-r-intro/
The tutorial is derived from material produced as part of The Carpentries Incubator project:
https://carpentries-incubator.github.io/bioc-intro/
Melissa Burke (melissa@biocommons.org.au)
Freytag, Saskia (orcid: 0000-0002-2185-7068)
Barugahare, Adele (orcid: 0000-0002-8976-0094)
Doyle, Maria
Ansell, Brendan (orcid: 0000-0003-0297-897X)
Varshney, Akriti
Bourke, Caitlin (orcid: 0000-0002-4466-6563)
Conradsen, Cara (orcid: 0000-0001-9797-3412)
Jung, Chol-Hee (orcid: 0000-0002-2992-3162)
Sandoval, Claudia
Chandrananda, Dineika (orcid: 0000-0002-8834-9500)
Zhang, Eden (orcid: 0000-0003-0294-3734)
Rosello, Fernando (orcid: 0000-0003-3885-8777)
Iacono, Giulia (orcid: 0000-0002-1527-0754)
Tarasova, Ilariya (orcid: 0000-0002-0895-9385)
Chung, Jessica (orcid: 0000-0002-0627-0955)
Moffet, Joel
Gustafsson, Johan (orcid: 0000-0002-2977-5032)
Ding, Ke
Feher, Kristen
Perlaza-Jimenez, Laura (orcid: 0000-0002-8511-1134)
Crowe, Mark (orcid: 0000-0002-9514-2487)
Ma, Mengyao
Kandhari, Nitika (orcid: 0000-0002-0261-727X)
Williams, Sarah
Nelson, Tiffanie (orcid: 0000-0002-5341-312X)
Schreiber, Veronika (orcid: 0000-0001-6088-7828)
Pinzon Perez, William
Keywords: Bioinformatics, Analysis, Statistics, R software, RStudio, Data visualisation
ARDC FAIR Data 101 self-guided
https://zenodo.org/records/5094034
https://dresa.org.au/materials/ardc-fair-data-101-self-guided-2d794a84-f0ff-4e11-a39c-fa8ea481e097
FAIR Data 101 v3.0 is a self-guided course covering the FAIR Data principles.
The FAIR Data 101 virtual course was designed and delivered by the ARDC Skilled Workforce Program twice in 2020 and has now been reworked as a self-guided course.
The course structure was based on 'FAIR Data in the Scholarly Communications Lifecycle', run by Natasha Simons at the FORCE11 Scholarly Communications Institute. These training materials are hosted on GitHub.
contact@ardc.edu.au
Stokes, Liz (orcid: 0000-0002-2973-5647)
Liffers, Matthias (orcid: 0000-0002-3639-2080)
Burton, Nichola (orcid: 0000-0003-4470-4846)
Martinez, Paula A. (orcid: 0000-0002-8990-1985)
Simons, Natasha (orcid: 0000-0003-0635-1998)
Russell, Keith (orcid: 0000-0001-5390-2719)
McCafferty, Siobhann (orcid: 0000-0002-2491-0995)
Ferrers, Richard (orcid: 0000-0002-2923-9889)
McEachern, Steve (orcid: 0000-0001-7848-4912)
Barlow, Melanie (orcid: 0000-0002-3956-5784)
Brady, Catherine (orcid: 0000-0002-7919-7592)
Brownlee, Rowan (orcid: 0000-0002-1955-1262)
Honeyman, Tom (orcid: 0000-0001-9448-4023)
Quiroga, Maria del Mar (orcid: 0000-0002-8943-2808)
Keywords: training material, FAIR data, video, webinar, activities, quiz, FAIR, research data management
OpenCL
Resource type: activity
https://www.youtube.com/playlist?list=PLmu61dgAX-aa_lk5fby5PjuS49snHpyYL
https://dresa.org.au/materials/opencl
Supercomputers make use of accelerators from a variety of different hardware vendors, using devices such as multi-core CPUs, GPUs and even FPGAs. OpenCL is a way for your HPC application to make effective use of heterogeneous computing devices, and to avoid code refactoring for new HPC infrastructure.
training@pawsey.org.au
Toby Potter
Pawsey Supercomputing Research Centre
Pelagos
Toby Potter
Keywords: supercomputing, Pawsey Supercomputing Centre, CPUs, GPUs, OpenCL, FPGAs
masters
ecr
researcher
support
HIP Workshop
Resource type: full-course
https://support.pawsey.org.au/documentation/display/US/Pawsey+Training+Resources
https://dresa.org.au/materials/hip-workshop
The Heterogeneous Interface for Portability (HIP) provides a programming framework for harnessing the compute capabilities of multicore processors, such as the MI250X GPUs on Setonix.
This course covers the essentials of developing HIP applications, with a focus on supercomputing.
Agenda
- Introduction to HIP and high level features
- How to build and run applications on Setonix with HIP and MPI
- A complete line-by-line walkthrough of a HIP-enabled application
- Tools and techniques for debugging and measuring the performance of HIP applications
training@pawsey.org.au
Pelagos
Pawsey Supercomputing Research Centre
Keywords: HIP, supercomputing, Programming, GPUs, MPI, debugging
C/C++ Refresher
Resource type: activity
https://www.youtube.com/playlist?list=PLmu61dgAX-aYsRsejVfwHVhpPU2381Njg
https://dresa.org.au/materials/c-c-refresher
The C++ programming language and its C subset are used extensively in research environments. In particular, C++ is the language used in the parallel programming frameworks CUDA, HIP, and OpenCL.
This workshop is designed to equip participants with “Survival C++”, an understanding of the basic syntax, how information is encoded in binary format, and how to compile and debug C++ software.
training@pawsey.org.au
Pelagos
Pawsey Supercomputing Research Centre
Keywords: supercomputing, C/C++, Programming