WEBINAR: Pro tips for scaling bioinformatics workflows to HPC
https://zenodo.org/records/8008227
https://dresa.org.au/materials/webinar-pro-tips-for-scaling-bioinformatics-workflows-to-hpc-9f2a8b90-88da-433b-83b2-b1ab262dd9df
This record includes training materials associated with the Australian BioCommons webinar ‘Pro tips for scaling bioinformatics workflows to HPC’. This webinar took place on 31 May 2023.
Event description
High Performance Computing (HPC) infrastructures offer the computational scale and efficiency that life scientists need to handle complex biological datasets and multi-step computational workflows. But scaling workflows to HPC from smaller, more familiar computational infrastructures brings with it new jargon, expectations, and processes to learn. To make the most of HPC resources, bioinformatics workflows need to be designed for distributed computing environments and to carefully manage varying resource requirements and data scale related to biology.
In this webinar, Dr Georgina Samaha from the Sydney Informatics Hub, Dr Matthew Downton from the National Computational Infrastructure (NCI) and Dr Sarah Beecroft from the Pawsey Supercomputing Research Centre help you navigate the world of HPC for running and developing bioinformatics workflows. They explain when you should take your workflows to HPC and highlight the architectural features you should make the most of to scale your analyses once you’re there. You’ll hear pro-tips for dealing with common pain points like software installation, optimising for parallel computing and resource management, and will find out how to get access to Australia’s National HPC infrastructures at NCI and Pawsey.
Materials
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
- Event metadata (PDF): Information about the event, including description, event URL, learning objectives, prerequisites, technical requirements etc.
- Index of training materials (PDF): List and description of all materials associated with this event, including the name, format, location and a brief description of each file.
- Pro-tips_HPC_Slides: A PDF copy of the slides presented during the webinar.
Materials shared elsewhere:
A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/YKJDRXCmGMo
Melissa Burke (melissa@biocommons.org.au)
Samaha, Georgina (orcid: 0000-0003-0419-1476)
Beecroft, Sarah (orcid: 0000-0002-3935-2279)
Downton, Matthew (orcid: 0000-0002-4693-1965)
Bioinformatics, Workflows, HPC, High Performance Computing
WORKSHOP: R: fundamental skills for biologists
https://zenodo.org/records/6766951
https://dresa.org.au/materials/workshop-r-fundamental-skills-for-biologists-81aa00db-63ad-4962-a7ac-b885bf9f676b
This record includes training materials associated with the Australian BioCommons workshop ‘R: fundamental skills for biologists’. This workshop took place over four three-hour sessions on 1, 8, 15 and 22 June 2022.
Event description
Biologists need data analysis skills to be able to interpret, visualise and communicate their research results. While Excel can cover some data analysis needs, there is a better choice, particularly for large and complex datasets.
R is free, open-source software and a programming language that enables data exploration, statistical analysis, visualisation and more. The large variety of R packages available for analysing biological data makes it a robust and flexible option for data of all shapes and sizes.
Getting started can be a little daunting for those without a background in statistics and programming. In this workshop we will equip you with the foundations for getting the most out of R and RStudio, an interactive way of structuring and keeping track of your work in R. Using biological data from a model of influenza infection, you will learn how to efficiently and reproducibly organise, read, wrangle, analyse, visualise and generate reports from your data in R.
Topics covered in this workshop include:
- Spreadsheets, organising data and first steps with R
- Manipulating and analysing data with dplyr
- Data visualisation
- Summarized experiments and getting started with Bioconductor
This workshop is presented by the Australian BioCommons and Saskia Freytag from WEHI with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
- Event metadata (PDF): Information about the event, including description, event URL, learning objectives, prerequisites, technical requirements etc.
- Index of training materials (PDF): List and description of all materials associated with this event, including the name, format, location and a brief description of each file.
- Schedule (PDF): A breakdown of the topics and timings for the workshop.
- Recommended resources (PDF): A list of resources recommended by trainers and participants.
- Q_and_A (PDF): Archive of questions and their answers from the workshop Slack channel.
Materials shared elsewhere:
This workshop follows the tutorial ‘Introduction to data analysis with R and Bioconductor’, which is publicly available: https://saskiafreytag.github.io/biocommons-r-intro/
This tutorial is derived from material produced as part of The Carpentries Incubator project: https://carpentries-incubator.github.io/bioc-intro/
Melissa Burke (melissa@biocommons.org.au)
Freytag, Saskia (orcid: 0000-0002-2185-7068)
Barugahare, Adele (orcid: 0000-0002-8976-0094)
Doyle, Maria
Ansell, Brendan (orcid: 0000-0003-0297-897X)
Varshney, Akriti
Bourke, Caitlin (orcid: 0000-0002-4466-6563)
Conradsen, Cara (orcid: 0000-0001-9797-3412)
Jung, Chol-Hee (orcid: 0000-0002-2992-3162)
Sandoval, Claudia
Chandrananda, Dineika (orcid: 0000-0002-8834-9500)
Zhang, Eden (orcid: 0000-0003-0294-3734)
Rosello, Fernando (orcid: 0000-0003-3885-8777)
Iacono, Giulia (orcid: 0000-0002-1527-0754)
Tarasova, Ilariya (orcid: 0000-0002-0895-9385)
Chung, Jessica (orcid: 0000-0002-0627-0955)
Moffet, Joel
Gustafsson, Johan (orcid: 0000-0002-2977-5032)
Ding, Ke
Feher, Kristen
Perlaza-Jimenez, Laura (orcid: 0000-0002-8511-1134)
Crowe, Mark (orcid: 0000-0002-9514-2487)
Ma, Mengyao
Kandhari, Nitika (orcid: 0000-0002-0261-727X)
Williams, Sarah
Nelson, Tiffanie (orcid: 0000-0002-5341-312X)
Schreiber, Veronika (orcid: 0000-0001-6088-7828)
Pinzon Perez, William
Bioinformatics, Analysis, Statistics, R software, RStudio, Data visualisation
Time to fill the gaps: Building out a national training inventory
https://zenodo.org/records/4287858
https://dresa.org.au/materials/time-to-fill-the-gaps-building-out-a-national-training-inventory-cd4f10d8-83c0-4870-95e2-ee9ed4aa72c7
This community discussion seeks to bring together the instructors and facilitators tasked with upskilling researchers and support staff. While this collective dialogue among instructors is not new, what is new is the traction that various groups are getting.
The newly formed group of eResearch support staff gathered by the Melbourne Data Analytics Platform (MDAP) and Sydney Informatics Hub (SIH) is one such group, as is the Lightweight Working Group (LWG): Researcher digital skills training data for enabling digital infrastructure use, spearheaded by University of Melbourne’s David Flanders during the pre-Skills Summit discussions.
In this session we seek to build on the momentum by including a hands-on working session. Participants are asked to come with information to share and questions they seek to answer. During the first half of this session, attendees will populate a public document with shareable training details. The goal is to at least double the size of the new cross-institutional national training collection started by the LWG.
The second half of this session will be spent asking questions to arrive at next steps. What do we need to do to continue building out this national training inventory, and who will be in charge of maintaining and distributing the archive? What platforms exist and are used to capture training data and material and make it readily maintainable and findable? Can the material be reused, and how do we recognise and capture re-use? Do we know how to apply a license to our materials for appropriate reuse, or do we need guidance?
While there will likely be more questions than these, one question has been answered. When can we move from talking to doing? That time is now.
contact@ardc.edu.au
Backhaus, Ann
Lange, Rebecca (orcid: 0000-0002-9449-4384)
Padmanabhan, Komathy
King, Sara (orcid: 0000-0003-3199-5592)
training inventory, training registry, national skills initiatives, training material
National skills ecosystem - call to action
https://zenodo.org/records/4289335
https://dresa.org.au/materials/national-skills-ecosystem-call-to-action-ffd9b4ed-b557-496b-ac35-72467c03c71b
In this Community Action session working groups will be formed based on the challenges/opportunities that were prioritised in Community Action session #4.
- Skilled trainers / facilitators
- National training registry
- National training event calendar
- Jointly developed training
- Research support professionals: career/progression
contact@ardc.edu.au
Padmanabhan, Komathy
Backhaus, Ann
Papaioannou, Anastasios (orcid: 0000-0002-8959-4559)
Tang, Titus
Crowe, Mark (orcid: 0000-0002-9514-2487)
Vanichkina, Darya (orcid: 0000-0002-0406-164X)
Unsworth, Kathryn (orcid: 0000-0002-5407-9987)
Stokes, Liz (orcid: 0000-0002-2973-5647)
Liffers, Matthias (orcid: 0000-0002-3639-2080)
national skills initiatives, data skills, training, skills community, training material
ARDC FAIR Data 101 self-guided
https://zenodo.org/records/5094034
https://dresa.org.au/materials/ardc-fair-data-101-self-guided-2d794a84-f0ff-4e11-a39c-fa8ea481e097
FAIR Data 101 v3.0 is a self-guided course covering the FAIR Data principles.
The FAIR Data 101 virtual course was designed and delivered by the ARDC Skilled Workforce Program twice in 2020 and has now been reworked as a self-guided course.
The course structure was based on 'FAIR Data in the Scholarly Communications Lifecycle', run by Natasha Simons at the FORCE11 Scholarly Communications Institute. These training materials are hosted on GitHub.
contact@ardc.edu.au
Stokes, Liz (orcid: 0000-0002-2973-5647)
Liffers, Matthias (orcid: 0000-0002-3639-2080)
Burton, Nichola (orcid: 0000-0003-4470-4846)
Martinez, Paula A. (orcid: 0000-0002-8990-1985)
Simons, Natasha (orcid: 0000-0003-0635-1998)
Russell, Keith (orcid: 0000-0001-5390-2719)
McCafferty, Siobhann (orcid: 0000-0002-2491-0995)
Ferrers, Richard (orcid: 0000-0002-2923-9889)
McEachern, Steve (orcid: 0000-0001-7848-4912)
Barlow, Melanie (orcid: 0000-0002-3956-5784)
Brady, Catherine (orcid: 0000-0002-7919-7592)
Brownlee, Rowan (orcid: 0000-0002-1955-1262)
Honeyman, Tom (orcid: 0000-0001-9448-4023)
Quiroga, Maria del Mar (orcid: 0000-0002-8943-2808)
training material, FAIR data, video, webinar, activities, quiz, FAIR, research data management