WEBINAR: Pro tips for scaling bioinformatics workflows to HPC
https://zenodo.org/records/8008227
https://dresa.org.au/materials/webinar-pro-tips-for-scaling-bioinformatics-workflows-to-hpc-9f2a8b90-88da-433b-83b2-b1ab262dd9df
This record includes training materials associated with the Australian BioCommons webinar ‘Pro tips for scaling bioinformatics workflows to HPC’. This webinar took place on 31 May 2023.
Event description
High Performance Computing (HPC) infrastructures offer the computational scale and efficiency that life scientists need to handle complex biological datasets and multi-step computational workflows. But scaling workflows to HPC from smaller, more familiar computational infrastructures brings with it new jargon, expectations, and processes to learn. To make the most of HPC resources, bioinformatics workflows need to be designed for distributed computing environments and must carefully manage varying resource requirements and the scale of biological data.
In this webinar, Dr Georgina Samaha from the Sydney Informatics Hub, Dr Matthew Downton from the National Computational Infrastructure (NCI) and Dr Sarah Beecroft from the Pawsey Supercomputing Research Centre help you navigate the world of HPC for running and developing bioinformatics workflows. They explain when you should take your workflows to HPC and highlight the architectural features you should make the most of to scale your analyses once you’re there. You’ll hear pro tips for dealing with common pain points like software installation, optimising for parallel computing and resource management, and you’ll find out how to get access to Australia’s national HPC infrastructures at NCI and Pawsey.
Materials
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
Event metadata (PDF): Information about the event, including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
Pro-tips_HPC_Slides: A PDF copy of the slides presented during the webinar.
Materials shared elsewhere:
A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/YKJDRXCmGMo
Melissa Burke (melissa@biocommons.org.au)
Samaha, Georgina (orcid: 0000-0003-0419-1476)
Beecroft, Sarah (orcid: 0000-0002-3935-2279)
Downton, Matthew (orcid: 0000-0002-4693-1965)
Keywords: Bioinformatics, Workflows, HPC, High Performance Computing
WORKSHOP: R: fundamental skills for biologists
https://zenodo.org/records/6766951
https://dresa.org.au/materials/workshop-r-fundamental-skills-for-biologists-81aa00db-63ad-4962-a7ac-b885bf9f676b
This record includes training materials associated with the Australian BioCommons workshop ‘R: fundamental skills for biologists’. This workshop took place over four three-hour sessions on 1, 8, 15 and 22 June 2022.
Event description
Biologists need data analysis skills to be able to interpret, visualise and communicate their research results. While Excel can cover some data analysis needs, there is a better choice, particularly for large and complex datasets.
R is free, open-source software and a programming language that enables data exploration, statistical analysis, visualisation and more. The large variety of R packages available for analysing biological data makes it a robust and flexible option for data of all shapes and sizes.
Getting started can be a little daunting for those without a background in statistics and programming. In this workshop we will equip you with the foundations for getting the most out of R and RStudio, an interactive way of structuring and keeping track of your work in R. Using biological data from a model of influenza infection, you will learn how to efficiently and reproducibly organise, read, wrangle, analyse, visualise and generate reports from your data in R.
Topics covered in this workshop include:
Spreadsheets, organising data and first steps with R
Manipulating and analysing data with dplyr
Data visualisation
Summarized experiments and getting started with Bioconductor
This workshop is presented by the Australian BioCommons and Saskia Freytag from WEHI with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
Event metadata (PDF): Information about the event, including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
Schedule (PDF): A breakdown of the topics and timings for the workshop.
Recommended resources (PDF): A list of resources recommended by trainers and participants.
Q_and_A (PDF): Archive of questions and their answers from the workshop Slack Channel.
Materials shared elsewhere:
This workshop follows the tutorial ‘Introduction to data analysis with R and Bioconductor’, which is publicly available: https://saskiafreytag.github.io/biocommons-r-intro/
The tutorial is derived from material produced as part of The Carpentries Incubator project: https://carpentries-incubator.github.io/bioc-intro/
Melissa Burke (melissa@biocommons.org.au)
Freytag, Saskia (orcid: 0000-0002-2185-7068)
Barugahare, Adele (orcid: 0000-0002-8976-0094)
Doyle, Maria
Ansell, Brendan (orcid: 0000-0003-0297-897X)
Varshney, Akriti
Bourke, Caitlin (orcid: 0000-0002-4466-6563)
Conradsen, Cara (orcid: 0000-0001-9797-3412)
Jung, Chol-Hee (orcid: 0000-0002-2992-3162)
Sandoval, Claudia
Chandrananda, Dineika (orcid: 0000-0002-8834-9500)
Zhang, Eden (orcid: 0000-0003-0294-3734)
Rosello, Fernando (orcid: 0000-0003-3885-8777)
Iacono, Giulia (orcid: 0000-0002-1527-0754)
Tarasova, Ilariya (orcid: 0000-0002-0895-9385)
Chung, Jessica (orcid: 0000-0002-0627-0955)
Moffet, Joel
Gustafsson, Johan (orcid: 0000-0002-2977-5032)
Ding, Ke
Feher, Kristen
Perlaza-Jimenez, Laura (orcid: 0000-0002-8511-1134)
Crowe, Mark (orcid: 0000-0002-9514-2487)
Ma, Mengyao
Kandhari, Nitika (orcid: 0000-0002-0261-727X)
Williams, Sarah
Nelson, Tiffanie (orcid: 0000-0002-5341-312X)
Schreiber, Veronika (orcid: 0000-0001-6088-7828)
Pinzon Perez, William
Keywords: Bioinformatics, Analysis, Statistics, R software, RStudio, Data visualisation