Decoding the grammar of DNA using Natural Language Processing
5 - 7 December 2023
Pensacola, United States of America
https://www.k-cap.org/2023/tutorials.html
https://dresa.org.au/events/decoding-the-grammar-of-dna-using-natural-language-processing

**This workshop is jointly hosted by [Sonika Tyagi Lab](https://bioinformaticslab.erc.monash.edu/) and [Australian BioCommons](https://www.biocommons.org.au/).**

Abstract: DNA is the blueprint defining all living organisms. Therefore, understanding the nature and function of DNA is at the core of all biological studies. Rapid advances in DNA sequencing and computing technologies over the past few decades have resulted in large quantities of DNA sequence data being generated for diverse experiments, with growth exceeding that of all major social media platforms and astronomy data combined [1]. However, biological data is both complex and high-dimensional, and is difficult to analyse with conventional methods. Machine learning is naturally well suited to problems with a large volume of data and complexity [2]. In particular, applying Natural Language Processing to the genome is intuitive, since DNA is a natural language. Genome-NLP poses unique challenges compared with natural languages, including the difficulty of word segmentation and corpus comparison. To tackle these challenges, we developed the first automated and open-source genomeNLP workflow that enables efficient and accurate knowledge extraction on biological data [1], automating and abstracting preprocessing steps unique to biology. This lowers the barrier to knowledge extraction for both machine learning practitioners and computational biologists. In this tutorial, we will demonstrate how our workflow can be used to address the above challenges, with implications in fields such as personalised medicine [3-4].

[1] [preprint] Chen, T., Tyagi, N., Chauhan, S., Peleg, A.Y. and Tyagi, S., 2023. genomicBERT and data-free deep-learning model evaluation. bioRxiv, pp.2023-05. [https://doi.org/10.1101/2023.05.31.542682](https://doi.org/10.1101/2023.05.31.542682) (This article is a preprint and has not been certified by peer review)

[2] Chen, T., Tyagi, S. Integrative computational epigenomics to build data-driven gene regulation hypotheses, GigaScience, Volume 9, Issue 6, June 2020, giaa064, [https://doi.org/10.1093/gigascience/giaa064](https://doi.org/10.1093/gigascience/giaa064)

[3] Chen, T., Philip, M., Lê Cao, K-A., Tyagi, S. A multi-modal data harmonisation approach for discovery of COVID-19 drug targets, Briefings in Bioinformatics, Volume 22, Issue 6, November 2021, bbab185, [https://doi.org/10.1093/bib/bbab185](https://doi.org/10.1093/bib/bbab185)

[4] Mu, A., Klare, W.P., Baines, S.L. et al. Integrative omics identifies conserved and pathogen-specific responses of sepsis-causing bacteria. Nat Commun 14, 1530 (2023). [https://doi.org/10.1038/s41467-023-37200-w](https://doi.org/10.1038/s41467-023-37200-w)

2023-12-05 09:00:00 UTC to 2023-12-07 17:00:00 UTC
The Twelfth International Conference on Knowledge Capture
Pensacola, United States of America
ACM Special Interest Group on Artificial Intelligence; Florida Institute for Human & Machine Cognition (IHMC)
tyrone.chen@monash.edu, sonika.tyagi@rmit.edu.au, christina.hall@biocommons.org.au, melissa.burke@biocommons.org.au
Bioinformaticians, Life scientists, Data scientists
40
workshop, conference
by_invitation
Deep learning, Machine learning, Natural language processing, Bioinformatics, Computational Biology