Benchmarking and Acceleration of Machine Learning and Analytics Pipelines for Large Microbiome Datasets

Author : George Wesley Armstrong
Release : 2022
OCLC : 1334848858

Book Synopsis: Benchmarking and Acceleration of Machine Learning and Analytics Pipelines for Large Microbiome Datasets by George Wesley Armstrong

Written by George Wesley Armstrong and released in 2022. Book excerpt:

Within the past decade, the number of publicly available microbiome sequencing samples has increased dramatically. Consequently, bottlenecks have arisen in common analysis steps, such as processing the sequencing data and characterizing the content of the microbial communities. Over this timespan, new tools have also been developed for steps such as alignment and dimensionality reduction that scale better or handle the additional complexity of high-dimensional data. However, their behavior on microbiome data had not previously been characterized. In this dissertation, we accelerate the analysis of microbiomes by introducing new methods or benchmarking alternatives. Additionally, we compare the results of novel methodology to existing best practices on gold-standard datasets to determine whether the methods adequately address the specific challenges of microbiome data.

In the first part of this work, Chapter 1 reviews many aspects of microbiome data that necessitate the use of microbiome-specific techniques for analyzing collections of microbial communities. Chapter 2 then introduces SFPhD, a novel approach for calculating phylogenetic alpha diversity that leverages the characteristics of microbiome data to speed up and reduce the memory requirements of a costly single-sample characterization.

In the second part of the work, we apply recently developed tools for machine learning and sequencing pre-processing to demonstrate their potential for elucidating complex relationships in microbial data and for reducing the lead time for supporting clinical applications of metagenomic sequencing, respectively. Chapter 3 demonstrates how Uniform Manifold Approximation and Projection (UMAP) provides succinct representations of data relative to the long-standing standard method of microbial ecology, Principal Coordinates Analysis (PCoA). Importantly, UMAP provides different guarantees about the preservation of local and global geometry in its representation, so careful consideration should be given to its application. In Chapter 4, we show that the popular metagenomic preprocessing pipeline of Atropos for adapter trimming and Bowtie2 for host filtering can be replaced by a substantially faster combination of Fastp and Minimap2, respectively. Furthermore, the new pipeline produces results comparable to those of the original pipeline.
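
For readers unfamiliar with the phylogenetic alpha diversity mentioned for Chapter 2, the quantity can be illustrated with a minimal sketch of Faith's phylogenetic diversity, a widely used measure of this kind: the total length of the branches connecting a sample's observed taxa to the root of a phylogenetic tree. This is a naive per-sample illustration, not the SFPhD algorithm from the dissertation. The tree, taxon names, branch lengths, counts, and the `faith_pd` helper below are all hypothetical.

```python
# Toy rooted tree: each node maps to (parent, length of the branch to that parent).
# Node names and branch lengths are invented for illustration only.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "D": ("n2", 1.5),
    "n1": ("root", 0.5),
    "n2": ("root", 1.0),
}


def faith_pd(tree, observed_tips):
    """Sum the lengths of all branches on the paths from observed tips to the root."""
    visited = set()  # nodes whose branch to the parent has already been counted
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Climb toward the root; stop early once we reach a branch counted before.
        while node in tree and node not in visited:
            parent, length = tree[node]
            visited.add(node)
            total += length
            node = parent
    return total


# Hypothetical feature counts for one sample; only taxa with nonzero counts are observed.
sample_counts = {"A": 12, "B": 0, "C": 3, "D": 0}
observed = [taxon for taxon, count in sample_counts.items() if count > 0]
pd_value = faith_pd(TREE, observed)
print(pd_value)
```

Each shared branch is counted once, so a sample containing closely related taxa scores lower than one containing the same number of distantly related taxa. Repeating this traversal independently for every sample is the kind of cost that grows with dataset size.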


Benchmarking and Acceleration of Machine Learning and Analytics Pipelines for Large Microbiome Datasets Related Books

Benchmarking Continuous Phenotype Prediction with Multi-omic Microbiome Data
Language: en
Pages: 32
Authors: Patrick Imran McGrath
Categories:
Type: BOOK - Published: 2021 - Publisher:

Large-scale microbiome datasets from 16S amplicon sequencing provide opportunities for building predictive models with supervised machine learning to answer que
WIPO Technology Trends 2019 - Artificial Intelligence
Language: en
Pages: 156
Authors: World Intellectual Property Organization
Categories: Law
Type: BOOK - Published: 2019-01-21 - Publisher: WIPO

The first report in a new flagship series, WIPO Technology Trends, aims to shed light on the trends in innovation in artificial intelligence since the field fir
Microbial Environmental Genomics (MEG)
Language: en
Pages: 370
Authors: Francis Martin
Categories: Science
Type: BOOK - Published: 2022-12-15 - Publisher: Springer Nature

This volume guides researchers on how to characterize, image rare, and hitherto unknown taxa and their interactions, to identify new functions and biomolecules
Pattern Recognition and Machine Learning
Language: en
Pages: 0
Authors: Christopher M. Bishop
Categories: Computers
Type: BOOK - Published: 2016-08-23 - Publisher: Springer

This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approxi