[Presentation in French!]
The estimation of biological diversity is an emerging research topic in genetic analysis. The goal of metagenomics is to analyze the complex communities of micro-organisms found everywhere from the bottom of the deep sea to the inside of our own bodies. Estimating eukaryotic richness and diversity in the environment has become possible through the advent of next-generation DNA sequencing (NGS) for metagenetics. A new approach, called "barcoding", showed that species can be accurately identified and distinguished through very small fragments of their DNA. These hypervariable regions can be targeted with specific primers that amplify them, usually within the small subunit (SSU) 16S or 18S rRNA gene.
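To make "richness" and "diversity" concrete, here is a minimal sketch of two standard measures computed from per-taxon read counts: the Shannon index and the bias-corrected Chao1 richness estimator. It is only an illustration and not the pipeline presented in this talk; the taxon labels and counts are hypothetical.

```python
import math
from collections import Counter


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 is the number of taxa seen exactly once (singletons) and
    F2 the number seen exactly twice (doubletons).
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical taxonomic assignments of barcode reads (illustration only).
assignments = (
    ["taxon_a"] * 50
    + ["taxon_b"] * 30
    + ["taxon_c"] * 2
    + ["taxon_d", "taxon_e", "taxon_f"]
)
abundances = list(Counter(assignments).values())

print("observed taxa:", len(abundances))
print("Chao1 estimate:", chao1(abundances))
print("Shannon index:", round(shannon_index(abundances), 3))
```

Chao1 extrapolates from rare taxa, so any artifact that creates spurious singletons (sequencing errors, chimeras) pushes the estimate upward.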
Recent advances in NGS technology offer unprecedented throughput and coverage of genetic material. However, the amount of data generated by such platforms raises algorithmic issues, since sequence datasets now average around a billion reads. After several years of research in metagenomics, we are starting to uncover the tip of an unprecedented picture of eukaryotic diversity. However, this explosion of research came with a plethora of approaches, a variety of methodologies and heterogeneous concepts of species. This deluge of data has hindered the possibility of obtaining an authoritative view on the evaluation of biological diversity. The first studies in this field revealed an incredible diversity that could range over thousands of different microbial species living in our bodies. However, not only can the size and depth of sequencing lead to an over-estimation of diversity, but a plethora of biases also affects the results at every stage of the genetic analysis:
In-vivo biases: intra-genomic variations and survival of extra-cellular DNA (eDNA)
In-vitro biases: laboratory contaminations, PCR-related biases, chimeras, primer dimers
In-silico biases: read quality, sequencing errors, primer mismatches
Biochemical transformation of nucleotides: because we are currently analyzing ancient DNA that can be tens of thousands of years old, the nucleotides can undergo chemical changes that transform one base into another.
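As a toy illustration of how one of these biases inflates diversity, the sketch below compares a naive count of distinct reads with a greedy, abundance-ordered clustering that absorbs reads within one mismatch of a more abundant sequence. This is a generic technique chosen for illustration, not the method presented in this talk; the reads, function names and one-mismatch threshold are all hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def naive_richness(reads):
    """Count every distinct read as its own taxon."""
    return len(set(reads))


def greedy_cluster_richness(reads, max_mismatches=1):
    """Greedy abundance-ordered clustering.

    Unique sequences are visited from most to least abundant. A sequence
    becomes a new centroid only if no existing centroid of the same length
    lies within max_mismatches of it.
    """
    centroids = []
    for seq, _count in Counter(reads).most_common():
        close = any(
            len(seq) == len(c) and hamming(seq, c) <= max_mismatches
            for c in centroids
        )
        if not close:
            centroids.append(seq)
    return len(centroids)


# Hypothetical reads: two true sequences plus three single-base error variants.
reads = (
    ["ACGTACGTAC"] * 40
    + ["TTGACCGGTA"] * 25
    + ["ACGTACGTAT", "ACGAACGTAC", "TTGACCGGTT"]
)

print("naive richness:", naive_richness(reads))               # 5
print("clustered richness:", greedy_cluster_richness(reads))  # 2
```

The same threshold would also merge genuine variants that differ by a single base, so it trades over-estimation for a risk of under-estimation.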
We delineate these biases and their effects on genetic sequence analyses. First, we present our advances in devising an ultra-fast diversity estimation pipeline and show the superiority of our approach. We then present the implications of our results for the biological interpretation of diversity. A recent ultra-deep sequencing of the Sea of Japan revealed that most of the diversity found in DNA analyses was in fact biased by eDNA. This indicates that strands of DNA from dead species accumulate in the deep sea and can persist there for very long periods of time. This stresses the importance of developing rRNA sequencing, which requires harder and much more tedious laboratory protocols.
Finally, we open the debate on the epistemological differences between exploratory and hypothesis-driven research in this field. We raise the question of how to quantify the unknown and find new species without any prior knowledge of their underlying sequences. The worst thing not to know is what is not known to be unknown.
July 28, 2022 00:53:05
July 28, 2022 00:50:15