TY - JOUR
T1 - Towards reproducible metabarcoding data
T2 - lessons from an international cross‐laboratory experiment
AU - Zaiko, Anastasija
AU - Greenfield, Paul
AU - Abbott, Cathryn
AU - von Ammon, Ulla
AU - Bilewitch, Jaret
AU - Bunce, Michael
AU - Cristescu, Melania E.
AU - Chariton, Anthony
AU - Dowle, Eddy
AU - Geller, Jonathan
AU - Gutierrez, Alba Ardura
AU - Hajibabaei, Mehrdad
AU - Haggard, Emmet
AU - Inglis, Graeme J.
AU - Lavery, Shane D.
AU - Samuiloviene, Aurelija
AU - Simpson, Tiffany
AU - Stat, Michael
AU - Stephenson, Sarah
AU - Sutherland, Judy
AU - Thakur, Vibha
AU - Westfall, Kristen
AU - Wood, Susanna A.
AU - Wright, Michael
AU - Zhang, Guang
AU - Pochon, Xavier
PY - 2022/2
Y1 - 2022/2
AB - Advances in high-throughput sequencing (HTS) are revolutionizing monitoring in marine environments by enabling rapid, accurate and holistic detection of species within complex biological samples. Research institutions worldwide increasingly employ HTS methods for biodiversity assessments. However, variance in laboratory procedures, analytical workflows and bioinformatic pipelines impedes the transferability and comparability of results across research groups. An international experiment was conducted to assess the consistency of metabarcoding results derived from identical samples and primer sets using varying laboratory procedures. Homogenized biofouling samples collected from four coastal locations (Australia, Canada, New Zealand and the USA) were distributed to 12 independent laboratories. Participants were asked to follow one of two HTS library preparation workflows. While DNA extraction, primers and bioinformatic analyses were purposefully standardized to allow comparison, many other technical variables were allowed to vary among laboratories (amplification protocols, type of instrument used, etc.). Despite substantial variation observed in raw results, the primary signal in the data was consistent, with the samples grouping strongly by geographical origin for all data sets. Simple post hoc data clean-up by removing low-quality samples gave the best improvement in sample classification for nuclear 18S rRNA gene data, with an overall 92.81% correct group attribution. For mitochondrial COI gene data, the best classification result (95.58%) was achieved after correction for contamination errors. The identified critical methodological factors that introduced the greatest variability (preservation buffer, sample defrosting, template concentration, DNA polymerase, PCR enhancer) should be of great assistance in standardizing future biodiversity studies using metabarcoding.
KW - 18S ribosomal rRNA (18S rRNA)
KW - high-throughput sequencing
KW - metabarcoding
KW - mitochondrial cytochrome c oxidase subunit 1 (COI)
KW - reproducibility
KW - standardization
UR - http://www.scopus.com/inward/record.url?scp=85113827516&partnerID=8YFLogxK
U2 - 10.1111/1755-0998.13485
DO - 10.1111/1755-0998.13485
M3 - Article
C2 - 34398515
SN - 1755-098X
VL - 22
SP - 519
EP - 538
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
IS - 2
ER -