Depending on the database structure and the sequence similarity between reference sequences, the naïve LCA algorithm [36] will assign reads to different taxonomic units. To examine how reads are assigned to the taxonomic tree for 33 bacterial pathogens (Additional file 1: Table S2), we simulated ancient pathogen DNA reads using gargammel [50] and spiked them into five ancient metagenomic background datasets obtained from bone, dentine, dental calculus, and soil (Table 1). The simulated reads carry a unique identifier in their header to differentiate them from the metagenomic background sequences. The background sequences exhibit either full damage patterns or attenuated damage patterns following UDG-half treatment [51]. To simulate aDNA damage in the pathogen sequences, we applied damage profiles obtained from previously published ancient Yersinia pestis genomes with [13] and without [18] UDG-half treatment. Simulated reads were processed with the NGS data processing pipeline EAGER [52] and spiked into the metagenomic backgrounds in three different amounts (50, 500, or 5000 reads). For each metagenomic background, a typical screening sequencing depth of five million reads was used.
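To illustrate why database structure and reference similarity change the assignment, the following is a minimal sketch of a naïve LCA in Python. It is not the implementation cited in [36]: the toy taxonomy `PARENT` and the helper names `lineage` and `naive_lca` are hypothetical and chosen only for illustration. A read is placed on the deepest node shared by the lineages of all references it matches.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); not the database used in the study.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Yersinia": "Bacteria",
    "Yersinia pestis": "Yersinia",
    "Yersinia pseudotuberculosis": "Yersinia",
    "Escherichia": "Bacteria",
    "Escherichia coli": "Escherichia",
}


def lineage(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to `taxon` (root first)."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def naive_lca(hits: List[str], parent: Dict[str, Optional[str]]) -> str:
    """Return the lowest common ancestor of all taxa a read matched."""
    if not hits:
        raise ValueError("a read needs at least one hit to be assigned")
    lineages = [lineage(taxon, parent) for taxon in hits]
    lca = lineages[0][0]  # every lineage starts at the root
    # Walk down level by level until the lineages disagree.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


# A read matching only one reference stays at species level.
print(naive_lca(["Yersinia pestis"], PARENT))
# A read matching two closely related references moves up to the genus.
print(naive_lca(["Yersinia pestis", "Yersinia pseudotuberculosis"], PARENT))
```

Under these assumptions, a read that matches equally well to two related species is assigned to their shared genus, so the same read can land on different nodes depending on which references the database contains.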