The PRL2022 genome, together with the genomes of B. longum subsp. longum strains currently included in a variety of commercially available probiotic products (Tarracchini et al., 2022; Supplementary Table S3), was analyzed with the pangenome analysis pipeline PGAP (Zhao et al., 2012b). The predicted proteome of each B. longum subsp. longum strain was screened for orthologous genes against the proteomes of the other B. longum subsp. longum strains considered, employing BLAST analysis (Altschul et al., 1990) with an e-value cutoff of 1 × 10⁻¹⁰ and requiring at least 50% identity across at least 80% of both protein sequences. The resulting data were then clustered into protein families, i.e., clusters of orthologous genes (COGs), using the Markov clustering algorithm (Enright et al., 2002) by means of the gene family (GF) method. Based on the presence/absence matrix encompassing all COGs identified in the analyzed genomes, unique genes, i.e., those present in the PRL2022 genome and absent from the other 10 genomic sequences considered, were identified. Functional annotation of each unique gene was performed using the eggNOG database (Huerta-Cepas et al., 2016).
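The filtering and unique-gene steps described above can be illustrated with a minimal Python sketch. It is not PGAP's own code: the `Hit` record, the function names, and the toy genome labels are hypothetical, and coverage is approximated here as alignment length divided by protein length, which is an assumption rather than a detail given in the text.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds taken from the text above.
MAX_EVALUE = 1e-10
MIN_IDENTITY = 50.0   # percent identity
MIN_COVERAGE = 0.80   # fraction of each protein's length


@dataclass
class Hit:
    """One pairwise protein alignment (hypothetical record, not PGAP's format)."""
    query: str
    subject: str
    evalue: float
    identity: float    # percent identity
    align_len: int     # aligned length in residues
    query_len: int
    subject_len: int


def passes_thresholds(hit: Hit) -> bool:
    """Apply the e-value, identity, and two-sided coverage cutoffs."""
    # Assumption: coverage = alignment length / full protein length.
    query_cov = hit.align_len / hit.query_len
    subject_cov = hit.align_len / hit.subject_len
    return (
        hit.evalue <= MAX_EVALUE
        and hit.identity >= MIN_IDENTITY
        and query_cov >= MIN_COVERAGE
        and subject_cov >= MIN_COVERAGE
    )


def unique_clusters(matrix: dict[str, dict[str, bool]], target: str) -> list[str]:
    """Return clusters present in `target` and absent from every other genome.

    `matrix` maps cluster id -> {genome name -> present?}.
    """
    unique = []
    for cluster, presence in matrix.items():
        others = [p for genome, p in presence.items() if genome != target]
        if presence.get(target, False) and not any(others):
            unique.append(cluster)
    return sorted(unique)
```

Given a presence/absence matrix built from the clustered families, calling `unique_clusters(matrix, "PRL2022")` would return the identifiers of families found only in PRL2022, which could then be passed on to functional annotation.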