ResFinder 4.0 was validated with datasets consisting of MIC values (BMD or Etest, Table 1) and WGS data (Illumina sequencing) of Escherichia coli, Salmonella spp., Campylobacter jejuni, E. faecium, E. faecalis and S. aureus of different origins (Table 1). These datasets represent a convenience sample. Phenotypic AST results were interpreted using the EUCAST epidemiological cut-off values (ECOFFs) to categorize isolates as WT (MIC ≤ECOFF) and non-WT (MIC >ECOFF) (www.eucast.org). Exceptions were: (i) one S. aureus dataset for which phenotypic AST was performed by disc diffusion and interpreted by EUCAST clinical breakpoints (Table 1); and (ii) one E. coli dataset that consisted of Illumina WGS data only, for which MIC values were available to the data provider but not to the ResFinder 4.0 developers, thus providing a blind test of the tool performance (Table 1). WGS data were obtained as raw reads and processed through a quality control (QC) pipeline as described here:
https://bitbucket.org/genomicepidemiology/foodqcpipeline/. In brief, reads were trimmed using bbduk2 (https://jgi.doe.gov/data-and-tools/bbtools/) to a Phred score of 20, reads shorter than 50 bp were discarded, adapters were trimmed away and a draft de novo assembly was created using SPAdes [21]. From the assemblies, contigs below 500 bp were discarded. The most important parameters used to assess the quality of the sequencing data were: number of bases left after trimming, N50, number of contigs and total size of assembly. QC parameters used as guidelines were: read depth of at least 25×, N50 of >30 000 bp and fewer than 500 contigs.
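As a minimal sketch of how the guideline values above could be checked for a single assembly, the following Python computes N50 from contig lengths after discarding contigs below 500 bp and compares the result with the three stated thresholds. The function names, the input format (a list of contig lengths plus an externally estimated read depth) and the returned dictionary are illustrative assumptions; they are not taken from the foodqcpipeline itself.

```python
def n50(contig_lengths):
    """Return N50: the contig length at which the cumulative sum of
    lengths (sorted longest first) reaches half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


def qc_summary(contig_lengths, read_depth, min_contig_len=500):
    """Hypothetical helper: apply the guideline thresholds from the text.

    contig_lengths : lengths (bp) of all contigs in the draft assembly
    read_depth     : estimated mean read depth (assumed to be computed elsewhere)
    """
    # Contigs below 500 bp are discarded before the metrics are computed.
    kept = [n for n in contig_lengths if n >= min_contig_len]
    summary = {
        "n_contigs": len(kept),
        "assembly_size": sum(kept),
        "n50": n50(kept),
        "read_depth": read_depth,
    }
    # Guidelines: depth of at least 25x, N50 > 30 000 bp, fewer than 500 contigs.
    summary["within_guidelines"] = (
        read_depth >= 25
        and summary["n50"] > 30_000
        and summary["n_contigs"] < 500
    )
    return summary
```

Because the text describes these values as guidelines, a result outside them would in practice prompt closer inspection rather than automatic rejection.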
WGS data (FASTQ) were used as input for ResFinder 4.0 using default parameters (≥80% identity over ≥60% of the length of the target gene) and also for SNP-based phylogenetic analysis as previously described [22] to verify the genetic diversity of the validation datasets. SNP analysis was not performed for the Salmonella spp. dataset, whose diversity had already been described [23]. The ResFinder 4.0 output was analysed to define AMR genotypes, i.e. patterns of resistance determinants observed for each antimicrobial, in each dataset.
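The way AMR genotypes could be derived from a set of hits can be sketched as follows. This is an illustration only: it assumes the hits have already been parsed into dictionaries with the hypothetical keys `isolate`, `gene`, `antimicrobial`, `identity` and `coverage`, which is not the actual ResFinder 4.0 output format, and it covers acquired genes only. The two thresholds are the default values stated above.

```python
from collections import Counter

MIN_IDENTITY = 80.0  # percent identity (default stated in the text)
MIN_COVERAGE = 60.0  # percent of target gene length covered (default stated in the text)


def genotype_patterns(hits):
    """Count how many isolates share each pattern of resistance
    determinants, per antimicrobial.

    hits: iterable of dicts with the hypothetical keys
          'isolate', 'gene', 'antimicrobial', 'identity', 'coverage'.
    Returns a Counter keyed by (antimicrobial, tuple_of_sorted_genes).
    """
    determinants = {}
    for hit in hits:
        # Keep only hits that meet both default thresholds.
        if hit["identity"] >= MIN_IDENTITY and hit["coverage"] >= MIN_COVERAGE:
            key = (hit["isolate"], hit["antimicrobial"])
            determinants.setdefault(key, set()).add(hit["gene"])

    patterns = Counter()
    for (_isolate, antimicrobial), genes in determinants.items():
        patterns[(antimicrobial, tuple(sorted(genes)))] += 1
    return patterns
```

Isolates with no hit for a given antimicrobial do not appear in the result; they would need to be added from the full isolate list if the "no determinant" pattern is to be counted as well.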
Genotype–phenotype concordance was defined as presence or absence of a genetic determinant of resistance to a specific antimicrobial agent in non-WT (nWT) or WT isolates, respectively. Genotype–phenotype discordance was defined either as presence of a relevant AMR determinant in WT isolates or as absence of a relevant AMR determinant in nWT isolates. All discordances were individually analysed.
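The definitions above reduce to a small decision table, sketched here in Python under stated assumptions: the function names and result labels are hypothetical, and the ECOFF passed in must be the EUCAST value for the relevant species and antimicrobial (none are hard-coded here).

```python
def phenotype_from_mic(mic, ecoff):
    """Categorize an isolate as WT (MIC <= ECOFF) or nWT (MIC > ECOFF)."""
    return "WT" if mic <= ecoff else "nWT"


def compare_genotype_phenotype(phenotype, has_determinant):
    """Return the concordance category for one isolate and one antimicrobial.

    phenotype       : 'WT' or 'nWT'
    has_determinant : True if a relevant AMR determinant was detected
    """
    if phenotype not in ("WT", "nWT"):
        raise ValueError("phenotype must be 'WT' or 'nWT'")
    if phenotype == "nWT":
        # Determinant present in an nWT isolate is concordant.
        return "concordant" if has_determinant else "discordant (nWT, no determinant)"
    # Determinant absent in a WT isolate is concordant.
    return "discordant (WT, determinant present)" if has_determinant else "concordant"
```

An isolate with MIC equal to the ECOFF is categorized as WT, so a detected determinant in that isolate would be flagged as a discordance for individual review.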
Sequence data that did not derive from previous studies (Table 1) have been deposited at NCBI (E. coli dataset from Germany: PRJNA616452; E. faecium dataset from Germany: PRJNA625631; E. faecium dataset from Belgium: PRJNA552025; S. aureus dataset from Belgium: PRJNA615176) and in the European Nucleotide Archive (S. aureus dataset from Denmark: PRJEB37586).
Bortolaia V., Kaas R.S., Ruppe E., Roberts M.C., Schwarz S., Cattoir V., Philippon A., Allesoe R.L., Rebelo A.R., Florensa A.F., Fagelhauer L., Chakraborty T., Neumann B., Werner G., Bender J.K., Stingl K., Nguyen M., Coppens J., Xavier B.B., Malhotra-Kumar S., Westh H., Pinholt M., Anjum M.F., Duggett N.A., Kempf I., Nykäsenoja S., Olkkola S., Wieczorek K., Amaro A., Clemente L., Mossong J., Losch S., Ragimbeau C., Lund O., & Aarestrup F.M. (2020). ResFinder 4.0 for predictions of phenotypes from genotypes. Journal of Antimicrobial Chemotherapy, 75(12), 3491–3500.