We built a protein profile for the region specific to the integron tyrosine recombinase. For this, we retrieved the 402 IntI homologs from Supplementary file 11 of Cambray et al. (39). These proteins were clustered using uclust 3.0.617 (40) with a threshold of 90% identity to remove very closely related proteins (the largest homolog was kept in each cluster). The resulting 79 proteins were aligned using MAFFT (41) (--globalpair --maxiterate 1000). The position of the specific region of the V. cholerae integron integrase was mapped onto the multiple alignment using the coordinates of the specific region taken from (17). We extracted this section of the multiple alignment to produce a protein profile with hmmbuild from the HMMer suite version 3.1b1 (42). This profile was named intI_Cterm (Supplementary File 2).
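The mapping step described above, from residue coordinates in one reference protein to columns of the multiple alignment, can be illustrated with a minimal Python sketch. It is not the code used for this work: the function names, the toy alignment, and the coordinates 3 to 5 are hypothetical, and the alignment is assumed to be already loaded as a dictionary of equal-length gapped strings.

```python
def residue_to_column(aligned_ref, gap_chars="-."):
    """Return a list whose i-th entry is the alignment column (0-based)
    holding the i-th residue (0-based) of the reference sequence."""
    return [col for col, ch in enumerate(aligned_ref) if ch not in gap_chars]


def slice_alignment(alignment, ref_id, start, end):
    """Keep the alignment columns spanning residues start..end
    (1-based, inclusive, ungapped coordinates) of the reference sequence."""
    cols = residue_to_column(alignment[ref_id])
    if not (1 <= start <= end <= len(cols)):
        raise ValueError("coordinates fall outside the reference sequence")
    first, last = cols[start - 1], cols[end - 1]
    # Every sequence is cut at the same columns, so the result is still aligned.
    return {name: seq[first:last + 1] for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, not real IntI proteins).
toy_alignment = {
    "ref":   "MK-AL--GT",
    "other": "MKQALPPGT",
}

# Residues 3 to 5 of "ref" (A, L, G) sit in alignment columns 3 to 7.
sub_alignment = slice_alignment(toy_alignment, "ref", 3, 5)
for name, seq in sub_alignment.items():
    print(name, seq)
```

The extracted columns would then be written to an alignment file and passed to hmmbuild; that step is left out here because it depends on the file format chosen.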
We used 119 protein profiles of the Resfams database (core version v1.1, last accessed on January 20, 2015) to search for genes conferring resistance to antibiotics (http://www.dantaslab.org/resfams) (43). We retrieved from PFAM the generic protein profile for the tyrosine recombinases (PF00589, phage_integrase, http://pfam.xfam.org/) (44). All the protein profiles were searched using hmmsearch from the HMMer suite version 3.1b1. Hits were regarded as significant when their e-value was smaller than 0.001 and their alignment covered at least 50% of the profile.
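The significance rule stated above (e-value below 0.001 and alignment covering at least 50% of the profile) can be sketched in Python as follows. This is an illustration under stated assumptions, not the original pipeline: the `Hit` record, its field names, the profile lengths, and the example hits are hypothetical, and the text does not say whether the full-sequence or the per-domain e-value was used, so the sketch takes a single `evalue` field. Coverage is computed here as the aligned span on the profile divided by the profile length.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    target: str       # protein that was hit
    profile: str      # name of the protein profile
    profile_len: int  # number of match states in the profile
    hmm_from: int     # first aligned profile position (1-based)
    hmm_to: int       # last aligned profile position (1-based, inclusive)
    evalue: float


def profile_coverage(hit):
    """Fraction of the profile spanned by the alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / hit.profile_len


def is_significant(hit, max_evalue=1e-3, min_coverage=0.5):
    """Apply both thresholds: e-value < 0.001 and coverage >= 50%."""
    return hit.evalue < max_evalue and profile_coverage(hit) >= min_coverage


# Made-up hits for illustration only; profile lengths are not the real ones.
hits = [
    Hit("protA", "intI_Cterm", 60, 5, 58, 1e-20),  # passes both thresholds
    Hit("protB", "PF00589", 170, 1, 60, 1e-10),    # covers too little of the profile
    Hit("protC", "PF00589", 170, 10, 160, 0.01),   # e-value too large
]

significant = [h.target for h in hits if is_significant(h)]
print(significant)
```

In practice the coordinates and e-values would be read from hmmsearch tabular output; parsing is omitted because it depends on the output option used.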