We have developed a tool called ICEfinder, available online and as a standalone version, for the rapid detection of ICEs and IMEs in bacterial genome sequences. ICEfinder employs a method we call ‘Pattern-based hit co-localization’ (see the Supplementary Methods), which detects the signature sequences of the recombination and conjugation modules using their profile HMMs (19) (Supplementary Tables S2 and S3 and Figure S4). It also searches for the oriT region using the approach proposed by oriTfinder (18). It then co-localizes, filters and groups the corresponding genes. Finally, elements carrying an integrase gene, a relaxase gene and T4SS gene clusters (12,20) are classified as T4SS-type ICEs, while those without a T4SS but with integrase, replication and AICE translocation-related proteins are classified as putative AICEs. Those without a T4SS but with an integrase and a relaxase are tagged as putative IMEs. ICEfinder also attempts to detect a particular class of IMEs that carry an integrase and an oriT but no relaxase. ICEfinder employs ARAGORN (21) with the default parameters to identify the 3′ termini of tRNA/tmRNA genes as putative ICE insertion sites. It also uses Vmatch (http://vmatch.de/) with the default options to detect direct repeats, which mark the tRNA-distal boundaries. Acquired antibiotic resistance genes and virulence factors are identified by NCBI BLASTp (22) with an Ha-value cut-off of 0.64 (12).
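The classification rules described above can be written as a short decision procedure. The following Python sketch is illustrative only and is not taken from the ICEfinder source code: the `RegionSignature` type, the `classify_region` function and the label strings are hypothetical names, and the order in which the rules are checked (AICE before IME when both could apply) is an assumption that the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionSignature:
    """Signature hits found in one co-localized region (hypothetical type)."""
    integrase: bool = False
    relaxase: bool = False
    t4ss: bool = False
    replication: bool = False
    aice_translocation: bool = False
    orit: bool = False


def classify_region(sig: RegionSignature) -> str:
    """Apply the rules from the text; the rule order is an assumption."""
    # Every class described in the text requires an integrase.
    if not sig.integrase:
        return "unclassified"
    # Integrase + relaxase + T4SS gene cluster: T4SS-type ICE.
    if sig.t4ss:
        return "T4SS-type ICE" if sig.relaxase else "unclassified"
    # No T4SS, but replication and AICE translocation proteins: putative AICE.
    if sig.replication and sig.aice_translocation:
        return "putative AICE"
    # No T4SS, but a relaxase: putative IME.
    if sig.relaxase:
        return "putative IME"
    # No T4SS and no relaxase, but an oriT: the particular IME class.
    if sig.orit:
        return "putative IME (oriT, no relaxase)"
    return "unclassified"


if __name__ == "__main__":
    region = RegionSignature(integrase=True, relaxase=True, t4ss=True)
    print(classify_region(region))
```

A region with an integrase, a relaxase and a T4SS cluster is labelled a T4SS-type ICE, whereas the same region without the T4SS cluster falls through to the putative IME rule.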
The ICEfinder online tool allows users to submit a GenBank file containing a nucleotide sequence and its annotation as a query. A raw nucleotide sequence in FASTA format is also accepted; it is first annotated using our gene annotation tool CDSeasy (12), and the annotated sequence is then used as the input for ICE detection. ICEfinder uses the CGView circular genome visualization tool (23) to display the distribution of the predicted T4SS-type ICEs, IMEs and AICEs in the query bacterial genome. In addition, ICEfinder has a comparison module (Supplementary Figure S5) that aligns the identified ICE loci against the ICEberg-archived ICEs using MultiGeneBlast (24).
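The two accepted input types can be told apart by their first non-blank line: a FASTA record begins with `>` and a GenBank flat file begins with a `LOCUS` line. The sketch below shows one way such a dispatch could be done. How ICEfinder itself recognizes the input format is not described here, so the function names `detect_input_format` and `needs_annotation` are hypothetical, and the sketch assumes that only FASTA input needs the annotation step.

```python
def detect_input_format(text: str) -> str:
    """Return 'fasta' or 'genbank' based on the first non-blank line."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("LOCUS"):
            return "genbank"
        break  # first non-blank line matches neither format
    raise ValueError("input is neither FASTA nor GenBank")


def needs_annotation(text: str) -> bool:
    """Assumption: raw FASTA input must be annotated before ICE detection."""
    return detect_input_format(text) == "fasta"


if __name__ == "__main__":
    print(needs_annotation(">contig_1\nATGCGTACGTTAG\n"))
```

Checking only the first non-blank line keeps the test cheap on whole-genome files; a production pipeline would more likely rely on a full parser to validate the input.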