NCBI protein annotations (RefVirus) were first curated automatically (e.g. removal of upper case except in gene names, correction of typos). Phages whose protein annotations consisted only of an uninformative list (‘hypothetical protein GP1’, ‘hypothetical protein GP2’, etc.) were considered unannotated. The curated annotations were then combined to determine an annotation for each PHROG. To refine this annotation, the 38 880 PHROGs were compared to several databases: PHROG profiles were compared to Pfam domains (version of January 2018; 19) and UNICLUST (20), and individual viral proteins were compared to proteins in KEGG Orthologous groups (KOs; version of January 2018) using MMseq (bit-score > 50, coverage > 50%). Manual curation of the collected annotations and similarities then yielded a single annotation per PHROG.
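As an illustration only, the two mechanical criteria in this paragraph (flagging a phage as unannotated, and keeping hits above the quoted bit-score and coverage thresholds) could be sketched in Python as follows. The regular expression, the `Hit` fields, the function names and the choice to compute coverage over the query sequence are assumptions made for this sketch; the text does not specify them, and this is not the pipeline that was actually used.

```python
import re
from dataclasses import dataclass

# Assumed pattern for an uninformative annotation such as
# "hypothetical protein GP1"; the real curation rules may differ.
UNINFORMATIVE = re.compile(r"^hypothetical protein(\s+gp\d+)?$", re.IGNORECASE)


def is_unannotated(annotations):
    """Return True if every protein annotation of a phage is uninformative."""
    return all(UNINFORMATIVE.match(a.strip()) is not None for a in annotations)


@dataclass
class Hit:
    """Hypothetical record for one protein-vs-database similarity hit."""
    query: str
    target: str
    bit_score: float
    aligned_length: int
    query_length: int


def passes_thresholds(hit, min_bits=50.0, min_cov=0.5):
    """Keep a hit only if bit-score > 50 and coverage > 50%.

    Coverage is assumed here to be aligned length / query length.
    """
    coverage = hit.aligned_length / hit.query_length
    return hit.bit_score > min_bits and coverage > min_cov


# Usage example with made-up values.
phage_a = ["hypothetical protein GP1", "hypothetical protein GP2"]
phage_b = ["hypothetical protein GP1", "major capsid protein"]
unannotated_a = is_unannotated(phage_a)
unannotated_b = is_unannotated(phage_b)

hits = [
    Hit("prot1", "KO_x", 62.0, 120, 200),  # coverage 0.60
    Hit("prot2", "KO_y", 80.0, 90, 200),   # coverage 0.45
    Hit("prot3", "KO_z", 50.0, 180, 200),  # bit-score not strictly above 50
]
kept = [h.query for h in hits if passes_thresholds(h)]
```

Both thresholds are applied as strict inequalities, matching the ">" signs in the text; hits that sit exactly on a threshold are discarded.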