To evaluate the performance of OrthoPhy on real sequence data, we ran the benchmark tests provided by QfO. Among the QfO tests, those classified as generalized species tree discordance tests evaluate an ortholog data set by the RF distance between the phylogenetic trees of the inferred orthologs and the species tree (the phylogenetic range of the target species varies depending on the test). A lower score therefore indicates higher concordance between the trees constructed from the inferred orthologs and the species tree, which makes these tests suitable for evaluating programs that construct ortholog data sets for phylogenetic analysis of species. The proteome data set “2018_4” was used for the tests; variant data were not included. We ran OrthoPhy with three types of taxonomic information: (1) the analyzed species were divided into the three domains (bacteria, archaea, and eukaryotes), for a total of three groups (three-domain information); (2) eukaryotes were divided into five supergroups (Amoebozoa, Archaeplastida, Excavata, Opisthokonta, and SAR), archaea into two groups (TACK and Euryarchaeota), and bacteria were treated as one group, for a total of eight groups (eukaryotic information); and (3) fungi were divided into three phyla (Ascomycota, Basidiomycota, and Chytridiomycota), the other eukaryotes were divided into five supergroups, and archaea and bacteria were treated as one group each, for a total of 10 groups (fungi information). The phylogenetic trees of the orthologs inferred under conditions (3), (2), and (1) were compared, on the basis of RF distance, with the species trees of fungi, of eukaryotes, and of species from all three domains, respectively.
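As an illustration of the scoring criterion, the RF distance between two trees over the same species can be computed by counting the bipartitions (splits) that occur in one tree but not in the other. The following is a minimal Python sketch under stated assumptions: trees are written as nested tuples of species labels, they are treated as unrooted, and the distance is left unnormalized. The function names and the five-taxon example trees are hypothetical; they are not taken from OrthoPhy or from the QfO benchmark service, whose exact scoring procedure may differ.

```python
def leaf_set(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions of a tree, treated as unrooted."""
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)  # fixed reference leaf used to normalize each split
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        # Store the side of the bipartition that does not contain `ref`.
        side = all_leaves - clade if ref in clade else clade
        # Skip trivial splits (a single leaf against all the others).
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Unnormalized RF distance: number of splits found in only one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-species example (labels are placeholders, not real taxa).
species_tree = (("A", "B"), ("C", ("D", "E")))
gene_tree = (("A", "C"), ("B", ("D", "E")))

print(rf_distance(species_tree, gene_tree))     # 2
print(rf_distance(species_tree, species_tree))  # 0
```

In this example the two trees share only the split that separates D and E from the remaining species, so each tree contributes one unmatched split and the distance is 2; identical trees score 0, which corresponds to the best possible result in a discordance test.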