optiCall uses deviation from Hardy–Weinberg equilibrium (HWE) as an indicator of clustering quality. HWE is tested with a χ² test unless the sample size is small (an expected count under HWE of <50 for any genotype, or an allele count of <100 for either allele), in which case an exact test is used (Wigginton et al., 2005). SNPs with an HWE P-value below a given threshold (P < 5×10⁻¹⁵ by default) are deemed to be poorly called. optiCall attempts to improve the genotype calls at these SNPs by running the Student's t-based mixture model again, this time omitting the SNP-wise and sample-wise prior. This rescue step is primarily intended to give better genotype calls at SNPs where the genotype intensity clouds lie outside the expected regions defined by the within-sample and across-sample prior. The statistical model is as described in (1) and (2), with the intensity values first transformed according to (8) to improve calling of SNPs with shifted intensities (Teo et al., 2007).
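The test selection and the rescue trigger described above can be sketched as follows. This is a minimal illustration and not the optiCall source: the function names are hypothetical, the exact test of Wigginton et al. (2005) is not implemented, and the χ² statistic is assumed to be the usual one-degree-of-freedom goodness-of-fit statistic on the three genotype counts.

```python
import math


def hwe_expected_counts(n_aa, n_ab, n_bb):
    """Expected genotype counts (AA, AB, BB) under HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    return n * p * p, 2 * n * p * q, n * q * q


def needs_exact_test(n_aa, n_ab, n_bb, min_expected=50, min_allele=100):
    """True when the sample is too small for the chi-squared test."""
    count_a = 2 * n_aa + n_ab
    count_b = 2 * n_bb + n_ab
    if min(count_a, count_b) < min_allele:
        return True
    return min(hwe_expected_counts(n_aa, n_ab, n_bb)) < min_expected


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of the 1 d.f. chi-squared goodness-of-fit test for HWE."""
    observed = (n_aa, n_ab, n_bb)
    expected = hwe_expected_counts(n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be measured
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-squared with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def is_poorly_called(p_value, threshold=5e-15):
    """Rescue is attempted when the HWE P-value is below the threshold."""
    return p_value < threshold
```

A caller would first check `needs_exact_test` and use `hwe_chi2_pvalue` only when it returns `False`. Complete absence of heterozygotes at a common SNP, for example counts of (500, 0, 500), gives a P-value far below the default threshold and would trigger the rescue step.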

Inference is by the EM algorithm, as in (2.2). The νᵢ are fixed at 1 for all classes except the heterozygous class, for which ν is fixed at 1.3. The values of μᵢ and Σᵢ for the unknown class are fixed at the same values as in (2.2).
All four classes have initial class probabilities set to 0.25. For the three genotype classes, the initial covariance matrices are set to (2c/N)I₂, where c is the cost (Arthur and Vassilvitskii, 2007) of a k-means++ clustering of the data, N is the number of intensity points and I₂ is the 2×2 identity matrix. Because the transformation of intensities has accounted for shifts, the location parameters of the two homozygous classes can be initialized to the extremes of y⁽¹⁾, with the heterozygous class falling somewhere in between; thus the μᵢ are initialized to

where the min/max are taken over a filtered version of the intensity data, with the lowest 1% of untransformed intensity values in the x⁽¹⁾ direction and the lowest 1% in the x⁽²⁾ direction removed; ȳ⁽²⁾ is the mean of the yⱼ over the second axis; and k is a shift parameter for the location of the heterozygous class that takes one of three values (0.45, 0.5 or 0.55), giving three sets of initial values. For each set of starting values, the EM algorithm is run until the genotype calls are concordant for two consecutive iterations, and the final parameter values with the highest likelihood are chosen as optimal.
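The initialization and restart scheme can be sketched as below. Several details are assumptions made only for illustration, since the display equation for the μᵢ is not reproduced here: the heterozygous class is placed at a fraction k of the way between the filtered minimum and maximum of y⁽¹⁾, all three classes share the second-axis mean as their second coordinate, that mean is taken over all points, and c is taken to be the cost of the k-means++ seeding alone. All function names are hypothetical, and `run_em` stands for a caller-supplied EM routine that iterates until calls are concordant for two consecutive iterations and returns a `(parameters, log_likelihood)` pair.

```python
import random


def kmeanspp_cost(points, n_clusters=3, seed=0):
    """Sum of squared distances to the nearest k-means++ seed (D^2 sampling)."""
    rng = random.Random(seed)

    def sqdist(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    centre = rng.choice(points)
    d2 = [sqdist(p, centre) for p in points]
    for _ in range(n_clusters - 1):
        if sum(d2) == 0:
            break  # every point already coincides with a seed
        # Draw the next seed with probability proportional to D^2.
        centre = rng.choices(points, weights=d2, k=1)[0]
        d2 = [min(d, sqdist(p, centre)) for d, p in zip(d2, points)]
    return sum(d2)


def initial_covariance(y, seed=0):
    """(2c/N) times the 2x2 identity matrix."""
    scale = 2.0 * kmeanspp_cost(y, seed=seed) / len(y)
    return [[scale, 0.0], [0.0, scale]]


def filtered_indices(x, fraction=0.01):
    """Indices kept after dropping the lowest `fraction` along each axis of x."""
    n_drop = int(fraction * len(x))
    dropped = set()
    for axis in (0, 1):
        order = sorted(range(len(x)), key=lambda j: x[j][axis])
        dropped.update(order[:n_drop])
    return [j for j in range(len(x)) if j not in dropped]


def initial_means(y, x, k):
    """Assumed starting locations for the AA, AB and BB classes."""
    keep = filtered_indices(x)
    lo = min(y[j][0] for j in keep)
    hi = max(y[j][0] for j in keep)
    y2_mean = sum(p[1] for p in y) / len(y)  # assumption: mean over all points
    return [(lo, y2_mean), (lo + k * (hi - lo), y2_mean), (hi, y2_mean)]


def best_fit(y, x, run_em, shifts=(0.45, 0.5, 0.55)):
    """Run EM from each set of starting values and keep the most likely fit."""
    cov = initial_covariance(y)
    weights = [0.25] * 4  # three genotype classes plus the unknown class
    fits = [run_em(y, initial_means(y, x, k), cov, weights) for k in shifts]
    return max(fits, key=lambda fit: fit[1])
```

Here `y` holds the transformed intensities and `x` the untransformed ones, each as a list of two-element tuples in the same sample order.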
Genotype calls are made from the genotype posterior probabilities [computed with the πᵢ inferred in this step, unlike in (2.3)] using a call threshold of 0.7. By default, SNPs that still fail the HWE test after this step have all genotypes called unknown.
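The calling rule can be illustrated with a short sketch. The genotype labels, the use of "NN" for an unknown call, the "greater than or equal to" comparison at the threshold and the `fails_hwe` callback are all assumptions made for this example; they are not taken from the optiCall implementation.

```python
GENOTYPES = ("AA", "AB", "BB")
UNKNOWN = "NN"  # hypothetical label for an uncalled genotype


def call_genotype(posteriors, threshold=0.7):
    """Call one sample from its posteriors for (AA, AB, BB, unknown class)."""
    best = max(range(3), key=lambda i: posteriors[i])
    if posteriors[best] >= threshold:
        return GENOTYPES[best]
    return UNKNOWN


def call_snp(posterior_rows, fails_hwe, threshold=0.7):
    """Call every sample at a SNP, then blank the SNP if HWE still fails.

    `fails_hwe` is a caller-supplied function that takes the list of calls
    and returns True when the SNP fails the HWE test.
    """
    calls = [call_genotype(row, threshold) for row in posterior_rows]
    if fails_hwe(calls):
        return [UNKNOWN] * len(calls)
    return calls
```

A sample whose largest genotype posterior is 0.5 is left uncalled, as is a sample whose posterior mass sits mainly in the unknown class.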
In our experiments, how often the rescue step is triggered, and how often it then succeeds, varied with the quality of the dataset. On a number of Immunochip datasets, the rescue step was triggered for between 3 and 10% of SNPs, and 30–50% of these rescues were successful.