We determined the required sample size from previous studies [14,23] by performing a power analysis with 80% power and a type I error probability (alpha) of 0.05. This analysis indicated that a sample size of 136 would be adequate to answer our main research questions.
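As an illustration of how such a calculation works, the following minimal sketch applies the standard normal-approximation formula for comparing two independent proportions. The function name `n_per_group` and the outcome proportions in the usage line are hypothetical placeholders. They are not the effect sizes used in this study, and the sketch is not intended to reproduce the figure of 136.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of
    two independent proportions (normal approximation, equal group sizes).

    p1 and p2 are the assumed outcome proportions in each group; they are
    inputs chosen by the analyst, not values taken from this study.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical example: 50% versus 25% suboptimal outcomes.
print(n_per_group(0.50, 0.25))
```

Smaller assumed differences between the groups, or higher power, increase the required sample size.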
When defining the criteria for a successful outcome, we considered both motor alignment (a deviation between 10 PD of exophoria/exotropia and 5 PD of esophoria/esotropia by SPCT at distance or near) and sensory status (whether stereopsis declined by 2 octaves or more from baseline). For the primary outcome, we followed a recent PEDIG study that used “suboptimal surgical outcome” as its primary outcome. The cumulative proportion of patients with a suboptimal surgical outcome by 12 months was compared between the two groups using the Kaplan–Meier method. The intergroup difference and the corresponding 95% confidence interval (CI) were also calculated. A suboptimal surgical outcome was defined as: (1) exodeviation of ≥10 PD at distance or near using SPCT, (2) constant esotropia of ≥6 PD at distance or near using SPCT, or (3) loss of 2 octaves or more of stereopsis from baseline [14].
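The three criteria and the Kaplan–Meier estimate described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the `Visit` fields and both function names are hypothetical, a loss of 2 octaves is assumed to mean a fourfold or greater increase in the stereoacuity threshold (each octave being a doubling), and the follow-up data in the usage lines are invented.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    # All field names are hypothetical; deviations are in prism diopters (PD),
    # with 0 meaning that no deviation of that type was measured by SPCT.
    exo_distance: float
    exo_near: float
    constant_eso_distance: float
    constant_eso_near: float
    stereo_arcsec: float


def is_suboptimal(visit, baseline_stereo_arcsec):
    """Apply the three suboptimal-outcome criteria to one follow-up visit."""
    exo = max(visit.exo_distance, visit.exo_near) >= 10
    eso = max(visit.constant_eso_distance, visit.constant_eso_near) >= 6
    # Assumption: 2 octaves = two doublings = a fourfold worse threshold.
    stereo_loss = visit.stereo_arcsec >= 4 * baseline_stereo_arcsec
    return exo or eso or stereo_loss


def km_cumulative_proportion(times, events, horizon):
    """Kaplan-Meier estimate of 1 - S(horizon).

    times:  month of the first suboptimal outcome, or of censoring
    events: True if the outcome occurred at that time, False if censored
    """
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        at_t = [e for ti, e in zip(times, events) if ti == t]
        n_events = sum(at_t)
        if n_events:
            survival *= 1 - n_events / at_risk
        at_risk -= len(at_t)  # events and censored patients leave the risk set
    return 1 - survival


# Invented example: five patients followed for up to 12 months.
print(is_suboptimal(Visit(12, 8, 0, 0, 60), baseline_stereo_arcsec=60))
print(km_cumulative_proportion([3, 6, 6, 12, 12],
                               [True, False, True, False, False], 12))
```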
Secondary outcomes were surgical motor alignment success, stereopsis, exodeviation measured by PACT, fusional control score, and fusional convergence parameters. Surgical motor alignment success was defined as exotropia of <10 PD and esotropia of <5 PD [30]. According to the results of DRS and TNO stereopsis, we divided the patients into three groups: “good” denotes DRS ≤ 100″ or TNO ≤ 60″, “moderate” denotes 100″ < DRS ≤ 400″ or 60″ < TNO ≤ 480″, and “nil” denotes that the patient was unable to recognize the cues [31,32]. Stereopsis was transformed into log units for data analysis. Patients with “nil” stereopsis were assigned to the next highest 0.3 log increment level (i.e., 800 arcsec for DRS and 960 arcsec for TNO) [33]. A fusional exotropia control score of ≤2 indicates that the deviation can be controlled as an exophoria [28]. We divided the patients into two groups according to their exotropia control scores at the 12-month follow-up.
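The stereopsis grading and log transformation described above can be expressed as a short sketch. The function and constant names are hypothetical. A threshold of `None` is used here to represent a patient who could not recognize the cues, and a base-10 logarithm is assumed because the 0.3 log increment corresponds to a doubling of the threshold.

```python
import math

# Cutoffs in arcsec, taken from the grading rule in the text.
GOOD_MAX = {"DRS": 100, "TNO": 60}
MODERATE_MAX = {"DRS": 400, "TNO": 480}
# Value assigned to "nil" stereopsis: one 0.3 log step above the coarsest level.
NIL_ARCSEC = {"DRS": 800, "TNO": 960}


def grade_stereopsis(test, arcsec):
    """Return 'good', 'moderate', or 'nil' for a DRS or TNO result."""
    if arcsec is None:
        return "nil"
    if arcsec <= GOOD_MAX[test]:
        return "good"
    if arcsec <= MODERATE_MAX[test]:
        return "moderate"
    # The text defines no grade for a measured value above the coarsest level.
    raise ValueError(f"{arcsec} arcsec is outside the {test} range")


def log_stereopsis(test, arcsec):
    """Stereoacuity in log10 arcsec, with 'nil' mapped to its assigned value."""
    value = NIL_ARCSEC[test] if arcsec is None else arcsec
    return math.log10(value)


print(grade_stereopsis("TNO", 120), round(log_stereopsis("DRS", None), 3))
```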
For all patients with IXT who completed the 12-month follow-up, we conducted a repeated-measures analysis of variance (ANOVA) to compare exodeviation and fusional convergence amplitude between the two groups. We used a two-tailed chi-square test to compare the stereopsis grouping and the exotropia control grouping between the orthoptic therapy and control groups. Statistical analysis was performed using SPSS software version 20.0 (IBM Corp., Armonk, NY, USA). A p value of <0.05 was considered statistically significant.
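For readers who wish to see the arithmetic behind a chi-square comparison of a two-level grouping, the following minimal sketch computes the Pearson statistic for a 2 × 2 table without continuity correction. It is an illustration only: the analysis in this study was run in SPSS, the function name is hypothetical, and the counts in the usage line are invented. The stereopsis grouping has three levels and would require the general r × c form of the test.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout:
                        level 1   level 2
        group A            a         b
        group B            c         d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: control score <=2 versus >2 in two groups of 30 patients.
chi2, p = chi_square_2x2(20, 10, 10, 20)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, significant = {p < 0.05}")
```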