Women were eligible to participate if they were between 18 and 50 years of age and reported more than 3 continuous months of insertional (entryway) dyspareunia, pain with tampon insertion, or both. After informed consent, all study candidates completed a standard 115-question history and physical examination. Participants needed to fulfill Friedrich’s criteria for the diagnosis of localized provoked vulvodynia, which included tenderness localized within the vestibule confirmed by cotton swab test using the modified diagnostic criteria of Bergeron et al.7 At four defined points (1:00, 5:00, 7:00, and 11:00) within the vulvar vestibule, participants had to report a mean score of 4 or greater out of 10 on a numeric rating scale of pain intensity. The localized nature of pain was confirmed by finding all remaining cotton swab test points in the lower vagina, labia majora, and labia minora to be nonpainful, defined as a mean score of 2 or less out of 10 on the numeric rating scale. Eligibility required a second clinician-examiner to independently concur with the diagnosis of localized provoked vulvodynia by cotton swab test. Additionally, eligible individuals did not demonstrate any other specific neuropathology, atrophic vaginitis, dermatoses such as lichen sclerosus, or pathogens such as culture- or smear-proven Candida species or herpes simplex. Study candidates who opted not to participate or who did not meet the eligibility criteria were referred for appropriate clinical care.
Drug assignments were determined by the Department of Biostatistics using a permuted block randomization scheme by means of a computer-based random number generator. Identical-appearing pills and creams were packaged and distributed by the Investigational Drug Service, following the randomized sequence and identified by nonconsecutive numbers. During the blinded phase, two oral regimens were distributed: desipramine 25-mg tablets and an identical-appearing oral placebo tablet containing 25 mg lactose. Dosing began with one daily tablet for week 1, two daily tablets for week 2, three daily tablets for week 3, four daily tablets for week 4, five daily tablets for week 5, and six daily tablets for weeks 6 through 12. Participants were asked to take the oral medication as a single daily dose, preferably at bedtime. Participants were instructed to advance to a total dose of six tablets daily, regardless of point of response (pain relief). In the event of side effects without significant medical implications, the participant was advised to decrease the daily dose by one tablet and to remain at that dose for the remainder of the clinical trial. If side effects persisted, the reduction by one tablet was repeated every 7 days until a tolerable dose was found. Those not able to tolerate the oral drug regimen at any dose were advised to stop the oral drug but continue the topical regimen; these participants were analyzed on an intention-to-treat basis. Two topical regimens were distributed: lidocaine 5% (buffered) in Moisturel (active agents petrolatum and dimethicone, compounded by Strong Memorial Hospital Pharmacy) and an identical-appearing and identically packaged placebo cream, pure Moisturel. Participants were instructed with the aid of a mirror and given written instructions to apply the cream lightly over the painful region four times daily, every day. They were asked to refrain from cream application on the days of follow-up study visits.
The small proportion of participants not able to tolerate topical application of lidocaine were asked to continue oral therapy and were analyzed on an intention-to-treat basis. An unblinding officer and unblinding protocol were available at all times throughout the trial. During the blinded phase of the trial, pain “rescue medication” was provided through oral acetaminophen, 650 mg every 6 hours. The use of other analgesics, such as opioid analgesics, nonsteroidal antiinflammatory drugs, and topical “caines,” was documented as a protocol violation.
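As an illustration of the allocation and dose-escalation procedures described above, the following Python sketch builds a permuted block sequence over the four arms of the 2×2 design and returns the scheduled number of tablets for a given week. It is a sketch under stated assumptions, not the trial's actual procedure: the block size of 8, the seed, the arm labels, and all function names are hypothetical, because the block size and software used are not reported in this section. Dose reductions for side effects are not modeled.

```python
import random

# Four arms of the 2x2 factorial design (labels are illustrative).
ARMS = [
    ("desipramine", "lidocaine"),
    ("desipramine", "placebo cream"),
    ("placebo tablet", "lidocaine"),
    ("placebo tablet", "placebo cream"),
]


def permuted_block_sequence(n_participants, block_size=8, seed=2024):
    """Return an allocation list built from independently shuffled blocks.

    Every block contains each arm equally often, so the arms stay balanced
    after each completed block. block_size and seed are assumed values.
    """
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        block = ARMS * (block_size // len(ARMS))
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


def scheduled_tablets(week):
    """Scheduled daily tablets: 1 in week 1, rising by 1 per week to 6 in weeks 6-12."""
    if not 1 <= week <= 12:
        raise ValueError("week must be between 1 and 12")
    return min(week, 6)


def scheduled_dose_mg(week, tablet_mg=25):
    """Scheduled daily dose in mg for 25-mg tablets (150 mg from week 6 onward)."""
    return scheduled_tablets(week) * tablet_mg
```

For example, `scheduled_dose_mg(3)` gives the 75 mg scheduled for week 3, and any prefix of the allocation list that ends on a block boundary contains each arm the same number of times.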
The primary trial end point was the tampon test, performed once weekly. The methods, reliability, and convergent and discriminant validity of this measure have been reported in detail elsewhere.8 Briefly, the tampon test required the participant to insert and immediately remove a tampon (Tampax Original Regular) and record the degree of pain during the entire insertion-removal experience on a 0–10 pain numeric rating scale (0 indicating no pain and 10 indicating the worst possible pain) in her Vulvar Vestibulitis Clinical Trial logbook. Instructions concerning the performance and documentation of the weekly tampon test, the daily 24-hour pain diary, and the intercourse pain log were given to each participant on the first prerandomization visit by the research nurse or coordinator. All information was reviewed and recorded during weekly telephone calls by the research nurse or coordinator and later confirmed by review of the study logbook on scheduled study visits. During the prerandomization phase of the trial, eligible individuals were required to demonstrate an adequate baseline level of pain (average 4 out of 10 or greater) on the tampon test to proceed to randomization. On a daily basis during the trial, participants also recorded whether they had experienced sexual intercourse in the past 24 hours. The possible responses were: 1=No, too painful; 2=No, not interested; 3=No, no opportunity; and 4=Yes. If intercourse was confirmed, then the participant recorded her level of pain on a 0–10 numeric rating scale in the study logbook. Participants were also asked to record the intensity of general pain experienced over the past 24 hours on a 0–10 numeric rating scale and to record any side effects experienced while taking study medication. Side effects were listed individually and included a severity estimate (mild, moderate, or severe).
During scheduled study visits, participants were evaluated with physical examination, cotton swab test, Vulvar Algesiometer, a battery of health-related quality-of-life measures, and laboratory testing. All components of the examination were routinely performed by the same examiner (D.C.F.) in identical fashion to the first prerandomization visit. The cotton swab test was performed on defined points of the labia majora, labia minora, and lower vagina, as previously described. During pelvic examination, participants underwent selective digital palpation of pelvic floor muscles including the levator ani, obturator internus, and piriformis muscle groups. The participant received explicit instructions to focus on palpation of the muscle groups by the examiner’s fingertip while attempting to overlook coexisting entryway pain. Notation was made for each muscle group, anatomic side, and pain level on a 0–3 scale corresponding to none, mild, moderate, and severe pain, respectively. The Vulvar Algesiometer, supplied by Curnow and Morrison (Plymouth, UK), consisted of a mechanical pulse generator that drove a probe against the mucocutaneous surface of the vulva for a calibrated distance and force ranging from 176 mN to 1868 mN in eight increments.9 Using a previously published technique,10 four anatomic sites of the vestibule were tested, and the end point was defined by the method of limits, with the first of two consecutive positive pain responses to probe stimulus designated as the pain threshold.11 The Algesiometer score was computed by summing the pain thresholds from the four designated vestibular sites (0–28 score range, with a higher score corresponding to less vestibular pain).
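The threshold rule and summed score described above can be sketched in Python as follows. This is a minimal illustration, not the published scoring procedure: it assumes that the eight force increments are indexed 0 to 7 at each site (which would be consistent with the stated 0–28 range over four sites), that the probe force ascends through the increments, and that a site with no two consecutive positive responses receives the top index of 7. The function names are hypothetical.

```python
def pain_threshold(responses):
    """Index of the first of two consecutive positive pain responses.

    responses: booleans ordered from lowest to highest probe force,
    True meaning pain was reported at that increment.
    Assumption: if no two consecutive responses are positive, the
    threshold is set to the highest index (7 for eight increments).
    """
    for i in range(len(responses) - 1):
        if responses[i] and responses[i + 1]:
            return i
    return len(responses) - 1


def algesiometer_score(site_responses):
    """Sum of the pain thresholds over the four vestibular sites."""
    if len(site_responses) != 4:
        raise ValueError("expected responses for four vestibular sites")
    return sum(pain_threshold(r) for r in site_responses)
```

Under these assumptions, a participant reporting pain at every increment at all four sites scores 0, and one reporting no pain at any increment scores 28.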
A short test battery was administered during each study visit that included the Brief Pain Inventory,12 Short Form-McGill Pain Questionnaire,13 and the Neuropathic Pain Scale.14 In addition, a more comprehensive battery was added during weeks 0, 12, 26, and 52 that included the Profile of Mood States,15 Beck Depression Inventory,15,16 and Index of Sexual Satisfaction.17 During every study visit, participants underwent laboratory testing that included microscopic wet mount smears, Rakoff stain for vaginal maturation index, and phenazine test tape for vaginal pH. At the baseline visit, participants underwent a pregnancy test, an electrocardiogram specifically to evaluate the QT interval, and colorimetry of the least sun-exposed skin using the Minolta CR 200. At week 12, each participant provided a blood sample for desipramine and lidocaine serum levels.
The primary end point was defined as the percent change in mean tampon-test pain from baseline (the mean of weeks −2, −1, and 0) to the mean of weeks 10, 11, and 12. The primary analysis of this 2×2 factorial design involved fitting an analysis of covariance (ANCOVA) model to the percent change in mean tampon-test pain, with the two treatment variables as predictors and age as a covariate. The interaction of the two treatments was first tested in the ANCOVA model at the .05 level of significance. If the interaction effect was not significant, it would be dropped from the model and conclusions would be drawn from the model with main effects only. If it was significant, the model with interaction would be adopted. SAS Proc GLM was used in the analysis.
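The primary end point computation can be sketched as below. This is an illustrative calculation only; the function name and the sign convention (negative values indicate a reduction in pain) are assumptions, and the ANCOVA itself was fitted in SAS Proc GLM as stated above.

```python
from statistics import mean


def percent_change(baseline_scores, end_scores):
    """Percent change of mean tampon-test pain from baseline.

    baseline_scores: weekly scores for weeks -2, -1, and 0.
    end_scores: weekly scores for weeks 10, 11, and 12.
    Negative results indicate a decrease in pain (assumed convention).
    """
    baseline = mean(baseline_scores)
    if baseline == 0:
        raise ValueError("baseline mean must be greater than zero")
    return 100.0 * (mean(end_scores) - baseline) / baseline
```

For example, baseline scores of 6, 7, and 8 (mean 7) and end-of-treatment scores of 3, 4, and 3.5 (mean 3.5) give `percent_change([6, 7, 8], [3, 4, 3.5])`, a change of −50%.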
If the interaction between treatments was significant, a hierarchical testing strategy18 was adopted as follows. The first stage would compare desipramine and lidocaine individually with placebo using multiplicity-adjusted P values. If a significant difference was found for either or both individual agents (one or both null hypotheses rejected), the analysis would proceed to the second stage, which would compare the effects of the active desipramine-active lidocaine treatment with those of the double placebo. If a significant difference (null hypothesis rejected) was found for combined therapy over placebo based on the multiplicity-adjusted P value, then the final (tertiary) stage of comparison would be performed, comparing combined therapy with each individual therapy. In this strategy, the next stage of hypotheses would be tested only if at least one hypothesis in the current stage had been rejected, and the family-wise error rate would be controlled at the .05 level.
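The stage-by-stage gating described above can be sketched as follows. The sketch takes multiplicity-adjusted P values as given inputs, because the adjustment method itself comes from the cited reference and is not detailed here; the comparison labels and the function name are hypothetical.

```python
def hierarchical_tests(adjusted_p, alpha=0.05):
    """Apply the three-stage gatekeeping rule to adjusted P values.

    adjusted_p: dict mapping comparison label -> multiplicity-adjusted P value.
    Returns a dict of the comparisons actually tested -> True if rejected.
    A later stage is tested only if at least one hypothesis in the
    preceding stage was rejected.
    """
    stages = [
        ["desipramine_vs_placebo", "lidocaine_vs_placebo"],
        ["combined_vs_double_placebo"],
        ["combined_vs_desipramine", "combined_vs_lidocaine"],
    ]
    results = {}
    for stage in stages:
        rejected_any = False
        for comparison in stage:
            rejected = adjusted_p[comparison] < alpha
            results[comparison] = rejected
            rejected_any = rejected_any or rejected
        if not rejected_any:
            break  # gate closed: later stages are not tested
    return results
```

If neither first-stage comparison is significant, the returned dictionary contains only the two first-stage entries, reflecting that the combined-therapy comparisons are never tested.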
In the case of a nonsignificant interaction, the primary analysis would be based on the ANCOVA model with main effects of treatments, adjusted for age, with a Bonferroni-corrected alpha level of 0.025 (two-sided). The significance of the main effect of each treatment was assessed by t tests in the ANCOVA model. The aim of the primary analysis was to determine whether each treatment was superior to placebo; if both hypotheses held, the double treatment therapy would be most effective under the additive effect assumption of the ANCOVA model.
Twelve secondary end points were analyzed as the absolute change from baseline (the mean of weeks −2, −1, and 0) to the mean of weeks 10, 11, and 12. Statistical analysis followed the tampon-test approach described above. Because secondary end points were considered exploratory, no corrections for multiplicity were performed. Outcome variables and drug safety and side effect data were analyzed according to a modified intention-to-treat approach that included all participants who took at least one dose of study drug, with the last observation carried forward for missing data.
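Last observation carried forward can be illustrated with the short sketch below. It assumes missing weekly values are represented as `None` and that values missing before any observation stay missing; the function name is hypothetical, and this is not the trial's analysis code.

```python
def locf(values):
    """Fill missing entries (None) with the most recent observed value."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled
```

For example, `locf([5, 4, None, None, 3, None])` returns `[5, 4, 4, 4, 3, 3]`.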
Power analysis was based on pilot data (Foster DC, Duguid KM. Open-label study of oral desipramine and topical lidocaine for the treatment of vulvar vestibulitis [abstract]. International Conference on Mechanism and Treatment of Neuropathic Pain. Rochester, NY, 1998). We estimated that the response would be a 20% decrease in pain from baseline for the double placebo group, a 50% decrease from baseline for each treatment used alone, and an 80% decrease when the two treatments were used together. Thus, each treatment would add 30 percentage points to the decrease in pain, irrespective of whether the other treatment was used. Power analysis for the main effects (desipramine compared with placebo and lidocaine compared with placebo) assumed 80% power with a Bonferroni-corrected alpha of 0.025 (two-sided test) and estimated that a total of 104 participants would be needed to complete the trial. Assuming a 20% dropout rate, we therefore estimated that 130 participants would need to be enrolled.
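The additive response assumptions and the dropout adjustment above amount to the following arithmetic. The sketch does not reproduce the power calculation itself (the 104 completers are taken as given), and the function names are hypothetical.

```python
def expected_decrease(desipramine, lidocaine, placebo=20, effect=30):
    """Assumed percent decrease in pain under the additive model.

    desipramine, lidocaine: 1 if the active agent is given, else 0.
    """
    return placebo + effect * desipramine + effect * lidocaine


def enrollment_target(completers, dropout_percent):
    """Participants to enroll so that `completers` remain after dropout.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    return -(-completers * 100 // (100 - dropout_percent))
```

Here `expected_decrease(1, 1)` gives the assumed 80% decrease for combined therapy, and `enrollment_target(104, 20)` gives 130, that is, 104 / (1 − 0.20).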