TCT specimens were digitized using a microscope with a 40× objective lens (Olympus VS120), and the resulting images were cut into 2048×2048-pixel tiles for labeling (Fig. 1). Two clinical pathologists independently labeled the squamous epithelial cells with bounding boxes as NILM, ASC-US, LSIL, ASC-H, HSIL, or SCC according to the Bethesda reporting system, using LabelImg, a graphical image annotation tool (https://github.com/tzutalin/labelImg). After labeling, each tile was split into single-cell images according to the bounding-box coordinates, and each single-cell image was rescaled to 224×224 pixels according to its shortest side (Fig. 1). To reduce inter- and intra-observer variability, the annotations were reviewed against the original-pixel single-cell images one month after the initial labeling.
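The cropping and rescaling step described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the annotations were saved in LabelImg's Pascal VOC XML format, and it reads "rescaled to 224×224 pixels according to the shortest side" as a shortest-side resize followed by a center crop. The example XML, the file name `tile_0001.png`, and the helpers `read_boxes`, `shortest_side_resize`, and `center_crop_box` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical LabelImg annotation (Pascal VOC XML) for one 2048x2048 tile.
EXAMPLE_XML = """
<annotation>
  <filename>tile_0001.png</filename>
  <size><width>2048</width><height>2048</height><depth>3</depth></size>
  <object>
    <name>LSIL</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>400</xmax><ymax>350</ymax></bndbox>
  </object>
  <object>
    <name>NILM</name>
    <bndbox><xmin>900</xmin><ymin>50</ymin><xmax>1000</xmax><ymax>250</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) from VOC XML."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, coords))
    return boxes


def shortest_side_resize(width, height, target=224):
    """Scale (width, height) so the shortest side equals `target`,
    preserving the aspect ratio (assumed interpretation of the text)."""
    scale = target / min(width, height)
    # max() guards against rounding leaving a side one pixel short.
    return max(target, round(width * scale)), max(target, round(height * scale))


def center_crop_box(width, height, target=224):
    """Return (left, top, right, bottom) of a centered target x target crop."""
    left = (width - target) // 2
    top = (height - target) // 2
    return left, top, left + target, top + target


for label, (x0, y0, x1, y1) in read_boxes(EXAMPLE_XML):
    new_w, new_h = shortest_side_resize(x1 - x0, y1 - y0)
    print(label, (new_w, new_h), center_crop_box(new_w, new_h))
```

In a real pipeline the computed boxes would be passed to an image library to crop and resize the pixel data. Only the coordinate arithmetic is shown here, so the sketch has no external dependencies.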
To prevent over-fitting, in which the model fits the training data but fails to predict additional data accurately, the training set was augmented by image flipping. Data normalization was applied to reduce the image differences caused by the different environments and staining workflows of multiple hospitals (18).
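As a rough illustration of these two preprocessing steps, the sketch below flips a small grayscale image and standardizes its pixel values. The text does not state which flips or which normalization method were used, so horizontal and vertical flips and per-image zero-mean, unit-variance standardization are assumptions chosen for simplicity. The function names are hypothetical, and plain nested lists stand in for real image arrays.

```python
from statistics import fmean, pstdev


def hflip(img):
    """Mirror the image left to right (reverse each row)."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom (reverse the row order)."""
    return [row[:] for row in img[::-1]]


def augment(img):
    """Return the original image plus its two flipped copies.
    Intended for training images only, as in the text."""
    return [img, hflip(img), vflip(img)]


def standardize(img, eps=1e-8):
    """Per-image standardization: subtract the mean and divide by the
    standard deviation (an assumed stand-in for 'data normalization')."""
    pixels = [p for row in img for p in row]
    mean = fmean(pixels)
    std = pstdev(pixels)
    return [[(p - mean) / (std + eps) for p in row] for row in img]


# Tiny hypothetical 2x3 grayscale image.
image = [[0, 50, 100],
         [150, 200, 250]]

training_batch = [standardize(variant) for variant in augment(image)]
print(len(training_batch))
```

Flipping is applied before standardization here, but because the per-image mean and standard deviation are unchanged by a flip, the order does not affect the result for this particular choice of normalization.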