The learning rate was set to the same value for both discriminators, while a separate learning rate was used for the generators G and F, following the Two Time-Scale Update Rule (TTUR)28, which improves GAN convergence under mild assumptions. Dropout regularization layers applied in the generators were initialized with a fixed rate, and Leaky ReLU layers with a fixed negative-slope coefficient. The three loss weights were set to 1, 0.5, and 0.00001.
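The hyperparameter setup above can be sketched as follows. This is an illustrative sketch only: the concrete learning rates, dropout rate, and negative-slope value below are placeholders (the excerpt does not reproduce the paper's numbers), and the pairing of the three stated loss weights (1, 0.5, 0.00001) with specific loss terms is an assumption.

```python
# Illustrative sketch of the training hyperparameters described above.
# Learning rates, dropout rate, and slope are PLACEHOLDERS, not the
# paper's values; only the loss weights (1, 0.5, 0.00001) come from the text.

DISC_LR = 4e-4        # placeholder: shared learning rate for both discriminators
GEN_LR = 1e-4         # placeholder: TTUR uses a different rate for G and F
DROPOUT_RATE = 0.5    # placeholder dropout rate in the generators
LEAKY_SLOPE = 0.2     # placeholder negative-slope coefficient for Leaky ReLU

# Loss weights from the text; their pairing with loss terms is assumed here:
W_ADV, W_CYCLE, W_THIRD = 1.0, 0.5, 0.00001

def total_generator_loss(adv_loss, cycle_loss, third_loss):
    """Weighted sum of the three generator loss terms (assumed pairing)."""
    return W_ADV * adv_loss + W_CYCLE * cycle_loss + W_THIRD * third_loss

print(total_generator_loss(1.0, 1.0, 1.0))
```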
The networks were trained with the Adam optimizer, using its exponential decay rates β1 and β2, for 100 epochs with a batch size of 16. On average, training took 9–11 hours per iteration, using TensorFlow (version 2.3.0) on a shared HPC workspace with an Nvidia Tesla P100 Graphics Processing Unit (GPU). The implemented code is available under the GNU license on
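To make the role of Adam's exponential decay rates explicit, a single Adam update step can be written out in plain Python. The values 0.9 and 0.999 below are Adam's commonly used defaults, assumed here for illustration since this excerpt does not state the paper's values; the learning rate is likewise a placeholder.

```python
# One Adam update step, showing how the exponential decay rates beta1 and
# beta2 govern the moving averages of the gradient and its square.
# beta1=0.9, beta2=0.999 are Adam's common defaults, assumed for illustration.

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return updated (param, m, v) after one Adam step at timestep t (1-based)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment moving average
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment moving average
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v

p, m, v = 1.0, 0.0, 0.0
p, m, v = adam_step(p, grad=0.5, m=m, v=v, t=1)
print(p)  # the parameter moves against the gradient by roughly lr
```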