Illumina short reads were obtained from the Short Read Archive and capillary reads from TraceDB. Reads were aligned to the human reference genome with BWA [26]. Consensus sequences were called with SAMtools [27] and then divided into non-overlapping 100 bp bins; a bin was scored as heterozygous if it contained at least one heterozygote and as homozygous otherwise. The resulting bin sequences were used as the input for PSMC estimation. Coalescent simulations were performed with ms [28] and cosi [21], and the simulated sequences were binned in the same way.
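The binning rule can be illustrated with a minimal Python sketch. It assumes the consensus is a plain string in which heterozygous calls appear as two-base IUPAC ambiguity codes (R, Y, S, W, K, M); the function name `bin_consensus` and the labels `"het"` and `"hom"` are hypothetical, and the treatment of missing or low-quality bases is not addressed here.

```python
# Two-base IUPAC ambiguity codes, taken here to mark heterozygous calls
# (an assumption about how the consensus sequence is encoded).
HET_CODES = frozenset("RYSWKM")


def bin_consensus(seq, bin_size=100):
    """Label each non-overlapping bin of `seq` as 'het' or 'hom'.

    A bin is 'het' if it contains at least one heterozygous code,
    and 'hom' otherwise. A trailing partial bin is kept and labelled
    by the same rule.
    """
    labels = []
    for start in range(0, len(seq), bin_size):
        window = seq[start:start + bin_size].upper()
        is_het = any(base in HET_CODES for base in window)
        labels.append("het" if is_het else "hom")
    return labels
```

Applied to a 200 bp consensus whose only ambiguity code lies in the second hundred bases, this sketch would label the first bin homozygous and the second heterozygous.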
The free parameters of the discrete PSMC-HMM model are the scaled mutation rate, the recombination rate and the piecewise constant population sizes. The time interval spanned by each size parameter was chosen manually. The expectation-maximization iteration started from a constant-sized population history. The expectation step was done analytically; Powell’s direction set method was used for the maximization step. Parameter values stabilized by the 20th iteration, and these were taken as the final estimate. All parameters are scaled to a constant that is then determined by assuming a neutral mutation rate of 2.5 × 10⁻⁸.
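As a rough illustration of the final scaling step, the sketch below converts scaled estimates into absolute units. It rests on assumptions that are not stated in the text: that the scaled mutation rate is θ₀ = 4N₀μ per 100 bp bin, that times are in units of 2N₀ generations, that population sizes are in units of N₀, and that the mutation rate is per base pair per generation. The function name `rescale` and its parameter names are hypothetical.

```python
def rescale(theta_per_bin, scaled_times, scaled_sizes, mu=2.5e-8, bin_size=100):
    """Convert scaled estimates to absolute units under the stated assumptions.

    theta_per_bin : scaled mutation rate per bin, assumed equal to 4 * N0 * mu * bin_size
    scaled_times  : interval boundaries, assumed to be in units of 2 * N0 generations
    scaled_sizes  : relative population sizes, assumed to be in units of N0
    mu            : assumed neutral mutation rate per base pair per generation
    """
    # Solve theta = 4 * N0 * mu * bin_size for the scaling constant N0.
    n0 = theta_per_bin / (4.0 * mu * bin_size)
    times_in_generations = [2.0 * n0 * t for t in scaled_times]
    absolute_sizes = [n0 * lam for lam in scaled_sizes]
    return n0, times_in_generations, absolute_sizes
```

For example, `rescale(0.1, [0.5, 1.0], [1.0, 2.0])` returns the scaling constant together with the interval boundaries in generations and the population sizes in individuals; multiplying the times by an assumed generation time would convert them to years.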