In addition to a standard serial CPU implementation, BEAGLE includes two other CPU-based implementations that exploit parallelism in different ways. An SSE implementation in double precision uses the vector processing extensions present in many CPUs to parallelize computation across character state values. Single-precision SSE vectorization is not yet available in BEAGLE; it has not been a priority because other phylogenetic tools already provide this feature (Ronquist and Huelsenbeck 2003; Swofford 2003). The OpenMP implementation uses multiple threads to parallelize computation across rate categories. Although finer-scale parallelization, equivalent to that achieved for GPU devices, could be attempted, it is unlikely to yield significant speedups because of the thread synchronization overhead in the OpenMP model.
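The two levels of CPU parallelism described above can be illustrated with a minimal sketch in C. This is not BEAGLE's source code: the function names (`matvec_states`, `update_partials`), the flat array layout, and the fixed four-state model are assumptions made for illustration only. The inner routine uses double-precision SSE2 intrinsics to process two character states per instruction, and the outer loop uses an OpenMP directive to distribute rate categories across threads.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 double-precision intrinsics */
#endif

#define N_STATES 4       /* assumed nucleotide model; even, so states pair up */

/* For one rate category: dest[i] = sum_j P[i][j] * child[j].
 * P is a row-major N_STATES x N_STATES transition matrix (hypothetical layout). */
static void matvec_states(const double *P, const double *child, double *dest)
{
    assert(N_STATES % 2 == 0);   /* SSE path consumes two doubles at a time */
    for (size_t i = 0; i < N_STATES; ++i) {
#if defined(__SSE2__)
        __m128d acc = _mm_setzero_pd();
        for (size_t j = 0; j < N_STATES; j += 2) {
            __m128d p = _mm_loadu_pd(P + i * N_STATES + j);  /* two matrix entries */
            __m128d l = _mm_loadu_pd(child + j);             /* two child partials */
            acc = _mm_add_pd(acc, _mm_mul_pd(p, l));
        }
        double lanes[2];
        _mm_storeu_pd(lanes, acc);
        dest[i] = lanes[0] + lanes[1];   /* horizontal sum of the two lanes */
#else
        /* Scalar fallback for CPUs without SSE2. */
        double sum = 0.0;
        for (size_t j = 0; j < N_STATES; ++j)
            sum += P[i * N_STATES + j] * child[j];
        dest[i] = sum;
#endif
    }
}

/* Coarse-grained parallelism: each rate category is independent,
 * so the loop over categories is split among OpenMP threads. */
void update_partials(const double *P, const double *child,
                     double *dest, int n_categories)
{
    int c;
    #pragma omp parallel for
    for (c = 0; c < n_categories; ++c) {
        matvec_states(P + (size_t)c * N_STATES * N_STATES,
                      child + (size_t)c * N_STATES,
                      dest + (size_t)c * N_STATES);
    }
}
```

With GCC or Clang, the OpenMP directive takes effect only when the code is compiled with `-fopenmp`; otherwise the loop runs serially with the same result. Because each category writes to a disjoint slice of `dest`, the threads in this sketch synchronize only at the end of the loop, which is the kind of coarse granularity that keeps synchronization overhead small.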