Journal article

FlashPCA2: principal component analysis of biobank-scale genotype datasets

Gad Abraham, Yixuan Qiu, Michael Inouye

Published: 2016

Abstract

Motivation: Principal component analysis (PCA) is a crucial step in quality control of genomic data and a common approach for understanding population genetic structure. With the advent of large genotyping studies involving hundreds of thousands of individuals, standard approaches are no longer computationally feasible. We present FlashPCA2, a tool that can perform PCA on 1 million individuals faster than competing approaches, while requiring substantially less memory.

Availability: https://github.com/gabraham/flashpca

Contact: gad.abraham@unimelb.edu.au