A model explaining the size distribution of gene and protein families
WJ Reed, BD Hughes
MATHEMATICAL BIOSCIENCES | ELSEVIER SCIENCE INC | Published: 2004
This article deals with the theoretical size distribution of gene and protein families in complete genomes. The size distribution is derived from a simple evolutionary model in which genes within a family are formed or selected against independently and at random, and in which new families are formed by the random splitting of existing families. Mathematically, this is the distribution of the state of a homogeneous birth-and-death process after an exponentially distributed time, which is shown, under certain conditions, to exhibit the power-law behaviour observed for gene and protein family sizes.
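The mechanism described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' derivation, only a simulation under stated assumptions: per-gene birth and death rates (the names `birth`, `death`), an observation-time rate `rho`, and a starting family size of 1 are all hypothetical choices made for illustration. Each sample runs a linear birth-and-death process and records its state at an exponentially distributed time.

```python
import random
from collections import Counter


def sample_family_size(birth, death, rho, rng, max_size=10**6):
    """Return the state of a linear birth-and-death process at an Exp(rho) time.

    Assumptions (not taken from the abstract): the family starts with one
    gene, each gene duplicates at rate `birth` and is lost at rate `death`,
    and `max_size` caps the rare, very large trajectories.
    """
    n = 1
    remaining = rng.expovariate(rho)  # exponentially distributed observation time
    while 0 < n < max_size:
        # Time to the next event when n genes act independently.
        wait = rng.expovariate(n * (birth + death))
        if wait > remaining:
            break  # the observation time arrives before the next event
        remaining -= wait
        if rng.random() < birth / (birth + death):
            n += 1  # a gene is duplicated
        else:
            n -= 1  # a gene is lost
    return n


def empirical_distribution(birth, death, rho, samples, seed=0):
    """Relative frequency of each observed family size."""
    rng = random.Random(seed)
    counts = Counter(
        sample_family_size(birth, death, rho, rng) for _ in range(samples)
    )
    return {size: count / samples for size, count in sorted(counts.items())}


if __name__ == "__main__":
    # Pure-birth check: with birth == rho and death == 0 the exact law is
    # P(N = n) = 1 / (n * (n + 1)), a heavy-tailed (Yule-Simon) distribution.
    dist = empirical_distribution(birth=1.0, death=0.0, rho=1.0, samples=20000)
    for size in range(1, 6):
        exact = 1.0 / (size * (size + 1))
        print(size, round(dist.get(size, 0.0), 4), round(exact, 4))
```

The pure-birth case is used as a check because its exact distribution is known in closed form; setting `death` above zero allows families to go extinct (size 0), and the conditions under which the tail remains power-law are the subject of the article itself.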