Nature of the protein universe |
| |
Authors: | Michael Levitt |
| |
Affiliation: | Department of Structural Biology, Stanford University, Stanford, CA 94305-5126 |
| |
Abstract: | The protein universe is the set of all proteins of all organisms. Here, all currently known sequences are analyzed in terms of families that have single-domain or multidomain architectures and whether they have a known three-dimensional structure. Growth of new single-domain families is very slow: Almost all growth comes from new multidomain architectures that are combinations of domains characterized by ≈15,000 sequence profiles. Single-domain families are mostly shared by the major groups of organisms, whereas multidomain architectures are specific and account for species diversity. There are known structures for a quarter of the single-domain families, and >70% of all sequences can be partially modeled thanks to their membership in these families. |
| |
Keywords: | domain architecture, protein sequence, protein structure, structural genomics |