Abstract: | The design of stable, functional proteins is difficult. Improved design requires a deeper knowledge of the molecular basis for design outcomes and properties. We previously used a bioinformatics and energy function method to design a symmetric superfold protein composed of repeating structural elements with multivalent carbohydrate-binding function, called ThreeFoil. This and similar methods have produced a notably high yield of stable proteins. Using a battery of experimental and computational analyses we show that despite its small size and lack of disulfide bonds, ThreeFoil has remarkably high kinetic stability and its folding is specifically chaperoned by carbohydrate binding. It is also extremely stable against thermal and chemical denaturation and proteolytic degradation. We demonstrate that the kinetic stability can be predicted and modeled using absolute contact order (ACO) and long-range order (LRO), as well as coarse-grained simulations; the stability arises from a topology that includes many long-range contacts which create a large and highly cooperative energy barrier for unfolding and folding. Extensive data from proteomic screens and other experiments reveal that a high ACO/LRO is a general feature of proteins with strong resistances to denaturation and degradation. These results provide tractable approaches for predicting resistance and designing proteins with sufficient topological complexity and long-range interactions to accommodate destabilizing functional features as well as withstand chemical and proteolytic challenge. The design of proteins with a desired stable fold and function is a much sought-after goal. Although impressive recent successes have been reported in designing both natural and novel protein functions and/or structures (1–6), design remains difficult, often requiring multiple rounds of iterative improvements (7–10). 
In-depth biophysical characterization of protein design outcomes and an understanding of their molecular basis have been limited, and these are critical for improving future designs. Combining designed function with structure is particularly difficult, in part because functional sites tend to be sources of thermodynamic instability (11, 12) and folding frustration (13–15). We investigate how an approach that considers both structure and function from the outset may be used to overcome such obstacles. Furthermore, we demonstrate how kinetic and related stabilities against denaturation can be rationally designed. A promising emerging paradigm for protein design is the repetition of modular structural elements (1, 2, 5–7, 14, 16–20). This approach can simplify the design process and build on aspects of the evolution of natural repetition in proteins, as well as incorporate the inherent multivalent binding functionality of such structures (1, 21). Internal structural symmetry, resulting from the repetition of smaller elements of structure, is very common in natural proteins, with ∼20% of all protein folds (22) and the majority of the most populated globular protein folds (superfolds) (21) containing internal structural symmetry. Recent design successes for helical proteins (5, 6), repeat proteins (18, 20, 23), and symmetric superfolds (1, 2, 7, 16, 17, 19, 24–26) recommend simplifying the design process by using repetitive/symmetric folds as a particularly effective strategy. The β-trefoil superfold is an interesting test case for design by repetition as bioinformatics analysis has revealed multiple and recent instances of the evolution of distinct proteins with this symmetric fold (1). The fold consists of three repeats, each containing four β-strands, and is adopted by numerous superfamilies with highly diverse binding functions (27). 
Our design of a completely symmetric β-trefoil, ThreeFoil, used a hypothetical multivalent carbohydrate binding template and mutated 40 of the 141 residues (1). The mutations were based on a combination of consensus design using a limited set of close homologs (to preserve function), and energy scoring using Rosetta (28). The design was successful on the first attempt, producing a soluble, well-folded, and functional monomer with very high resistance to structural fluctuations as indicated by high resistance to thermal denaturation and limited amide H/D exchange (1). Figure: Design of ThreeFoil. (A) ThreeFoil (PDB: 3PG0) illustrating its three identical peptide subdomains (red, green, blue). (B) ThreeFoil’s secondary structure: turn (purple), β-strand/bridge (yellow), and 3/10-helix (magenta) and ligand binding residues indicated by colored circles and insertions shown in red. (C) Comparison of ThreeFoil with the independently designed Symfoil (PDB: 3O4D, 15% sequence identity), shown along (Left) and across (Right) the axis of symmetry. Backbones are colored by RMSD between the two structures (blue to white, 0–5 Å), with insertions in the loops of ThreeFoil relative to Symfoil colored red. ThreeFoil’s bound sodium shown in gray, and bis-Tris, which binds in the conserved carbohydrate binding sites, shown in cyan. Here, we use a battery of biophysical and computational methods to perform an in-depth analysis of ThreeFoil, which shows that it has remarkably slow unfolding and folding kinetics compared with natural and designed proteins due to an unusually high transition state energy barrier. Such kinetic stability against unfolding has been studied little to date. Furthermore, ThreeFoil is extremely resistant to chemical denaturation and proteolytic degradation. 
Analyses using Absolute Contact Order (ACO) (29) and Long-Range Order (LRO) (30) as well as Gō model folding simulations (31–33) show that ThreeFoil’s resistance can be explained by the high cooperativity of its folded structure, which includes many long-range interactions. Simulations also show that nonnative interactions or folding frustration arising from protein symmetry (34) do not create long-lived traps during folding or account for the high barrier. They also explain how ligand binding can chaperone folding, which can be an added advantage of designing the fold and function together. Notably, additional analyses using whole proteome screening and other experiments show that proteins with resistances similar to ThreeFoil’s generally have high ACO/LRO values. Thus, the design method used for ThreeFoil and the strategy of designing folds with many long-range contacts may be useful for designing functional proteins with high resistance to denaturation and degradation, as may be needed for challenging biotechnology applications. |
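As an illustration of the two topological measures named above, the following is a minimal sketch of how ACO and LRO might be computed from a list of residue contacts. It is not the authors' implementation. The function names, the one-point-per-residue contact definition, the 8 Å distance cutoff, and the sequence-separation threshold of 12 for long-range contacts are all assumptions chosen for illustration. ACO is taken here as the mean sequence separation over all contacts, and LRO as the number of long-range contacts divided by the chain length.

```python
import math
from itertools import combinations


def contacts(coords, cutoff=8.0, min_sep=1):
    """Return residue index pairs (i, j), i < j, lying within `cutoff`.

    `coords` holds one (x, y, z) point per residue (for example C-alpha
    positions). The cutoff and min_sep defaults are illustrative assumptions.
    """
    pairs = []
    for i, j in combinations(range(len(coords)), 2):
        if j - i >= min_sep and math.dist(coords[i], coords[j]) <= cutoff:
            pairs.append((i, j))
    return pairs


def absolute_contact_order(pairs):
    """Mean sequence separation |i - j| over all contacts (0.0 if none)."""
    if not pairs:
        return 0.0
    return sum(abs(j - i) for i, j in pairs) / len(pairs)


def long_range_order(pairs, n_residues, long_range_sep=12):
    """Contacts separated by more than `long_range_sep` residues, per residue.

    The threshold of 12 is an assumed convention, not taken from the text.
    """
    n_long = sum(1 for i, j in pairs if abs(j - i) > long_range_sep)
    return n_long / n_residues


# Toy example with hand-written contacts for a hypothetical 40-residue chain.
toy_pairs = [(0, 2), (0, 20), (5, 30)]
toy_aco = absolute_contact_order(toy_pairs)   # (2 + 20 + 25) / 3
toy_lro = long_range_order(toy_pairs, 40)     # 2 long-range contacts / 40
```

Under these definitions, a fold with many contacts between residues far apart in sequence scores high on both measures, which is the property the text associates with resistance to denaturation and degradation.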