A new paradigm for diversifying protein function

The holy grail of protein design is the complete computational design of any arbitrarily chosen biomolecular activity. Enabling such template-free design of function requires a much deeper understanding of how function is encoded in proteins. We are therefore developing methods that design vast repertoires comprising millions of substantially different variants, rather than the handful of binders or enzymes produced by current protein design methods. We use high-throughput screening to isolate the functional designs and deep sequencing to characterise them fully. Machine-learning methods are then trained to find molecular features that discriminate the best designs from the rest, and these features are fed back to improve the design algorithms. The result is a continuous, unbiased and systematic approach to learning the rules for designing new biomolecular activities. We are applying this strategy to the design of new hydrolytic enzymes, single-domain camelid antibodies, fluorescent proteins, and therapeutic antibodies. Our methods have already designed thousands of diverse and active enzymes and fluorescent proteins. These methods will enable rapid and effective discovery and optimisation of biomolecular activities in protein engineering, synthetic biology, and biotherapeutic design.
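
As a rough illustration of the step in which molecular features are sought that discriminate the best designs from the rest, the sketch below scores each (position, residue) feature by its log-odds enrichment among functional versus non-functional sequences. This is a deliberately simple stand-in and not the machine-learning method used in this work. The function names (`feature_log_odds`, `score_design`), the pseudocount, and the toy sequences are all hypothetical and chosen only for illustration; the sequences are assumed to be aligned and of equal length.

```python
import math
from collections import Counter


def feature_log_odds(functional, nonfunctional, pseudocount=1.0):
    """Log-odds of each (position, residue) feature in functional
    versus non-functional designs, smoothed with a pseudocount."""
    pos_counts = Counter((i, aa) for seq in functional for i, aa in enumerate(seq))
    neg_counts = Counter((i, aa) for seq in nonfunctional for i, aa in enumerate(seq))
    n_pos, n_neg = len(functional), len(nonfunctional)
    scores = {}
    for feat in set(pos_counts) | set(neg_counts):
        # Smoothed frequency of the feature in each pool of sequences.
        p = (pos_counts[feat] + pseudocount) / (n_pos + 2 * pseudocount)
        q = (neg_counts[feat] + pseudocount) / (n_neg + 2 * pseudocount)
        scores[feat] = math.log(p / q)
    return scores


def score_design(seq, scores):
    """Sum the log-odds of the features present in one sequence.
    Features never observed in either pool contribute zero."""
    return sum(scores.get((i, aa), 0.0) for i, aa in enumerate(seq))


# Hypothetical toy data: sequences recovered by screening (functional)
# and sequences that were not recovered (non-functional).
functional = ["ACDH", "ACEH", "GCDH"]
nonfunctional = ["ACDK", "GSEK", "ASDK"]

scores = feature_log_odds(functional, nonfunctional)

# Rank features from most to least enriched among functional designs.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In a feedback loop of the kind described above, the highest-ranked features could be used to bias the next round of design, and `score_design` could be used to prioritise new candidate sequences. A real analysis would need to account for sequencing depth, coupling between positions, and structural features, none of which this sketch attempts.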