Open Access
Peer-reviewed
Research Article
Published: December 13, 2012
As an advanced approach to identifying suitable targeting molecules for diagnostic and therapeutic interventions, we developed a procedure to devise peptides with customizable features by an iterative computer-assisted optimization strategy. An evolutionary algorithm was used to breed peptides in silico, and the “fitness” of the peptides was determined in an appropriate in vitro laboratory assay. The influence of different evolutionary parameters and mechanisms, such as mutation rate, crossover probability, Gaussian variation and fitness value scaling, on the course of this artificial evolutionary process was investigated. As a proof of concept, peptidic ligands for a model target molecule, the cell surface glycolipid ganglioside GM1, were identified. Consensus sequences describing local fitness optima were reached from diverse sets of L- and proteolytically stable D-lead peptides. Ten rounds of evolutionary optimization encompassing a total of just 4400 peptides led to an increase in affinity of the peptides towards fluorescently labeled ganglioside GM1 by a factor of 100 for L- and 400 for D-peptides.
A well-designed identification procedure is crucial when peptidic ligands for diagnostic and therapeutic techniques such as in vivo imaging or drug targeting are to be developed. Here, we present a versatile approach for the discovery of peptide sequences with custom features that is based on an iterative computer-assisted optimization process. The methodology combines in silico evolution with in vitro testing to quickly obtain promising peptide ligand candidates with desired properties. To validate our method in a proof of concept, we set out to identify peptide sequences that bind to a glycosidic cell membrane component. We started the evolution process with a small population of peptide lead sequences and achieved a constant increase in affinity between the peptide candidates and their target molecule with each generation. After 10 rounds and a total of only 4400 peptides synthesized and tested, a more than 100-fold improvement in target recognition was achieved. Since all kinds of building blocks usable in chemical solid phase peptide synthesis can in principle be employed in this evolutionary optimization process, our method should prove a highly versatile approach for the optimization of peptides, peptoids and peptomers towards a preset functionality.
In the field of bioactive substances, peptides are drawing increasing attention as they close the gap between small molecules and proteins, combining the compact size and synthetic accessibility of the former with the high specificity in molecular recognition processes of the latter. Of particular interest in this context are tasks where targeting of an active compound to a defined cellular or molecular structure is desired, e.g. the site-specific delivery of drugs, vaccines, or contrast agents for molecular imaging applications.
To date, mainly antibodies are used in such situations, yet the large size of an antibody ligand severely hampers tissue penetration and optical resolution, and its antigenicity and degradability limit its use in vivo. Hence, various approaches to artificially reduce ligand size while maintaining specificity are being pursued to establish the next generation of targeting molecules. Small peptides built up of 10 to 20 amino acid residues, which permit highly specific interactions with biological targets, carry this concept to its logical conclusion. The use of peptides in therapy and diagnostics may likewise be hampered by their proteolytic lability or limited cell penetration, but these obstacles can be overcome by building up proteolytically stable peptide isomers from D-amino acid residues or by coupling the peptides to membrane shuttles. Far more challenging is the identification of peptide sequences that exhibit the necessary sensitivity and specificity of a targeting ligand. To date, high-throughput screening of large peptide libraries is a common approach for the identification of peptide ligands, but with increasing ligand length the procedure rapidly reaches its limits. Beyond a length of 9–10 amino acids such libraries are no longer representative due to the exponentially growing peptide sequence space (e.g. about 10^21 sequences for 16mer L-peptides).
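The size of the sequence space quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes an alphabet of the 20 proteinogenic L-amino acids and counts every position independently; the function name `sequence_space` is introduced here for illustration only.

```python
def sequence_space(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear peptides of `length` residues.

    Assumes every position can hold any of `alphabet_size` building
    blocks independently, so the count is alphabet_size ** length.
    """
    return alphabet_size ** length


# Compare library sizes for a few peptide lengths.
for n in (6, 9, 10, 16):
    print(f"{n:2d}-mer: {sequence_space(n):.2e} sequences")
```

For a 16mer the result is 20^16, roughly 6.6 × 10^20, which matches the order of magnitude of 10^21 given in the text. Enlarging the alphabet, for instance with non-natural building blocks, increases the count further.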
In order to overcome this limitation, computational structure-based design methods that reduce the sequence space to be explored have been established. If the 3D molecular structure of the target is available, it can be used in docking approaches to design peptide ligands for the target by purely in silico procedures. Another way to optimize peptide sequences for desired applications is the use of structural scaffolds in molecular dynamics simulations. Both approaches work best with rigid proteinaceous target molecules.
Structure-independent design of peptides can be accomplished by, e.g., sequence motif scanning utilizing learning algorithms such as artificial neural networks. This technique, however, is limited to sequence data already present in the training sets and often fails to create novelty.
In protein design, directed evolution strategies, which aim to improve candidates by iterative rounds of mutation and functional screening, constitute another way to optimize biomolecules. These methods, which include gene shuffling, site-directed mutagenesis and chimeragenesis, work at the DNA level and hence are restricted to gene-encoded optimization candidates. Therefore, the incorporation of non-natural building blocks or the optimization of all-D peptides cannot be achieved with these techniques. Yet the inclusion of a function-screening step in such directed evolution strategies represents a definite strength. In light of the above, it appears most reasonable to employ a function-driven rather than a structure-driven strategy for the identification of peptides suitable for the desired applications.
We have devised such a strategy based on a molecular optimization process that mimics Darwinian evolution. The evolutionary process is initiated with a peptide library of random sequences or with lead peptides that are either of known rudimentary suitability or designed by structural considerations. The functional performance of each peptide is assessed in an appropriate biological assay, as a result of which all individuals are assigned “fitness values”. The resulting peptide population is inspected by the operator, and top candidates are selected to act as parent peptides for the follow-up generation. In a computational step, an evolutionary algorithm (EA) propagates the selected peptides in silico by crossing and mutating them, with the “fittest” candidates having the highest probability of passing on their “genetic information”, i.e. their peptide sequence, to produce a filial generation. We have applied this cooperative in silico and in vitro optimization methodology to identify peptidic ligands for the cell membrane glycolipid ganglioside GM1, a potential target, e.g., for diagnostic imaging applications at the mucosal wall or for mucosal vaccine delivery systems.
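The computational breeding step described above can be illustrated with a short sketch. It is not the authors' implementation: fitness-proportional (roulette-wheel) parent selection, single-point crossover and per-residue point mutation are assumed here as common textbook operators, and all function names, default parameter values and the example fitness values are hypothetical placeholders. In the actual procedure the fitness values would come from the laboratory assay.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 proteinogenic residues


def select_parent(population, fitness, rng):
    """Pick one parent with probability proportional to its fitness (assumed scheme)."""
    return rng.choices(population, weights=fitness, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover: head of `a` joined to tail of `b` (equal lengths assumed)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(seq, rate, rng):
    """Redraw each residue at random with probability `rate`.

    The redraw may return the same residue, so the effective
    substitution rate is slightly below `rate`.
    """
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq)


def next_generation(population, fitness, size,
                    crossover_prob=0.5, mutation_rate=0.05, seed=None):
    """Breed `size` child sequences from measured parents (illustrative defaults)."""
    rng = random.Random(seed)
    children = []
    while len(children) < size:
        child = select_parent(population, fitness, rng)
        if rng.random() < crossover_prob:
            child = crossover(child, select_parent(population, fitness, rng), rng)
        children.append(mutate(child, mutation_rate, rng))
    return children


# Usage with made-up 16mer parents and placeholder fitness values (not study data).
parents = ["ACDEFGHIKLMNPQRS", "TVWYACDEFGHIKLMN", "GGGGSSSSKKKKWWWW"]
measured = [1.0, 3.5, 0.5]
offspring = next_generation(parents, measured, size=8, seed=1)
print(offspring)
```

Each returned sequence would then be synthesized and assayed, and the measured values would serve as the fitness input for the following round.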
Evolutionary optimization of peptidic ligands is a complex process in which numerous parameters and different evolutionary mechanisms may depend on and influence each other. In order to keep those variables at a manageable level, a general framework was defined at the outset: i) the length of the ligand to be evolved was set to 16 amino acids, which was deemed a good compromise between synthetic accessibility and sequence space; ii) a single most relevant criterion, optimal binding to the desired target, was selected as the evolutionary goal; and iii) appropriate parameter settings and combinations of evolutionary mechanisms were selected on the basis of empirical in silico simulation studies. The latter was done by shaping 16mer peptides towards a defined characteristic (molecular mass) as “pseudo-fitness”. The fitness values were optimized in distance metric simulations, and the evolutionary optimization data were evaluated in order to identify settings that lead the algorithm to converge in a minimal number of generations.
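A simulation of this kind can be sketched as follows, using the distance of a peptide's molecular mass from a target mass as pseudo-fitness. This is an illustrative assumption about how such a study might be set up, not a reconstruction of the authors' simulations: truncation selection with elitism, the target mass, the tolerance and all parameter defaults are hypothetical, and the residue masses are approximate average values.

```python
import random

# Approximate average residue masses in Da; one water is added per peptide.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02


def peptide_mass(seq):
    """Approximate average mass of a linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def distance(seq, target_mass):
    """Pseudo-fitness as a distance: smaller means fitter."""
    return abs(peptide_mass(seq) - target_mass)


def simulate(target_mass, mutation_rate, crossover_prob, pop_size=50,
             n_parents=10, length=16, max_generations=200,
             tolerance=1.0, seed=0):
    """Return the best distance per generation until convergence or the cap."""
    rng = random.Random(seed)
    residues = sorted(RESIDUE_MASS)
    population = ["".join(rng.choice(residues) for _ in range(length))
                  for _ in range(pop_size)]
    history = []
    for _ in range(max_generations):
        population.sort(key=lambda s: distance(s, target_mass))
        history.append(distance(population[0], target_mass))
        if history[-1] <= tolerance:
            break
        parents = population[:n_parents]
        children = list(parents)  # elitism (assumed): parents survive unchanged
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            if rng.random() < crossover_prob:
                cut = rng.randrange(1, length)
                a = a[:cut] + b[cut:]
            children.append("".join(
                rng.choice(residues) if rng.random() < mutation_rate else aa
                for aa in a))
        population = children
    return history


# Compare how many generations different mutation rates need (hypothetical settings).
for rate in (0.01, 0.05, 0.2):
    h = simulate(target_mass=1800.0, mutation_rate=rate, crossover_prob=0.5)
    print(f"mutation rate {rate}: {len(h)} generations, final distance {h[-1]:.2f}")
```

Running the loop over a grid of mutation rates and crossover probabilities, and over several random seeds, gives a rough picture of which settings reach the tolerance in the fewest generations before any peptides are synthesized.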
As the evolutionary goal we decided to optimize a peptide ligand for binding to ganglioside GM1. This particular target molecule was chosen for several reasons. Firstly, carbohydrate molecules, e.g. on cell surface receptors, are a highly relevant class of biological targets