ACCOUNTING FOR INDUCED FIT EFFECTS
A system, device, and method for predicting a docked position of a target ligand in a binding site of a biomolecule are disclosed. The prediction makes use of a template ligand-biomolecule complex structure in order to predict a target ligand-biomolecule complex structure. The system and device contain modules allowing for the prediction of a target ligand-biomolecule complex structure. A preparation module can receive information identifying a target ligand and a template ligand-biomolecule structure. A pharmacophore matcher module can identify common pharmacophores between the template ligand and the target ligand. A docking module can predict a docked ligand position of the target ligand by overlapping the pharmacophore models of the target ligand and template ligand while the template ligand is in the binding site of the biomolecule. A biomolecule modification module can modify the biomolecule to reduce clashes between the docked target ligand and the biomolecule.
This application is a continuation application of and claims the benefit of priority to U.S. application Ser. No. 16/757,267, filed on Apr. 17, 2020, which is a National Phase Application under 35 U.S.C. § 371 of International Application No. PCT/US2018/056494, filed on Oct. 18, 2018, which claims priority to U.S. Provisional Application No. 62/574,364, filed on Oct. 19, 2017.
TECHNICAL FIELD
This application relates generally to using a computer to assist in predicting a docked position of a target ligand in a binding site of a biomolecule, and relates more specifically to using a computer to assist in predicting a docked position of a target ligand in a binding site of a biomolecule that is capable of undergoing an induced fit.
BACKGROUND
Biomolecules often serve particular functions and the ability to modulate the functionality of a biomolecule can be useful for treating diseases and for engineering industrial biomolecular applications. The functionality of a biomolecule is sometimes modulated by whether and how one or more ligands are bound to the biomolecule. Biomolecules often have regions (e.g., an “active site”) where one or more ligands can bind to the biomolecule and thereby modulate the functionality of the biomolecule. For example, competitive antagonists are compounds that can bind to an active site in a biomolecule, thereby inhibiting the natural ligand from binding. Competitive antagonists prevent a biomolecule from performing its biological function, since the biological function requires the natural ligand to be bound in the active site. Similarly, non-competitive antagonists also prevent a biomolecule from performing its biological function, but do so by binding to the biomolecule and changing the biomolecule in some way (such as by changing its three-dimensional conformational ensemble) so that the biomolecule can no longer perform its biological function (e.g., changing the biomolecule's conformation such that it can no longer accommodate the binding of the natural ligand). In contrast to antagonists, an agonist can bind to a biomolecule and activate a particular function of the biomolecule (rather than inhibit the function).
When a ligand binds to a biomolecule, it is useful to know the three-dimensional structure of the ligand-biomolecule complex (the structure of both the ligand and the biomolecule when the ligand is bound to the biomolecule). The three-dimensional structure can provide information about which interactions between the ligand and the biomolecule are important for binding, thereby informing rational drug design. The three-dimensional structure can also be used to calculate the free energy of binding. Unfortunately, it is sometimes difficult to predict the three-dimensional structure of a ligand-biomolecule complex, especially when the biomolecule undergoes an induced fit effect.
SUMMARY
One aspect features a method for predicting a docked position of a target ligand in a binding site of a biomolecule. The method involves receiving a template ligand-biomolecule structure that has a template ligand docked in the binding site of the biomolecule and comparing a pharmacophore model of the template ligand to a pharmacophore model of the target ligand. The pharmacophore model of the target ligand is overlapped with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule. A docked position is predicted for the target ligand in the binding site of the biomolecule based on a position of the pharmacophore model of the target ligand when overlapped with the pharmacophore model of the template ligand.
Another aspect features a computer system that has at least one processor, a preparation module, a pharmacophore matcher module, and a docking module. The preparation module is stored in memory and coupled to at least one processor, and is programmed to receive information identifying a target ligand and a template ligand-biomolecule structure comprising a template ligand and a biomolecule. The pharmacophore matcher module is stored in memory and coupled to at least one processor, and is programmed to identify a pharmacophore match between the template ligand and the target ligand by comparing the pharmacophore model of the template ligand to the pharmacophore model of the target ligand. The docking module is stored in memory and coupled to at least one processor, and is programmed to predict a docked ligand position of the target ligand in the template ligand-biomolecule structure by overlapping the pharmacophore model of the target ligand with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule.
Another aspect features a non-transitory computer readable storage medium having a computer readable program that when executed on a computer causes the computer to predict a docked position of a target ligand in a binding site of a biomolecule. Making the prediction as to the docked position of the target ligand in the binding site of the biomolecule involves performing various steps. One step involves receiving information identifying the target ligand and a template ligand-biomolecule structure, using a preparation module stored in memory and coupled to at least one processor. The template ligand-biomolecule structure has a template ligand docked in the binding site of the biomolecule. Another step involves identifying a pharmacophore match between the template ligand and the target ligand, using a pharmacophore matcher module stored in memory and coupled to at least one processor. The process of identifying the pharmacophore match involves comparing a pharmacophore model of the template ligand to a pharmacophore model of the target ligand. Another step involves predicting a docked ligand position of the target ligand, using a docking module stored in memory and coupled to at least one processor. The docking module predicts the docked position of the target ligand in the binding site of the biomolecule based on a position of the pharmacophore model of the target ligand when overlapped with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule.
In some implementations, the target ligand is selected from a plurality of ligand candidates, each of the ligand candidates being different from the template ligand. Selecting the target ligand involves comparing the pharmacophore model of the template ligand to a pharmacophore model of each respective one of the plurality of ligand candidates.
In some implementations, a plurality of template ligand-biomolecule structures is received, each template ligand-biomolecule structure having a different template ligand docked in the binding site of the biomolecule. The pharmacophore model of the template ligand is generated by combining information from each of the template ligands from the plurality of template ligand-biomolecule structures.
In some implementations, the target ligand has more than one structural conformation in its unbound state, and the docked position of the target ligand in the binding site of the biomolecule is predicted by enumerating a set of potential target ligand conformations and overlapping a respective pharmacophore model of the target ligand for each of the potential target ligand conformations with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule.
In some implementations, predicting the docked position of the target ligand in the binding site of the biomolecule involves ignoring at least one clash between the target ligand conformation's atomic coordinates and the biomolecule's atomic coordinates. In some instances of these implementations, for each target ligand conformation, the atomic coordinates of the biomolecule are modified to reduce clashes between the docked target ligand conformation's atomic coordinates and the biomolecule's atomic coordinates, thereby creating an altered ligand-biomolecule structure comprising the docked target ligand and an altered biomolecule.
In some implementations, a re-docked position of each target ligand conformation is predicted by predicting each target ligand conformation's position in the binding site of the altered biomolecule. For each target ligand conformation, the atomic coordinates of the altered biomolecule are modified to reduce clashes between the atomic coordinates of the target ligand conformation's re-docked position and the atomic coordinates of the altered biomolecule, thereby creating a re-altered ligand-biomolecule structure comprising a re-docked target ligand and a re-altered biomolecule.
In some implementations, each altered and re-altered ligand-biomolecule structure is ranked using a scoring function. In some instances of these implementations, a subset of high-ranking target ligands corresponding to target ligands having a threshold value for an empirical activity is identified.
Frequently, scientists and engineers are aware of the structure of a template ligand 704 that binds to a biomolecule 700 (i.e., the structure of a template ligand-biomolecule complex 224), but either know or suspect that a different target ligand 706 also binds to the same biomolecule 700 (see
As described herein, the three-dimensional structure of a template ligand 704 bound to a biomolecule 700 can be used to predict the three-dimensional structure of a target ligand 706 bound to the same (or similar) biomolecule 700. Unfortunately, when a ligand binds to a particular biomolecule, the biomolecule does not always keep its original three-dimensional conformation. As shown in
Among other advantages, the prediction system and methods disclosed herein describe how to predict conformational changes that result from the induced fit effect. In particular, the system and methods describe how computational methods can be used to predict the three-dimensional structure of a target ligand-biomolecule complex 230 (comprising target ligand 706 bound to biomolecule 701, where biomolecule 701 is biomolecule 700 after undergoing conformational changes), given a template ligand-biomolecule structure 224 (comprising template ligand 704 and biomolecule 700). In some implementations, more than one target ligand 706 is analyzed, and each one is ranked based on a scoring function. The top-ranking target ligands 706 can be chemically synthesized for empirical testing. Another advantage is that in some implementations, the structure of the biomolecule in the predicted ligand-biomolecule complex 230 can be used as a modified biomolecule in rigid-receptor docking and other drug discovery techniques.
Before performing the first step 100 of the method shown in
The target ligand 706 is sometimes provided as input 222 by a user. For example, a user may know that a particular ligand (different from the template ligand 704) binds more strongly to biomolecule 700 than the template ligand 704 or has better ADME properties than the template ligand 704. In such a case, the known ligand can be the target ligand 706 that is provided as input 222 by a user seeking to know the three-dimensional structure of the target ligand 706 when bound to a biomolecule 700. Alternatively, the target ligand 706 can be selected from a plurality of ligand candidates stored in a target ligand database 214.
Referring to
The pharmacophore models used in step 100 can either be generated by the prediction system 200 (e.g., using pharmacophore generator 300) or provided as input 222 to the prediction system 200. The pharmacophore models used in step 100 need not be generated from the same source (e.g., the pharmacophore model of the target ligand 706 can be provided as input 222, while the pharmacophore model of the template ligand 704 can be generated by the prediction system 200).
If not provided as input 222, pharmacophore models like those shown in
Once every instance of a pharmacophore type is identified (e.g., instances 810 of the hydrophobic group type 800) in a molecule, pharmacophore generator 300 can be used to create a more detailed pharmacophore model by characterizing each of the pharmacophore instances based on their location within the molecule and their directionality (if applicable). There are various methods for identifying the location of a particular instance of a pharmacophore type. As one example, the location of an instance of a hydrophobic group type 800 can be defined as the weighted average of the positions of the non-hydrogen atoms in the identified instance. As another example, the location of negative and positive ionizable sites (identified using negative ionizable detector 316 and positive ionizable detector 314, respectively) can be defined as a single point located on a formally charged atom, or at the centroid of a group of atoms over which the ionic charge is shared. As yet another example, the location of an instance of an aromatic type 804 can be defined as the centroid of the aromatic ring.
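As an illustration of the location rule described above, the following is a minimal sketch of computing a feature location as the weighted average of the non-hydrogen atom positions. The `Atom` record, its `weight` field, and the function name `feature_location` are assumptions made for this example; they are not part of pharmacophore generator 300 as disclosed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Atom:
    element: str
    xyz: Tuple[float, float, float]
    weight: float = 1.0  # hypothetical per-atom weight; 1.0 gives a plain centroid


def feature_location(atoms):
    """Weighted average position of the non-hydrogen atoms of one feature."""
    heavy = [a for a in atoms if a.element != "H"]
    if not heavy:
        raise ValueError("feature has no non-hydrogen atoms")
    total = sum(a.weight for a in heavy)
    return tuple(
        sum(a.weight * a.xyz[i] for a in heavy) / total for i in range(3)
    )


# Toy four-atom group with unit weights; the hydrogen is ignored.
group = [
    Atom("C", (1.0, 0.0, 0.0)),
    Atom("C", (-1.0, 0.0, 0.0)),
    Atom("C", (0.0, 2.0, 0.0)),
    Atom("C", (0.0, -2.0, 0.0)),
    Atom("H", (5.0, 5.0, 5.0)),
]
center = feature_location(group)
```

With unit weights the result reduces to the centroid, which is also how the location of an aromatic ring or of a shared ionic charge is described above.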
Various methods also exist for identifying the directionality of particular instances of pharmacophore types. Whether a pharmacophore type has directionality can be a pre-determined setting of pharmacophore generator 300. For example, the hydrophobic group type 800 can be deemed to have no directionality component because hydrophobic interactions are frequently directionless, while the hydrogen bond donor/acceptor types (e.g., hydrogen-bond acceptors 802) can be deemed to have directionality because an interaction between this type and a biomolecule 700 frequently requires directional polar interactions along the hydrogen bond axis. Directionality of a type can be represented as a vector, as symbolized by the arrows 812 associated with the hydrogen-bond acceptor type 802 in
Referring to
A pharmacophore model can be based on pharmacophores perceived in more than just one molecule. For example, more than one template ligand-biomolecule structure 224 can be received as input 222. When more than one template ligand-biomolecule structure 224 is received, each of the structures 224 can have a different template ligand 704 docked in the binding site 702 of the biomolecule 700. In such cases, step 100 can involve generating a pharmacophore model 806 of the template ligands 704 by combining information from each of the respective template ligands 704 from the plurality of template ligand-biomolecule structures 224. Pharmacophores common to each of the respective template ligands 704 can be used to create a combined pharmacophore model. Additionally, more than one pharmacophore model 806 can be generated from the plurality of template ligands 704. In such cases, if the template ligand-biomolecule structures 224 have known binding affinities of the associated template ligands 704, then the binding affinities can be provided as input 222 and pharmacophore models of template ligands 704 can be given greater weight in the pharmacophore model if they belong to a template ligand 704 with higher binding affinity.
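One possible way to combine several template ligands into a single model is sketched below. It assumes, purely for illustration, that each template contributes at most one feature per pharmacophore type, that all coordinates are already in the frame of the shared binding site, and that affinities are expressed so that larger numbers mean tighter binding. The function name `combine_models` and the dictionary layout are hypothetical.

```python
def combine_models(models, affinities):
    """Keep feature types common to every template; average their positions,
    weighting each template by its binding affinity."""
    if not models or len(models) != len(affinities):
        raise ValueError("need one affinity per template model")
    common = set(models[0])
    for model in models[1:]:
        common &= set(model)
    total = sum(affinities)
    combined = {}
    for feature_type in sorted(common):
        combined[feature_type] = tuple(
            sum(w * m[feature_type][i] for m, w in zip(models, affinities)) / total
            for i in range(3)
        )
    return combined


# Two hypothetical templates: only the aromatic feature is shared.
template_models = [
    {"aromatic": (0.0, 0.0, 0.0), "acceptor": (2.0, 0.0, 0.0)},
    {"aromatic": (1.0, 0.0, 0.0), "donor": (5.0, 5.0, 5.0)},
]
combined_model = combine_models(template_models, affinities=[3.0, 1.0])
```

The tighter-binding template pulls the combined aromatic position toward its own coordinates, which is one simple reading of giving that template greater weight.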
Once at least one pharmacophore model 806 of the template ligand 704 and at least one pharmacophore model 808 of the target ligand 706 have been generated by pharmacophore generator 300 (or received as input 222), step 100 of
Various techniques can be used for comparing pharmacophore models, with the underlying goal being the identification of pharmacophores common to both molecules being compared (e.g., common to both template ligand 704 and target ligand 706), and especially the identification of pharmacophores with similar topological arrangements and directionality. In general, the pharmacophore types common to both the template ligand 704 and the target ligand 706 can be superimposed. More than one superimposed option may be possible (e.g., when more than one instance 810 of a particular pharmacophore type is present in the template ligand 704 or the target ligand 706 or both), in which case various techniques can be used to rank the superimposition options. For example, the RMSD between the superimposed common pharmacophores can be calculated—superimposition options with lower RMSD can be more highly ranked, and the highest-ranking superimposition option (e.g., superimposition option 814 shown in
When a target ligand 706 and/or a template ligand 704 has more than one potential pharmacophore model, each pharmacophore model of the template ligand 704 is compared (step 100) to each pharmacophore model of the target ligand 706. Such a comparison can be done serially or in parallel using the pharmacophore match detector 306.
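The RMSD-based ranking of superimposition options can be sketched as follows. The sketch assumes each option has already been superimposed and is supplied as a list of matched (template feature position, target feature position) pairs; the option names and the helper names `rmsd` and `rank_options` are hypothetical, and the alignment step itself is not shown.

```python
import math


def rmsd(pairs):
    """Root-mean-square distance over matched (template_xyz, target_xyz) pairs."""
    if not pairs:
        raise ValueError("no matched features")
    squared = sum(math.dist(p, q) ** 2 for p, q in pairs)
    return math.sqrt(squared / len(pairs))


def rank_options(options):
    """Return option names ordered from lowest to highest RMSD."""
    return sorted(options, key=lambda name: rmsd(options[name]))


# Two hypothetical superimposition options for the same pair of ligands.
options = {
    "option_a": [((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
                 ((1.0, 0.0, 0.0), (1.0, 0.0, 1.0))],
    "option_b": [((0.0, 0.0, 0.0), (0.0, 0.0, 0.1)),
                 ((1.0, 0.0, 0.0), (1.0, 0.1, 0.0))],
}
ranked = rank_options(options)
```

When either ligand has several pharmacophore models, the same ranking can be applied to the options produced by every template-model and target-model pairing.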
The next step shown in
Step 102 may result in energetically unfavorable interactions (“clashes”) between the atoms in the target ligand 706 and the biomolecule 700. Clashes (e.g., clash 710 shown in
The next step shown in
Other modifications besides conformational modifications are also possible. For example, if biomolecule 700 is a protein, then clashes 710 that are between target ligand 706 and specific sidechains of biomolecule 700 may be resolved by computationally mutating the clashing sidechains, e.g., by truncating the clashing sidechains of biomolecule 700 to alanine (alanine is a relatively small amino acid that is less likely to sterically clash with a target ligand 706). The clashing sidechains of biomolecule 700 can also be computationally mutated to residues larger than alanine but smaller than the clashing residues in biomolecule 700, e.g., a leucine could be mutated to a valine, a tyrosine or tryptophan could be mutated to phenylalanine, a glutamine could be mutated to asparagine, a glutamic acid could be mutated to an aspartic acid, etc.
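The mutation choices above can be sketched as a simple lookup. The table below contains only the example substitutions named in the text; the function names, the dictionary-of-residues layout, and the decision to leave glycine and alanine untouched are assumptions made for illustration. The originals are recorded so that they can be restored later.

```python
# Hypothetical lookup built from the examples in the text: each clashing
# residue maps to a smaller residue of similar character.
SMALLER_RESIDUE = {
    "LEU": "VAL",
    "TYR": "PHE",
    "TRP": "PHE",
    "GLN": "ASN",
    "GLU": "ASP",
}


def mutate_clashing(residues, clashing_positions, to_alanine=True):
    """Return (mutated residues, record of original residues for restoration)."""
    mutated = dict(residues)
    originals = {}
    for pos in clashing_positions:
        name = residues[pos]
        if name in ("GLY", "ALA"):
            continue  # assumption: nothing smaller to truncate to
        originals[pos] = name
        mutated[pos] = "ALA" if to_alanine else SMALLER_RESIDUE.get(name, "ALA")
    return mutated, originals


def restore(mutated, originals):
    """Put the recorded original residues back."""
    restored = dict(mutated)
    restored.update(originals)
    return restored


binding_site = {101: "LEU", 102: "TYR", 103: "GLY", 104: "SER"}
softened, record = mutate_clashing(binding_site, [101, 102, 103], to_alanine=False)
```

This only tracks residue identities; rebuilding side-chain coordinates after a mutation or restoration would still require minimization and/or sampling.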
One or all of the above-mentioned techniques can be used to resolve clashes 710 and ultimately achieve an induced fit effect. By modifying the biomolecule 700, an altered biomolecule 701 is created that has a different three-dimensional structure (and possibly a different chemical make-up) than the biomolecule 700. The output of step 104 is the predicted structure of the target ligand-biomolecule complex 230, which comprises target ligand 706 and altered biomolecule 701.
The next step shown in
When some predicted target ligand-biomolecule complexes 230 are resolved by mutational modification using mutator 506, but others are resolved by only conformational modification (e.g., using only minimizer 404), all complexes 230 can be ranked together using a scoring function that is a function of interactions between the target ligand 706 and altered biomolecule 701. Such mutated sidechains can be restored to the original sidechain (by using mutator 506 and then preparation module 210 for minimization and/or sampling) after the modification step 104 of the process shown in
In some implementations, a subset of the top-ranking complexes listed in step 108 of
The output 228 of the method shown in
In some implementations, steps 102-110 can be repeated. For example, step 102 can be performed on the list of ranked complexes 108 in order to predict a re-docked position of each target ligand 706 (including all three-dimensional conformations of each target ligand 706) by predicting each target ligand's 706 position in the binding site 702 of the altered biomolecule 701. Alternatively, step 102 can be performed on the predicted complexes 230 that were output from modification step 104 (without ranking those complexes 230). Instead of using pharmacophore overlapper 602 to predict the target ligand's 706 re-docked position in altered biomolecule 701, re-docking can be done by optimizing interactions between the target ligand 706 and the active site 702 of biomolecule 701 (e.g., optimizing hydrogen bonding interactions, salt-bridges, hydrophobic interactions, etc.), using the interaction optimizer 604 of docking module 208. Given a re-docked position, steps 104-110 can be performed on the re-docked target ligand 706 and altered biomolecule 701 (yielding the structure of a target ligand 706 bound to a re-altered version of altered biomolecule 701). In cases where clashing residues were mutated during step 104, the original residues can be restored using mutator 506, before repeating step 104. In some implementations, this re-docking procedure can lead to more accurate structural predictions of the target ligand-biomolecule complex 230. When steps 102-110 are repeated, step 106 (involving ranking of the predicted structure of each target ligand-biomolecule complex 230) can comprise ranking all target ligand-biomolecule complexes 230, including those that have an altered biomolecule 701 and those that have a re-altered biomolecule structure (where the re-altered biomolecule structure is the result of repeating steps 102-104 in
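The repeated dock, modify, and rank cycle can be outlined as below. This is only a control-flow sketch: the callables `dock`, `redock`, `modify`, and `score` are hypothetical stand-ins for the pharmacophore overlapper, the interaction optimizer, the biomolecule modification step, and the scoring function, and the convention that a lower score ranks higher is an assumption.

```python
def induced_fit_rounds(ligand, biomolecule, dock, redock, modify, score, rounds=2):
    """Dock once by pharmacophore overlap, then alternate modify and re-dock.

    Returns every (score, round, pose, biomolecule) tuple, best score first,
    so altered and re-altered complexes are ranked together.
    """
    results = []
    pose = dock(ligand, biomolecule)               # initial docking (step 102)
    for round_index in range(rounds):
        biomolecule = modify(pose, biomolecule)    # resolve clashes (step 104)
        results.append((score(pose, biomolecule), round_index, pose, biomolecule))
        if round_index + 1 < rounds:
            pose = redock(ligand, biomolecule)     # re-dock into altered biomolecule
    results.sort(key=lambda item: item[0])         # assumption: lower score is better
    return results


# Stub callables that only label their outputs, to show the data flow.
demo = induced_fit_rounds(
    "ligand",
    "B",
    dock=lambda lig, bio: "overlap-pose",
    redock=lambda lig, bio: "optimized-pose",
    modify=lambda pose, bio: bio + "'",
    score=lambda pose, bio: -len(bio),
)
```

In the stub run, `B'` stands for the altered biomolecule and `B''` for the re-altered one; both appear in the same ranked list.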
A number of embodiments of the claimed methods have been described. Nevertheless, it will be understood that various modifications can be made without departing from the spirit and scope of the claims. For example, greater or fewer steps can be performed than are shown in
Referring to
The prediction system 200 can have a memory 202 that stores information and/or instructions. The memory 202 can store a preparation module 210 that is coupled to at least one processor 216. The preparation module 210 can be programmed to receive physical parameters, e.g., pH, temperature, and salt concentration; such parameters can be used by the preparation module 210 and can also ultimately be used by other modules, such as molecular dynamics module 504. The physical parameters can be provided by a user as input 222 to the prediction system 200. The physical parameters can inform when to make preliminary modifications to the template ligand-biomolecule structure 224 and/or the target ligand 706, e.g., using the hydrogen completer 400 described below.
Referring to
The preparation module 210 can also include a missing coordinate completer 402 which can be used to predict the unknown coordinates of certain atoms when the template ligand-biomolecule structure 224 is an incomplete structure, or when restoring previously mutated residues (e.g., after modification step 104 but before performing the ranking step 106) to their original residue. The template ligand-biomolecule structure 224 can be incomplete because some empirical techniques are incapable of resolving the myriad structures adopted by floppy/flexible regions of a biomolecule, and so the input 222 of the template ligand-biomolecule complex 224 may be missing atomic coordinates for certain residues. In these situations, the unresolved regions of the incomplete structure can be resolved using the missing coordinate completer 402, which can communicate with other modules, e.g., the molecular dynamics module 504 of the prediction system 200, to predict the unknown atomic coordinates.
The preparation module 210 can also include a minimizer 404 that is capable of performing energetic minimization using classical molecular mechanics forcefields. For example, the minimizer 404 can be used to energetically relax the template ligand-biomolecule structure 224 after using the hydrogen completer 400 and the missing coordinate completer 402. The minimizer 404 can also be useful when performing step 104 of the method shown in
The preparation module 210 can also include a conformational sampling module 406. The conformational sampling module 406 can be used to sample other viable three-dimensional conformations of the template ligand-biomolecule complex 224, besides the conformation provided as input 222. The conformational sampling module 406 can contain or be coupled to molecular dynamics module 504, conformation explorer 502, and/or any other module capable of identifying alternative three-dimensional conformations of the template ligand-biomolecule complex 224. Such sampling can be especially useful when the template ligand-biomolecule structure 224 is known or suspected to be floppy/flexible but the experimental technique used to generate the template ligand-biomolecule structure 224 was only capable of resolving one or some of the myriad potential structures.
The memory 202 can also store a pharmacophore matcher module 204 that is coupled to at least one processor 216. The pharmacophore matcher module 204 can be programmed to generate pharmacophores for a template ligand 704 and a target ligand 706 using pharmacophore generator 300. Pharmacophore generator 300 can include various detectors that are capable of identifying pharmacophores in a molecule; the detectors can be either default detectors pre-set in prediction system 200 or can be supplied as input 222 by a user. An aromatic detector 310 can detect pharmacophores of the aromatic group type 804. Hydrophobe detector 312 can detect pharmacophores of the hydrophobic group type 800. Positive ionizable detector 314 can detect pharmacophore groups that can become positively ionized; similarly, negative ionizable detector 316 can detect pharmacophore groups that can become negatively charged. Hydrogen bond acceptor detector 318 can detect hydrogen bond acceptor pharmacophores 802; similarly, hydrogen bond donor detector 320 can detect hydrogen bond donor pharmacophores. The pharmacophore detectors shown in
The pharmacophore matcher module 204 can also be programmed to identify one or more pharmacophore matches 816 between the pharmacophore model 806 of template ligand 704 and the pharmacophore model 808 of the target ligand 706, using pharmacophore match detector 306. Pharmacophore match detector 306 can use any number of algorithms to detect common pharmacophores. Matches (common pharmacophores and/or superimpositions) between the pharmacophore model 806 of template ligand 704 and the pharmacophore model 808 of the target ligand 706 can be communicated to the pharmacophore overlapper 602 of the docking module 208.
The target ligand 706 that is analyzed by the pharmacophore matcher module 204 can be selected from a plurality of ligand candidates stored in a target ligand database 214, where the target ligand database can be stored in memory 202 and coupled to at least one processor 216. Selection of the target ligand 706 from target ligand database 214 can comprise comparing a pharmacophore model 806 of the template ligand 704 to a pharmacophore model of each respective one of the plurality of ligand candidates in the target ligand database 214 and choosing a ligand candidate based on the RMSD of the superimposition of the pharmacophore model of the ligand candidate and the template ligand 704 (lower RMSD would indicate a better ligand candidate). The pharmacophore matcher module 204 can be used to create pharmacophore models for each ligand candidate in the target ligand database 214, and pharmacophore match detector 306 can be used to perceive common pharmacophores and create superimposition options.
The memory 202 can also store a docking module 208 that is coupled to at least one processor 216. The docking module 208 can be programmed to predict a docked ligand position of the target ligand 706 in the template ligand-biomolecule structure 224 by overlapping the pharmacophore model 808 of the target ligand 706 with the pharmacophore model 806 of the template ligand 704 while the template ligand 704 is in the binding site 702 of the biomolecule 700 (step 102 in
The docking module 208 can also be programmed to predict a re-docked ligand position of the target ligand 706 in the altered biomolecule 701 (e.g., after step 104 of the method in
The memory 202 can also store a biomolecule modification module 206 that is coupled to at least one processor 216. The biomolecule modification module 206 can be programmed to achieve an induced fit effect by modifying the atomic coordinates of the biomolecule 700 to reduce clashes 710 between the docked target ligand 706 and the biomolecule 700, thereby creating an altered ligand-biomolecule structure 230 having an altered biomolecule 701 and a docked target ligand 706. Biomolecule modification module 206 can include a clash identifier 500 that can identify energetically unfavorable interactions between biomolecule 700 and target ligand 706; the regions of the biomolecule 700 that have energetically unfavorable interactions (e.g., clash 710) are the regions of the biomolecule 700 that are most likely to undergo conformational changes due to the induced fit effect.
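A clash identifier could be approximated geometrically, as in the sketch below. The text defines clashes as energetically unfavorable interactions; this example swaps in a simpler distance test against scaled van der Waals radii, which is an assumption and not the disclosed method. The radii are Bondi values in angstroms, and the `overlap_factor` of 0.7 and the function name `find_clashes` are hypothetical.

```python
import math

# Bondi van der Waals radii in angstroms for a few common elements.
VDW_RADIUS = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def find_clashes(ligand_atoms, biomolecule_atoms, overlap_factor=0.7):
    """Return (ligand index, biomolecule index) pairs that sit closer than
    overlap_factor times the sum of their van der Waals radii."""
    clashes = []
    for i, (element_l, xyz_l) in enumerate(ligand_atoms):
        for j, (element_b, xyz_b) in enumerate(biomolecule_atoms):
            cutoff = overlap_factor * (VDW_RADIUS[element_l] + VDW_RADIUS[element_b])
            if math.dist(xyz_l, xyz_b) < cutoff:
                clashes.append((i, j))
    return clashes


# Toy coordinates: one pocket carbon crowds both ligand atoms.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
pocket = [("C", (0.0, 0.0, 1.5)), ("N", (8.0, 0.0, 0.0))]
clashes = find_clashes(ligand, pocket)
```

The biomolecule atoms that appear in the returned pairs mark the regions that would be handed to the minimizer, conformation explorer, or mutator.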
The biomolecule modification module 206 can also include various modules that are capable of resolving energetically unfavorable interactions (e.g., clash 710). For example, minimizer 404 can alleviate clashes 710 by performing energetic minimization using classical molecular mechanics forcefields to move the specific atoms in biomolecule 700 that clash with target ligand 706 (thereby creating an altered biomolecule 701). As another example, biomolecule modification module 206 can include conformation explorer 502, which can use Monte Carlo conformational searches to explore non-clashing positions of the side-chains of biomolecule 700 (e.g., rotamer optimization). As yet another example, biomolecule modification module 206 can include molecular dynamics module 504 that can typically be used after minimizer 404 has been used; molecular dynamics module 504 can use a typical molecular mechanics forcefield to simulate the biomolecule 700 with the docked target ligand 706 in the binding site 702, thereby exploring the conformational space of biomolecule 700 when target ligand 706 is docked in its active site 702. Molecular dynamics module 504 can include various sampling techniques besides simple simulation, e.g., the replica exchange technique. As yet another example, if biomolecule 700 is a protein (or another biomolecule with sidechains), biomolecule modification module 206 can include mutator 506 that can resolve clashes 710 between target ligand 706 and specific sidechains of biomolecule 700 by computationally mutating the clashing sidechains, e.g., by truncating the clashing sidechains of biomolecule 700 to alanine (alanine is a smaller amino acid that is less likely to sterically clash with a target ligand 706), thereby yielding an altered biomolecule 701.
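A Monte Carlo conformational search of the kind mentioned for conformation explorer 502 can be sketched with a Metropolis acceptance rule. Everything specific here is an assumption for illustration: the single side-chain dihedral, the toy energy with a clash penalty near 60 degrees, the temperature factor `kT`, and the function names. A real search would evaluate a molecular mechanics forcefield over many side chains.

```python
import math
import random


def monte_carlo_search(energy, start, propose, steps=500, kT=0.6, seed=0):
    """Metropolis search; returns the lowest-energy state visited."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(steps):
        candidate = propose(current, rng)
        e_candidate = energy(candidate)
        delta = e_candidate - e_current
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


def toy_energy(chi):
    """Hypothetical energy of one side-chain dihedral (degrees): a threefold
    torsion term plus a flat penalty where the side chain hits the ligand."""
    torsion = 1.0 + math.cos(math.radians(3.0 * chi))
    clash_penalty = 10.0 if abs(chi - 60.0) < 30.0 else 0.0
    return torsion + clash_penalty


def perturb(chi, rng):
    """Rotate the dihedral by a random amount of up to 30 degrees."""
    return (chi + rng.uniform(-30.0, 30.0)) % 360.0


best_chi, best_energy = monte_carlo_search(toy_energy, 60.0, perturb)
```

Because the best state is tracked separately from the current state, the returned energy can never be worse than that of the starting conformation.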
The modules shown in
The memory 202 can also store a ranking module 212 that is coupled to at least one processor 216. The ranking module 212 can be programmed to receive the structure of each target ligand-biomolecule complex 230 from the biomolecule modification module 206, and rank each target ligand-biomolecule structure 230 (comprising the altered biomolecule 701 and target ligand 706) using a scoring function. The ranking module 212 can be useful in instances where (i) the target ligand 706 has more than one structural conformation and the method shown in
The prediction system 200 represents only one embodiment of a computer prediction system within the scope of this disclosure; other embodiments may include more or less input 222, more or less output 228, and more or fewer modules and components within the software and hardware of the prediction system. In addition, it will be understood that while
In some embodiments, the induced fit docking calculations can be used to evaluate compounds in drug discovery. For example, the computational approaches described above can be used as a virtual filter for screening compounds for their suitability as a candidate for new pharmaceutical applications. Referring to
Once target ligands 706 are identified, prediction system 200 can be used to predict target ligand-biomolecule complex structures 230 using generally the techniques described above, e.g., inter alia, using pharmacophore matcher 204 and docking module 208 (step 920). Generally, the prediction calculations described above may be performed across a computer network. For example, the calculations may be performed using one or more servers that a researcher accesses via a network, such as the internet.
The predicted target ligand-biomolecule complex structures 230 are then screened (step 930), e.g., using ranking module 212 to provide a ranked list 232, in order to identify candidates for chemical analysis, which involves first synthesizing the target ligands 706 (step 940) and then assaying the synthesized target ligands 706 (steps 950 and 960). Screening of molecules can be performed as described above in step 108, e.g., by using a scoring function.
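As a non-limiting illustration, the screening of step 930 can be sketched as a ranking over scored complex structures. The sketch below assumes that each predicted complex already carries a numeric score and that a lower score is more favorable; the `PredictedComplex` record, the `ranked_list` function, and the example scores are hypothetical and are not taken from the disclosure, which does not prescribe a particular scoring function.

```python
from typing import NamedTuple

class PredictedComplex(NamedTuple):   # hypothetical record, for illustration only
    ligand_id: str        # identifier of the target ligand
    conformer_id: int     # index of the ligand conformation that was docked
    score: float          # assumed convention: lower score is more favorable

def ranked_list(complexes):
    """Keep the best-scoring complex per ligand, then sort ligands from best to worst."""
    best = {}
    for c in complexes:
        current = best.get(c.ligand_id)
        if current is None or c.score < current.score:
            best[c.ligand_id] = c
    return sorted(best.values(), key=lambda c: c.score)

# Made-up scores for three ligands, one of which has two docked conformations.
complexes = [
    PredictedComplex("ligand_A", 0, -7.1),
    PredictedComplex("ligand_A", 1, -8.4),
    PredictedComplex("ligand_B", 0, -9.0),
    PredictedComplex("ligand_C", 0, -6.2),
]
ranked = ranked_list(complexes)
```

Collapsing multiple conformations of one ligand to its best-scoring complex is one possible design choice; an implementation could instead retain every conformation in the ranked list.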
Synthesis typically includes several steps including choosing a reaction pathway to make the compound, carrying out the reaction or reactions using suitable apparatus, separating the reaction product from the reaction mixture, and purifying the reaction product.
Chemical composition and purity can be checked to ensure the correct compounds are assayed.
Generally, multiple different assays can be performed on each target ligand 706. For example, in step 950, primary assays can be performed on all synthesized target ligands 706. The primary assays can be high throughput assays that provide a further screen for the target ligands 706 rather than performing every necessary assay on every target ligand 706 selected from the computational screening step. Secondary assays (step 960) are performed on those molecules that demonstrate favorable results from the primary assays. Secondary assays can include both in vitro and in vivo assays to assess, e.g., selectivity and/or liability. Both the primary and secondary assays can provide information useful for identifying additional target ligands 706 for further computational screening.
Target ligands 706 with favorable results from the secondary assays can be identified as suitable candidates for further preclinical evaluation (step 970).
Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively, or in addition, the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus. The computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The term “data processing apparatus” refers to data processing hardware and encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can also be or further include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). The apparatus can optionally include, in addition to hardware, code that creates an execution environment for computer programs, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program, which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code, can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub-programs, or portions of code. A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for the execution of a computer program include, by way of example, general or special purpose microprocessors or both, or any other kind of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device, e.g., a universal serial bus (USB) flash drive, to name just a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
To provide for interaction with a user, embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) or LED (light emitting diode) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's device in response to requests received from the web browser.
Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), e.g., the Internet.
The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits data, e.g., an HTML page, to a user device, e.g., for purposes of displaying data to and receiving user input from a user interacting with the user device, which acts as a client. Data generated at the user device, e.g., a result of the user interaction, can be received from the user device at the server.
An example of one such type of computer is shown in
The memory 1220 stores information within the system 1200. In one implementation, the memory 1220 is a computer-readable medium. In one implementation, the memory 1220 is a volatile memory unit. In another implementation, the memory 1220 is a non-volatile memory unit.
The storage device 1230 is capable of providing mass storage for the system 1200. In one implementation, the storage device 1230 is a computer-readable medium. In various different implementations, the storage device 1230 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
The input/output device 1240 provides input/output operations for the system 1200. In one implementation, the input/output device 1240 includes a keyboard and/or pointing device. In another implementation, the input/output device 1240 includes a display unit for displaying graphical user interfaces.
While this specification contains many specific implementation details, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Claims
1. Canceled
2. A rational drug design method, comprising:
- identifying a plurality of candidate ligands for bonding to a biomolecular target, the target ligands being candidates for a drug associated with modifying a function of the biomolecular target;
- predicting, using a computer system, a plurality of target ligand-biomolecule structures each comprising a corresponding candidate ligand of the plurality of candidate ligands and the biomolecular target with the corresponding candidate ligand being in a docked position in a binding site of the biomolecular target, each prediction comprising: receiving, by the computer system, a template ligand-biomolecule structure, the template ligand-biomolecule structure comprising a template ligand docked in the binding site of the biomolecular target; comparing, using the computer system, a pharmacophore model of the template ligand to a pharmacophore model of the corresponding candidate ligand; overlapping, using the computer system, the pharmacophore model of the corresponding candidate ligand with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecular target; and predicting the docked position of the corresponding candidate ligand in the binding site of the biomolecular target based on a position of the pharmacophore model of the corresponding candidate ligand when overlapped with the pharmacophore model of the template ligand; and
- providing, using the computer system, a ranked list of the plurality of candidate ligands.
3. The method of claim 2, further comprising receiving a plurality of template ligand-biomolecule structures, each template ligand-biomolecule structure having a different template ligand docked in the binding site of the biomolecule, and generating the pharmacophore model of the template ligand by combining information from each of the template ligands from the plurality of template ligand-biomolecule structures.
4. The method of claim 2, wherein at least one of the plurality of candidate ligands has more than one structural conformation in its unbound state, and the docked position of the corresponding candidate ligand in the binding site of the biomolecule is predicted by enumerating a set of potential candidate ligand conformations and overlapping a respective pharmacophore model of the candidate ligand for each of the potential candidate ligand conformations with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule.
5. The method of claim 4, wherein predicting the docked position of the corresponding candidate ligand in the binding site of the biomolecule comprises ignoring at least one clash between the corresponding candidate ligand conformations' atomic coordinates and the biomolecule's atomic coordinates.
6. The method of claim 5, further comprising, for each candidate ligand conformation, modifying atomic coordinates of the biomolecule to reduce clashes between the docked candidate ligand conformations' atomic coordinates and the biomolecule's atomic coordinates, thereby creating an altered ligand-biomolecule structure comprising the docked candidate ligand and an altered biomolecule.
7. The method of claim 6, further comprising, predicting a re-docked position of each candidate ligand conformation by predicting each candidate ligand conformation's position in the binding site of the altered biomolecule; and
- for each candidate ligand conformation, modifying atomic coordinates of the altered biomolecule to reduce clashes between the atomic coordinates of the candidate ligand conformation's re-docked position and the atomic coordinates of the altered biomolecule, thereby creating a re-altered ligand-biomolecule structure comprising a re-docked candidate ligand and a re-altered biomolecule.
8. The method of claim 7, wherein providing the ranked list comprises ranking each altered and re-altered ligand-biomolecule structure using a scoring function.
9. The method of claim 8, wherein the providing the ranked list comprises identifying, using the computer system, a subset of high-ranking candidate ligands corresponding to candidate ligands having a threshold value for an empirical activity.
10. The method of claim 9, wherein the ranked list of target ligands that includes the target ligand based on the predicted dock position and synthesizing one or more target ligands from the ranked list.
11. The method of claim 3, further comprising selecting, based on the ranked list, one or more of the plurality of candidate ligands for synthesis and assaying.
12. The method of claim 11, further comprising synthesizing the one or more selected candidate ligands to provide one or more synthesized candidate ligands.
13. The method of claim 12, further comprising performing at least one assay of the one or more synthesized candidate ligands.
14. The method of claim 13, further comprising identifying a clinical candidate from the ranked list of candidate ligands based on the at least one assay.
15. A computer system, comprising:
- at least one computer processor and a computer memory coupled to the at least one computer processor;
- a preparation module, stored in the computer memory, wherein the preparation module is programmed to receive information identifying a plurality of candidate ligands and a template ligand-biomolecule structure comprising a template ligand and a biomolecule;
- a pharmacophore matcher module, stored in the computer memory, wherein the pharmacophore matcher module is programmed to identify a pharmacophore match between the template ligand and each of the plurality of candidate ligands by comparing the pharmacophore model of the template ligand to the pharmacophore model of a corresponding candidate ligand of the plurality of candidate ligands; and
- a docking module, stored in computer memory, wherein the docking module is programmed to predict, for the corresponding candidate ligand, a docked ligand position of the corresponding candidate ligand in the template ligand-biomolecule structure by overlapping the pharmacophore model of the corresponding candidate ligand with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule; and
- a ranking module, stored in the computer memory, wherein the ranking module is programmed to rank each altered ligand-biomolecule structure using a scoring function and output the ranked list.
16. The computer system recited in claim 15, wherein the docking module is programmed to ignore at least one clash between the corresponding candidate ligand's atomic coordinates and the biomolecule's atomic coordinates when predicting the docked ligand position.
17. The computer system recited in claim 15, further comprising a biomolecule modification module, stored in the computer memory, wherein the biomolecule modification module is programmed to modify atomic coordinates of the biomolecule to reduce clashes between the docked ligand position's atomic coordinates and the biomolecule's atomic coordinates, thereby creating an altered ligand-biomolecule structure having an altered biomolecule and a docked candidate ligand.
18. The computer system recited in claim 17, wherein at least one of the candidate ligands has more than one structural conformation, and wherein the preparation module is programmed to enumerate a plurality of potential candidate ligand structural conformations for the at least one candidate ligand, and each of the enumerated potential candidate ligand structural conformations is processed by the docking module and the biomolecule modification module.
19. A non-transitory computer readable storage medium comprising a computer readable program, wherein the computer readable program when executed on a computer causes the computer to rank a plurality of candidate ligands for selection for synthesis and assaying in a rational drug design method, the ranking being based on predicting a docked position for each of the plurality of candidate ligands in a binding site of a biomolecule, each prediction comprising causing the computer to perform the steps of:
- receiving information identifying a corresponding ligand of the plurality of candidate ligands and a template ligand-biomolecule structure, using a preparation module stored in computer memory and coupled to at least one computer processor, the template ligand-biomolecule structure comprising a template ligand docked in the binding site of the biomolecule;
- identifying a pharmacophore match between the template ligand and the corresponding candidate ligand, using a pharmacophore matcher module stored in the computer memory and coupled to the at least one computer processor, wherein the identifying of the pharmacophore match further comprises comparing a pharmacophore model of the template ligand to a pharmacophore model of the corresponding candidate ligand; and
- predicting a docked ligand position of the target ligand, using a docking module stored in the computer memory and coupled to the at least one computer processor, wherein the docking module predicts the docked position of the corresponding candidate ligand in the binding site of the biomolecule based on a position of the pharmacophore model of the corresponding candidate ligand when overlapped with the pharmacophore model of the template ligand while the template ligand is in the binding site of the biomolecule.
20. The computer readable storage medium as recited in claim 19, wherein the plurality of candidate ligands are selected from a candidate ligand database, each of the plurality of candidate ligands being different from the template ligand, and wherein selecting the plurality of candidate ligands comprises comparing the pharmacophore model of the template ligand to a pharmacophore model of each respective one of the plurality of candidate ligands.
21. The computer readable storage medium as recited in claim 19, wherein the step of predicting an initial docked position comprises ignoring at least one clash between the corresponding candidate ligand's atomic coordinates and the biomolecule's atomic coordinates.
Type: Application
Filed: Apr 10, 2023
Publication Date: Aug 3, 2023
Inventor: Edward Blake Miller (New York, NY)
Application Number: 18/132,936