Computational methods are commonly used to predict protein-drug (or protein-ligand) interactions and are crucial tools in the design and discovery of new drugs. These methods typically search for regions with favorable energy that geometrically fit the ligand, and then rank them as potential binding sites. In this work, we develop sampling-based motion planning methods to help identify protein-ligand binding sites, to compare ligands by how well suited they are to bind at a particular site, and to assess how accessible a particular binding site on a protein’s surface is for a given ligand.

To view this as a motion planning problem, we consider the protein to be the environment and treat the ligand as the moving object (robot) whose goal is to reach the binding site. This approach allows us to consider the flexibility of the ligand and also the protein, if desired.
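
As a rough illustration of this formulation, the sketch below treats protein atoms as static obstacles and checks whether a ligand placement is collision-free. It is a minimal sketch under simplifying assumptions, not the planner's actual validity test: atoms are modeled as spheres, the ligand is rigid and only translated (flexibility is ignored), and the name `is_valid` and all coordinates are hypothetical.

```python
import math

def is_valid(ligand_atoms, protein_atoms, offset, clearance=0.0):
    """Return True if the translated ligand does not overlap any protein atom.

    Atoms are (x, y, z, radius) tuples; offset is the ligand translation.
    Assumption: a rigid ligand and a simple sphere-overlap test.
    """
    for lx, ly, lz, lr in ligand_atoms:
        moved = (lx + offset[0], ly + offset[1], lz + offset[2])
        for px, py, pz, pr in protein_atoms:
            if math.dist(moved, (px, py, pz)) < lr + pr + clearance:
                return False  # ligand atom collides with the environment
    return True

# Hypothetical toy data: one protein atom and one ligand atom.
protein = [(0.0, 0.0, 0.0, 1.5)]
ligand = [(0.0, 0.0, 0.0, 1.0)]

inside = is_valid(ligand, protein, offset=(0.0, 0.0, 0.0))
outside = is_valid(ligand, protein, offset=(5.0, 0.0, 0.0))
```

A sampling-based planner would call a test of this kind on every sampled configuration and on intermediate points along each candidate edge.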

Accessibility of Protein Binding Sites

Binding site accessibility is an important feature often ignored by methods that classify binding based solely on the energetic or geometric properties of the bound protein-ligand complex. To evaluate accessibility, we transform the ligand accessibility problem into a robot motion planning problem in which the ligand is modeled as a flexible agent whose task is to travel from outside the protein to its binding site. Ligands are small molecules that interact with (bind to) a protein to trigger or inhibit the protein’s activity. This mechanism resembles a lock and key, and similar ligands can bind to the same receptor. For example, caffeine binds to adenosine receptors on brain cells, blocking adenosine’s role in regulating sleep and local neural excitability. Caffeine is thus able to ‘fool’ adenosine receptors and speed up neural activity.

Predicting protein-ligand interaction, or binding, is important for drug discovery research. Specifically, the research aims to answer two questions: (1) Can the ligand access the binding site of the target protein? (2) Is the protein-ligand interaction stable? We use motion planning algorithms to study binding site accessibility. The protein is modeled as the environment and the ligand as the moving object, or robot, whose goal is to reach the binding site.

Using skeleton-guided path planning algorithms to analyze the accessibility of buried binding sites:

We use Rapidly-exploring Random Graphs (RRG) coupled with mean curvature workspace skeletons to quickly and thoroughly explore a protein environment and find valid paths for ligand motion. We annotate the mean curvature skeleton of the protein with energy information and then bias planning to explore regions with favorable energy values first. We used our algorithm to analyze the accessibility of the protein Haloalkane dehalogenase (dhaA) from the bacterium Rhodococcus rhodochrous, which is used in soil inoculation. The protein has multiple mutants whose binding activity is regulated by the accessibility of the buried binding site.
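
The energy bias can be pictured as a best-first traversal of the annotated skeleton. The sketch below is only an illustration under stated assumptions: the skeleton is a plain adjacency dictionary, each skeleton node carries a single scalar energy, and lower energy is treated as more favorable. The function name `energy_biased_order` and the toy skeleton are hypothetical, and the RRG growth that would follow the skeleton is omitted.

```python
import heapq

def energy_biased_order(skeleton, energy, start):
    """Visit skeleton nodes best-first, expanding the lowest-energy
    frontier node next. Returns the visiting order."""
    visited = set()
    order = []
    frontier = [(energy[start], start)]
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        order.append(node)
        for neighbor in skeleton[node]:
            if neighbor not in visited:
                heapq.heappush(frontier, (energy[neighbor], neighbor))
    return order

# Hypothetical toy skeleton: two tunnels lead from the surface to a buried site.
skeleton = {
    "surface": ["tunnel_a", "tunnel_b"],
    "tunnel_a": ["surface", "site"],
    "tunnel_b": ["surface", "site"],
    "site": ["tunnel_a", "tunnel_b"],
}
energy = {"surface": 0.0, "tunnel_a": 5.0, "tunnel_b": -1.0, "site": -3.0}

order = energy_biased_order(skeleton, energy, "surface")
```

In this toy case the lower-energy tunnel is explored before the higher-energy one, so the buried site is reached without first spending effort in the unfavorable region.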

[Figures: protein skeleton; planning vs. Caver comparison]

Predicting Protein-Ligand Binding Sites

Using OBPRM and Haptic User Input to Search for Binding Sites:

One of the challenges in haptic research is the very high update rate (about 1 kHz) required for force feedback, which usually limits applications to very simple environments. We instead use a grid-based force calculation algorithm to approximate the force feedback, which lets us provide realistic feedback even in complex proteins.
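
One common way to realize a grid-based approximation is to precompute force vectors on a regular 3D grid and answer each haptic query by trilinear interpolation, so the per-update cost is constant regardless of protein size. The sketch below assumes that scheme; the source does not specify the interpolation used, and the names `build_force_grid`, `lookup_force`, and the toy spring-like force field are hypothetical.

```python
def build_force_grid(force_fn, origin, spacing, shape):
    """Precompute force vectors at every grid point (done once, offline)."""
    nx, ny, nz = shape
    return [[[force_fn((origin[0] + i * spacing,
                        origin[1] + j * spacing,
                        origin[2] + k * spacing))
              for k in range(nz)] for j in range(ny)] for i in range(nx)]

def lookup_force(grid, origin, spacing, shape, point):
    """Approximate the force at `point` by trilinear interpolation.
    Points outside the grid are clamped to its boundary.
    Assumes at least two grid points along each axis."""
    index, frac = [], []
    for axis in range(3):
        t = (point[axis] - origin[axis]) / spacing
        t = min(max(t, 0.0), shape[axis] - 1.0)
        i = min(int(t), shape[axis] - 2)
        index.append(i)
        frac.append(t - i)
    force = [0.0, 0.0, 0.0]
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                weight = ((frac[0] if di else 1.0 - frac[0]) *
                          (frac[1] if dj else 1.0 - frac[1]) *
                          (frac[2] if dk else 1.0 - frac[2]))
                corner = grid[index[0] + di][index[1] + dj][index[2] + dk]
                for axis in range(3):
                    force[axis] += weight * corner[axis]
    return tuple(force)

# Hypothetical toy field: a spring pulling toward the origin.
def spring_force(p):
    return (-p[0], -p[1], -p[2])

origin, spacing, shape = (-1.0, -1.0, -1.0), 0.5, (5, 5, 5)
grid = build_force_grid(spring_force, origin, spacing, shape)
force = lookup_force(grid, origin, spacing, shape, (0.25, 0.5, 0.75))
```

Each lookup touches only the eight surrounding grid points, which is what makes a roughly 1 kHz update loop plausible; accuracy then depends on the grid spacing relative to how quickly the true force field varies.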

Our approach to the ligand binding problem is as follows:

  • Generate binding site candidates
  • Create a roadmap using these candidates
  • Recognize binding sites

To generate candidates, we used configurations created by our automated planner (OBPRM) as well as configurations collected from haptic user input. Since these configurations may have higher potentials than desired (a binding site should have low potential), we pushed each configuration to a nearby local minimum. We then used the pushed configurations to create a roadmap and chose its largest connected component. Accessibility is an important issue in ligand binding, and the larger a connected component, the more likely its nodes are accessible from the outside world. Within the largest connected component, we used the low-energy configurations as our candidate sites, and then evaluated each candidate with our scoring function.
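
Two of these steps, pushing a configuration to a nearby local minimum and selecting the largest connected component, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method actually used: the push is a greedy fixed-step descent on a user-supplied energy function, the roadmap is an undirected edge list, and the names `push_to_local_minimum`, `largest_connected_component`, and the toy energy function are hypothetical.

```python
from collections import deque

def push_to_local_minimum(config, energy_fn, step=0.25, max_iters=1000):
    """Greedy descent: repeatedly take the single-coordinate step of size
    `step` that lowers the energy most, until no step improves it."""
    current = tuple(config)
    current_e = energy_fn(current)
    for _ in range(max_iters):
        best, best_e = current, current_e
        for axis in range(len(current)):
            for delta in (-step, step):
                cand = current[:axis] + (current[axis] + delta,) + current[axis + 1:]
                cand_e = energy_fn(cand)
                if cand_e < best_e:
                    best, best_e = cand, cand_e
        if best == current:
            break  # no neighbor is lower: local minimum at this resolution
        current, current_e = best, best_e
    return current, current_e

def largest_connected_component(nodes, edges):
    """Return the node set of the largest connected component."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, largest = set(), set()
    for n in nodes:
        if n in seen:
            continue
        component, queue = {n}, deque([n])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        seen |= component
        if len(component) > len(largest):
            largest = component
    return largest

# Hypothetical toy energy with a single minimum at (1, -2).
def toy_energy(c):
    return (c[0] - 1.0) ** 2 + (c[1] + 2.0) ** 2

pushed, pushed_energy = push_to_local_minimum((0.0, 0.0), toy_energy)
component = largest_connected_component([0, 1, 2, 3, 4], [(0, 1), (1, 2), (3, 4)])
```

Candidate sites would then be the lowest-energy nodes inside `component`; how many to keep is a tuning choice the text above does not fix.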

Our scoring function is based on the average potential energy of a local roadmap around any given configuration.
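
One possible reading of this score is sketched below: take the roadmap nodes within a few hops of the candidate and average their potential energies, so that a candidate sitting in a broad low-energy basin scores better than an isolated low-energy point. The hop-based neighborhood, the `radius` parameter, and the name `local_roadmap_score` are assumptions made for illustration; the local roadmap could equally be built by sampling around the candidate.

```python
from collections import deque

def local_roadmap_score(adjacency, energy, node, radius=2):
    """Average energy over all roadmap nodes within `radius` hops of `node`
    (including `node` itself). Lower scores indicate better candidates."""
    seen = {node}
    queue = deque([(node, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == radius:
            continue  # do not expand beyond the local neighborhood
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return sum(energy[n] for n in seen) / len(seen)

# Hypothetical toy roadmap: a path a - b - c - d with made-up energies.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
energy = {"a": -4.0, "b": -2.0, "c": 0.0, "d": 6.0}

score_a = local_roadmap_score(adjacency, energy, "a")
score_d = local_roadmap_score(adjacency, energy, "d")
```

Here `score_a` is lower than `score_d`, so configuration `a` would be ranked as the more promising candidate site.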
