Effective Labeling of Molecular Surface Points for Cavity Detection and Location of Putative Binding Sites

Mary Ellen Bock (1), Claudio Garutti (2), Concettina Guerra (2,3)

1: Dept. of Statistics, Purdue University, USA
2: Dept. of Information Engineering, University of Padova, Italy
3: College of Computing, Georgia Institute of Technology, USA
[email protected], [email protected], [email protected]

Abstract

We present a method for detecting and comparing cavities on protein surfaces. It is based on a representation of the protein structures by a collection of spin-images and their associated spin-image profiles. The method is used to find a surface region in a cavity of one protein that is geometrically similar to a surface region in a cavity of another protein, which would indicate that the two regions likely bind to the same ligand.

• The horizontal profile h(s) of a spin-image s is a vector whose ith element h(s)(i) is the number of contiguous zero-elements in row i of s starting at column 0 and ending at the first non-zero cell along row i.
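As an illustration of the definition above, here is a minimal Python sketch of the horizontal profile. It assumes the spin-image is stored as a list of rows with column 0 first, and that a row with no non-zero cell contributes its full length; both choices are assumptions made for the example, not details given on the poster.

```python
def horizontal_profile(s):
    """Return h(s): for each row of the spin-image s, the number of
    contiguous zero cells starting at column 0."""
    profile = []
    for row in s:
        count = 0
        for cell in row:
            if cell != 0:
                break          # first non-zero cell ends the run
            count += 1
        # Assumption: an all-zero row yields the row length.
        profile.append(count)
    return profile


example = [
    [0, 0, 3, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(horizontal_profile(example))
```

The function name `horizontal_profile` and the list-of-rows layout are hypothetical; any 2D array type with the same row order would work equally well.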

Let Nb = # blocked points and Nu = # unblocked points. In general:

• in a protein, Nb << Nu
• in a cavity, Nb >> Nu
• in a binding site, Nb >> Nu

• The data set consists of 464 binding sites on 244 proteins from [3], of which 112 are enzymes (45.9%), 129 are non-enzymes (52.9%), and three are "hypothetical" proteins (1.2%).

Figure 2: Determination of the sphere using the spin-image horizontal profile.

2.1 Cavity detection

2. Methods

3.1 Cavity Detection

• The sphere S(s) of a spin-image s is the largest semicircle tangent to the lower-left corner of pixel (0, 0), with radius R and center C on the normal n, such that the sphere contains only empty pixels.

1. Surface Characterization

The molecular surface is represented as a collection of spin-images, each associated with a Connolly surface point P and its normal n. Let (P, n) be the coordinate system with origin at the surface point P and with axis along its normal n. In this system, every surface point Q is represented by two coordinates: the perpendicular distance α of Q to n, and the signed perpendicular distance β of Q to the plane T through P perpendicular to n. The spin-image s(P) of a point P is a two-dimensional histogram of the quantized coordinates (α, β) of the surface points with respect to (P, n). If βr = βmax − βmin, then the number of rows is h = ⌈βr/ε⌉ and the number of columns is k = ⌈αmax/ε⌉, where ε is the pixel size. A surface point is blocked if its spin-image contains a non-zero pixel with positive β; otherwise it is unblocked.
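The construction above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the normal n is taken to be a unit vector, row 0 of the histogram corresponds to βmin, values on the upper boundary are clamped into the last row or column, and the blocked test is applied to the unquantized β rather than to pixels. The function names are hypothetical.

```python
import math


def spin_coordinates(P, n, Q):
    """(alpha, beta) of Q in the (P, n) system; n is assumed to be unit length."""
    d = [q - p for q, p in zip(Q, P)]
    beta = sum(di * ni for di, ni in zip(d, n))        # signed distance to plane T
    alpha_sq = sum(di * di for di in d) - beta * beta  # squared distance to the normal line
    return math.sqrt(max(alpha_sq, 0.0)), beta


def spin_image(P, n, points, eps):
    """2D histogram of quantized (alpha, beta) with pixel size eps."""
    coords = [spin_coordinates(P, n, Q) for Q in points]
    beta_min = min(b for _, b in coords)
    beta_max = max(b for _, b in coords)
    alpha_max = max(a for a, _ in coords)
    rows = max(1, math.ceil((beta_max - beta_min) / eps))  # h = ceil(beta_r / eps)
    cols = max(1, math.ceil(alpha_max / eps))              # k = ceil(alpha_max / eps)
    image = [[0] * cols for _ in range(rows)]
    for a, b in coords:
        r = min(int((b - beta_min) / eps), rows - 1)       # clamp boundary values
        c = min(int(a / eps), cols - 1)
        image[r][c] += 1
    return image


def is_blocked(P, n, points, tol=1e-9):
    """Simplified test: some surface point lies on the positive-beta side of T."""
    return any(spin_coordinates(P, n, Q)[1] > tol for Q in points)


P, n = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, -1.0)]
print(spin_image(P, n, pts, eps=1.0), is_blocked(P, n, pts))
```

In practice the surface points would come from a Connolly surface computation, which is outside the scope of this sketch.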

3. Data and results

Figure 1: Statistics of blocked points of proteins and binding sites for a non-redundant data set of 244 proteins defined in [3].

BUILDSPHERE(s)
  R ← |h(s)|/2
  for j = 2, ..., |h(s)|
  begin
    i ← h(s)(j)
    if (i ≥ j)
      then R ← min{R, (i² + j²)/2j}
      else R ← min{R, (i² + (j − 1)²)/2(j − 1)}
    C ← (0, (R + 1))
  end
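The pseudocode above translates fairly directly into Python. The sketch below assumes the profile is passed as a list whose first element is h(s)(1), so the 1-based index j of the pseudocode maps to `h[j - 1]`. It also sets C once after the loop; since C depends only on the final R, this gives the same result as updating it in every iteration.

```python
def build_sphere(h):
    """Return (R, C) for a horizontal profile h, following BUILDSPHERE.

    h[j - 1] holds h(s)(j); this 0-based storage is an assumption.
    """
    size = len(h)
    R = size / 2
    for j in range(2, size + 1):
        i = h[j - 1]
        if i >= j:
            # Circle through the origin, centered on the normal, touching (i, j).
            R = min(R, (i * i + j * j) / (2 * j))
        else:
            R = min(R, (i * i + (j - 1) ** 2) / (2 * (j - 1)))
    C = (0, R + 1)
    return R, C


print(build_sphere([5, 5, 5, 5]))
print(build_sphere([3, 0, 3, 3]))
```

The bound (i² + j²)/2j follows from requiring a circle centered at (0, R) that passes through the origin to also pass through the point (i, j): i² + (j − R)² = R².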

• The rank of a cavity is determined by its size: the largest cavity has rank 1, the second largest has rank 2, and so on.

3.2 Finding similar binding sites on two proteins

1. For a given protein surface, determine the set of blocked points B.
2. Compute h(s(b)) for all b ∈ B.
3. Determine B′ = {b ∈ B : |h(s(b))| < 10}.
4. For all b ∈ B′, run BUILDSPHERE(s(b)).
5. Determine B″ = {b ∈ B′ : R(S(s(b))) < 1}.
6. Build the undirected graph G = (V, E), where v ∈ V ⇔ b ∈ B″, and e = (vi, vj) ∈ E ⇔ dist(Ci, Cj) < Ri + Rj, where Ci = C(S(s(bi))) and Ri = R(S(s(bi))).
7. Find the connected components G1, ..., Gn of G using Breadth First Search.
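Steps 6 and 7 can be sketched as follows. The example assumes each retained point is already summarized by a sphere given as a (center, radius) pair with the center expressed in a common 3D coordinate frame; that representation, the function name, and the choice to measure cavity size by the number of vertices in a component are all assumptions made for illustration.

```python
import math
from collections import deque


def cavity_components(spheres):
    """Group overlapping spheres into connected components using BFS.

    spheres: list of (center, radius), center being an (x, y, z) tuple.
    Returns lists of sphere indices, largest component first.
    """
    count = len(spheres)
    adjacency = [[] for _ in range(count)]
    for a in range(count):
        center_a, radius_a = spheres[a]
        for b in range(a + 1, count):
            center_b, radius_b = spheres[b]
            # Edge when dist(Ci, Cj) < Ri + Rj, i.e. the spheres overlap.
            if math.dist(center_a, center_b) < radius_a + radius_b:
                adjacency[a].append(b)
                adjacency[b].append(a)

    seen = [False] * count
    components = []
    for start in range(count):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in adjacency[v]:
                if not seen[w]:
                    seen[w] = True
                    queue.append(w)
        components.append(component)

    # Assumed ranking: more vertices means a larger cavity (rank 1 first).
    components.sort(key=len, reverse=True)
    return components


spheres = [((0, 0, 0), 1.0), ((1.5, 0, 0), 1.0), ((10, 0, 0), 1.0)]
print(cavity_components(spheres))
```

The pairwise overlap test is quadratic in the number of spheres; for large surfaces a spatial grid or k-d tree would reduce the number of distance computations.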

2.2 Finding similar binding sites on two proteins

1. Build the spin-image representation of the surface points of the two proteins.
2. For each protein, find the surface cavities and select the largest one(s).
3. Compare pairs of cavities, one per protein, by identifying and grouping sets of corresponding points based on the correlation of their associated spin-images, using MolLoc [1] [2].
4. Return the regions on the two cavities that are most similar.

Given a protein P, a binding site B on P, and the set of atoms S identified on P in the comparison, we define coverage = |S ∩ B| / |B| and accuracy = |S ∩ B| / |S|.

Using just the cavities instead of the whole surfaces:

• Execution times are reduced from 1–2 hours to a few minutes or even seconds.
• Coverage and accuracy improve by up to 21%.
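The two measures are simple set ratios. A minimal Python sketch, assuming atoms are identified by hashable labels such as serial numbers, and returning 0.0 when a denominator set is empty (a convention chosen here, not taken from the poster):

```python
def coverage_accuracy(S, B):
    """Return (coverage, accuracy) for identified atoms S and binding site B."""
    S, B = set(S), set(B)
    common = len(S & B)
    coverage = common / len(B) if B else 0.0  # |S ∩ B| / |B|
    accuracy = common / len(S) if S else 0.0  # |S ∩ B| / |S|
    return coverage, accuracy


identified = {1, 2, 3, 4}
binding_site = {3, 4, 5, 6, 7}
print(coverage_accuracy(identified, binding_site))
```

Coverage penalizes binding-site atoms that were missed, while accuracy penalizes identified atoms that lie outside the binding site.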

Figure 3: Number of atoms of the binding sites versus number of atoms of the ligands for all 244 proteins of the data set. The dotted line is the least-squares line.

• The larger the number of atoms of the binding site, the better the rank of the corresponding cavity.
• 76% of the binding sites lie in one of the four biggest cavities.

References

[1] M.E. Bock, G. Cortelazzo, C. Ferrari and C. Guerra (2005). Identifying similar surface patches on proteins using a spin-image surface representation. Proc. Combinatorial Pattern Matching CPM 2005, 417–428.
[2] M.E. Bock, C. Garutti and C. Guerra (2007). Discovery of Similar Regions on Protein Surfaces. J. Comp. Biol., 14(3):285–299.
[3] F. Glaser, R.J. Morris, R.J. Najmanovich, R.A. Laskowski and J.M. Thornton (2006). A Method for Localizing Ligand Binding Pockets in Protein Structures. PROTEINS: Structure, Function, and Bioinformatics, 62:479–488.

6th Annual International Conference on Computational Systems Bioinformatics CSB2007, 13-17 August 2007, University of California, San Diego

Figure 4: Distribution of binding sites by cavity rank and # atoms of binding site.

Pdb ID    # residues in    Coverage      Coverage             Accuracy
          binding site     MolLoc [2]    Cavity comparison    Cavity comparison
1atp      23               78%           91%                  80%
1phk      26               69%           90%                  76%
1atp      23               70%           78%                  75%
1csn      26               62%           80%                  91%
1atp      23               26%           34%                  100%
1mjh:B    25               24%           32%                  88%
1atp      23               39%           56%                  92%
1hck      24               42%           58%                  87%
1atp      23               43%           60%                  93%
1nsf      23               35%           43%                  76%

Table 1: Comparison with results obtained with MolLoc [2].
