Micron, Vol. 25, No. 5, pp. 439-446, 1994

Pergamon

Copyright © 1994 Elsevier Science Ltd Printed in Great Britain. All rights reserved 0968-4328/94 $7.00+ 0.00

0968-4328(94)00033-6

Testing the Quality of Electron Microscope Mapping Data for DNA Molecules with Sequence-specific Ligands

ALEXEY A. PODTELEZHNIKOV, ALEXEY V. KURAKIN, ALEXANDER V. VOLOGODSKII and DMITRY I. CHERNY*

Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov's Sq., 123182 Moscow, Russia

(Received 12 February 1994; Revised 11 July 1994)

Abstract. A procedure for testing electron microscope (EM) mapping data for DNA molecules with site-specifically bound ligands is suggested. Because the ends of DNA molecules are difficult to distinguish on electron micrographs, the true orientations of the molecules are not known. This makes it difficult to obtain correct maps by alignment and complicates checking the validity of the maps. For these reasons a computer simulation of the EM study of double-stranded DNA molecules with site-specifically bound ligands was carried out. Knowledge of the true orientations of the simulated DNA molecules allowed us to examine their final orientations after alignment. We used the number of improperly oriented molecules as the quantitative measure of map quality. Detailed investigation based on this parameter permitted us to formulate a criterion for map validity and to suggest a procedure for testing the alignment of real DNA molecules. This procedure involves multiple randomization of the initial orientations of the DNA molecules and detailed analysis of the final maps. Most of the molecular, statistical and experimental parameters inherent to EM investigation of site-specific binding, such as the number of specific binding sites (N), the mean number of bound ligands (A), the length of the DNA molecules (L), the specific/non-specific binding ratio (K) and the relative standard deviation of DNA molecule lengths ($H_L$), were tested for their influence on the quality of EM mapping data. An empirical equation for the limiting values of these parameters has been found, allowing us to predict the success of EM mapping.

Key words: Electron microscopy, DNA, DNA-protein interaction, DNA-ligand interaction, mapping, alignment.

* Corresponding author.

INTRODUCTION

Electron microscopy has proved to be an extremely valuable method for studying site-specific interactions of proteins and synthetic oligonucleotides with double-stranded DNA (Giacomoni et al., 1977; Kadesh et al., 1980; Cherny and Alexandrov, 1982; Kramer et al., 1987; Theveny et al., 1987a; Mignotte et al., 1988; Johannssen, 1988; Kurakin et al., 1991; Le Cam et al., 1991; Cherny et al., 1993a,b). EM allows us to throw light on the process of specific complex formation and to determine the positions of the specific binding sites (Vollenweider and Szybalskii, 1978; Cherny and Alexandrov, 1982; Mignotte et al., 1988; Kramer et al., 1987; Le Cam et al., 1991) and the affinities of the ligand for the specific sites over a range of two orders of magnitude (Williams and Chamberlin, 1977; Kadesh et al., 1980; Cherny and Alexandrov, 1982). In addition, it allows the determination of the absolute values of the association and dissociation rate constants (Williams and Chamberlin, 1977; Giacomoni et al., 1977; Kadesh et al., 1980) and the detection of local distortions of the DNA helix in the binding site, if present (Travers, 1989; Le Cam et al., 1991). Recently it has been demonstrated that EM methods provide tools for gene and genome mapping (Kurakin et al., 1991; Cherny et al., 1993a,b; Revet et al., 1993; Cherny et al., 1994).

The first step in the analysis of the site-specific binding of any ligand visible in the EM to DNA is the determination of the positions of the specific sites along the DNA molecule (i.e. mapping). This information can be obtained from the binding map, which consists of peaks of various heights and widths. The center of every peak corresponds to the position of a specific site (or of several sites if they are positioned too close together), while the size of the peak is assumed to correspond to the affinity of the ligand for the particular site. If a finite number of these molecules is represented as arrays of 0s and 1s, where 1s correspond to the DNA-ligand complexes, the binding map can be obtained as their sum in the proper orientations.

However, several problems exist with obtaining the true binding map. The ends of DNA molecules are indistinguishable on electron micrographs; in addition, there are non-specific binding, variation of the ligand's affinity for different specific sites, and experimental errors of measurement. Taken together, these factors result in some ambiguity of EM mapping and pose a problem for alignment validity. To avoid this ambiguity, different EM mapping techniques have been described. One of them involved specific labeling of only one end of the DNA molecules (Theveny and Revet, 1987b). In another method the authors used two sets of linearized DNA fragments cut at two different unique sites (Naumova et al., 1981). However, these techniques have been tested only on short DNA fragments, and it is practically impossible to apply them to long pieces of genomic DNA.


Common analysis of the EM data involves a multi-step comparison of DNA molecules in two orientations (forward and backward) for their best fit, which is normally performed on a computer with the aid of homemade software. A number of algorithms for the alignment of partially denatured DNA molecules have been described (Young et al., 1974; Borovik et al., 1980; van Dijken and Coetzee, 1981) and analyzed for reproducibility, with reliable results (van Dijken and Coetzee, 1981). Over a period of time we have used electron microscopy to study different features of DNA and its complexes with sequence-specific ligands (Borovik et al., 1980; Naumova et al., 1981; Cherny and Alexandrov, 1982; Lyamichev et al., 1983; Kurakin et al., 1991; Cherny et al., 1993a,b). For these studies we developed homemade software with an alignment algorithm practically identical to that used in this work, thus accumulating considerable experience in the specific field of EM mapping. Occasionally we encountered situations in which the whole collection of DNA molecules could not be properly aligned. The possible reasons for this could be attributed to an inadequate set of the experimental parameters mentioned above. In order to resolve this uncertainty, a thorough computer simulation of the EM study of DNA molecules carrying site-specifically bound ligands was carried out. Knowledge of the true orientations of the simulated DNA molecules allowed us to inspect their final orientations after alignment. Molecular, statistical and experimental factors inherent to these types of DNA-ligand complexes were examined for their influence on obtaining valid maps. A procedure and an empirical equation for testing the quality of EM mapping data for real DNA molecules with site-specifically bound ligands are proposed.

MATERIALS AND METHODS

Simulation of the DNA-ligand complexes

Each DNA molecule with bound ligands was represented as a linear array of length m, referred to below as a DNA-array, consisting of 0s and 1s, where 0s corresponded to the absence of a bound ligand and 1s to its presence. The length of the array (m) was usually equal to 100. This representation is similar to that obtained by normalizing the measured length of a DNA molecule to unity and dividing it into m segments of equal length, where a segment with a bound ligand is designated by 1.

At the first stage we simulated the DNA-arrays with bound ligands. The ligands were placed on the DNA-arrays in accordance with the distribution of binding probabilities as described by Lukashin et al. (1976). The positions of the specific sites were chosen arbitrarily. The mean number of bound ligands per single DNA molecule (A) was chosen as a parameter. The distribution of binding probabilities was determined by the positions of the specific binding sites and the specific/non-specific binding ratio (K). Though this procedure simulated DNA molecules with bound ligands in a non-reversible manner, it was considered to be correct, because the final values of specific and non-specific binding were true.

At the second stage we introduced the errors of measurement. As a parameter we used the relative value ($H_L$) of the standard deviation ($S_L$) of the total lengths (L) of DNA molecules, which is simply equal to $S_L/L$. It was found empirically that $S_L = C L^{1/2}$ (Davis et al., 1971), where C is a constant. The value of C varied about 2-fold for different EM procedures (Davis et al., 1971; Hirsh and Schleif, 1976; Borovik and Cherny, unpublished observations). Variation in DNA length results in the appearance of peaks of finite width on the map. The width of the peak is related to the error in determining the position(s) of the ligand(s). The alignment is usually performed with DNA molecules of normalized lengths. For such molecules the width of the peak for a single binding site can be determined from the following formula:

$$D_Z = H_L^2\,[Z(1 - Z)], \tag{1}$$

where $D_Z$ is the dispersion of the peak and Z is the position of the center of the peak, 0 < Z < 1 (Cherny and Alexandrov, 1982). From the equation it can be concluded that the widest peak should be in the center of the map, with a width equal to half of the standard deviation of the whole length of the DNA molecules.

In order to simulate these errors we used an iterative procedure, which was applied to each DNA-array of constant length. The first bound ligand, the one closest to one end, at position $Z_1$, was randomly shifted with the dispersion $D_{Z_1}$ determined by eqn (1). Every subsequent bound ligand was successively shifted in two different ways. First, it was shifted several times in the same directions as the preceding ones, with a smaller magnitude proportional to their relative positions. Second, it was shifted at random, so that the dispersion of its total shift was equal to that determined by eqn (1). These successive shifts were needed so as not to invert the positions of closely located ligands, and also to take into account the correlation of their shifts in the normalized DNA-array. The orientations of the DNA-arrays up to that moment were stored and subsequently regarded as correct. In the last stage all the DNA-arrays were arbitrarily re-oriented to provide a random set of initial orientations. Thus, after the alignment was completed, the number of improperly oriented DNA-arrays could be determined.

Alignment procedure

The main idea of any orientation algorithm is the conjecture that the scalar product of the DNA-array and the map is maximal for the right orientation of the DNA-array. To achieve this, at each cycle of orientation we calculated a current map simply by summing all DNA-arrays in their current orientations (i.e. we summed the 0s and 1s in the corresponding positions), then smoothed the map (see below) and calculated the scalar product of each DNA-array, in both orientations (backward and forward), with the current smoothed map. The orientation that gave the higher value of the scalar product was used for the next current map. The first current map was obtained from the DNA-arrays taken in their initial orientations. The procedure was stopped when no DNA-array had been inverted; the last current map obtained was then analysed as the binding map. The smoothing of the map was achieved by its convolution with a Gaussian of dispersion $D_Z$, determined by eqn (1), where $H_L$ was chosen equal to 0.02. The smoothing was used to speed up the alignment procedure.
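The alignment cycle described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the per-segment normalization of the Gaussian kernel, the way the position-dependent width of eqn (1) is applied, and the `max_cycles` guard against endless cycling are all assumptions made for illustration.

```python
import math

H_L = 0.02  # relative length error used for smoothing, as stated in the text


def peak_sd(z, h_l=H_L):
    """Standard deviation of a peak at relative position z, from eqn (1)."""
    return h_l * math.sqrt(z * (1.0 - z))


def smooth(raw_map, h_l=H_L):
    """Spread every map entry with a Gaussian whose width follows eqn (1)."""
    m = len(raw_map)
    out = [0.0] * m
    for i, height in enumerate(raw_map):
        if height == 0:
            continue
        z = (i + 0.5) / m                      # relative position of segment i
        sd = max(peak_sd(z, h_l) * m, 1e-6)    # width in segments (assumed floor)
        weights = [math.exp(-0.5 * ((j - i) / sd) ** 2) for j in range(m)]
        total = sum(weights)                   # >= 1, the j == i term is 1
        for j, w in enumerate(weights):
            out[j] += height * w / total
    return out


def dot(a, b):
    """Scalar product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def align(arrays, max_cycles=1000):
    """Orient the arrays cycle by cycle; return (oriented arrays, flipped flags)."""
    current = [list(a) for a in arrays]
    flipped = [False] * len(arrays)
    for _ in range(max_cycles):                # hypothetical safety limit
        raw = [sum(column) for column in zip(*current)]   # current map
        smoothed = smooth(raw)
        changed = False
        for k, a in enumerate(current):
            rev = a[::-1]
            if dot(rev, smoothed) > dot(a, smoothed):
                current[k] = rev
                flipped[k] = not flipped[k]
                changed = True
        if not changed:                        # no array inverted: stop
            break
    return current, flipped
```

Calling `align(arrays)` on a list of equal-length lists of 0s and 1s returns the oriented arrays together with a flag per array telling whether it ended up reversed relative to its input orientation; the column sums of the oriented arrays form the final binding map.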

Computer experiment

All the parameters of a real EM experiment on site-specific mapping that can influence the final result can be divided into two groups: those which are not significantly affected by the experimenter and those which can be chosen during the experiment. The first group comprises the following parameters: the number of specific binding sites (N), the length of the DNA molecule (L), the relative value of the standard deviation of the DNA molecule lengths ($H_L$) and the specific/non-specific binding ratio (K). The last parameter is equal to the ratio of the association constants for specific and non-specific binding, respectively. For the computer experiments we used the K/L ratio, which is more convenient than K and L separately. Having varied N from 4 to 32 and K/L from 0.25 to 16, we obtained a set of distributions of binding probabilities for the simulation of the DNA-ligand complexes. The K/L range 0.25-16 corresponded to an L range of 1,000-16,000 bp and a K range of 250-16,000. The positions of the binding sites were chosen arbitrarily for the given value of N and did not change during the variation of the other parameters. $H_L$ was varied from 0 to 0.09. We also assigned to this group the mean number of bound ligands per DNA molecule (A), although in general it can be chosen by the experimenter; we used three values, equal to 0.5N, N and 2N. For each set of these parameters (N, L, K, $H_L$ and A), 8 independent collections of DNA-arrays were generated and aligned as described above. The number of improperly oriented molecules was determined for each collection, and from these its mean value (n) was calculated. We studied the dependence of n on the parameters noted above, which was assumed to be useful for predicting the map validity quantitatively. The randomization of the initial orientations of the DNA-arrays for each collection was usually performed more than five times. It should be noted that, for a given collection, the number of improperly oriented DNA-arrays after alignment should depend only on their initial orientations, owing to variations in the first current map (see Alignment procedure).

The other parameters are chosen during the measurement stage: the number of DNA molecules considered (i.e. the sample size) and the length of the DNA-array (m). It is evident that the more DNA molecules, the better the map. However, for technical reasons it is not convenient to consider more than 100 molecules or to divide DNA molecules into more than 100 segments. Unless otherwise stated, we used a value of 100 for both parameters.
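A computer experiment of this kind could be set up along the following lines. This is a simplified sketch under stated assumptions, not the procedure of Lukashin et al. (1976): each segment is given a non-specific weight of 1/m, each specific site adds a weight of K/L, ligands are drawn independently per segment, and the length-measurement errors of eqn (1) are omitted. All function names are hypothetical.

```python
import random


def binding_probabilities(m, site_positions, k_over_l, mean_bound):
    """Per-segment binding probabilities for one DNA-array (assumed model).

    Every segment carries a non-specific weight of 1/m, every specific site
    adds k_over_l, and the weights are rescaled so that the expected number
    of bound ligands equals mean_bound (as long as no value is clipped at 1).
    """
    weights = [1.0 / m] * m
    for z in site_positions:                 # z: fraction of the length, 0 < z < 1
        weights[min(int(z * m), m - 1)] += k_over_l
    total = sum(weights)
    return [min(1.0, mean_bound * w / total) for w in weights]


def simulate_collection(n_arrays, probs, rng):
    """Draw arrays of 0s and 1s, then reverse about half of them at random."""
    arrays, reversed_flags = [], []
    for _ in range(n_arrays):
        a = [1 if rng.random() < p else 0 for p in probs]
        rev = rng.random() < 0.5
        arrays.append(a[::-1] if rev else a)
        reversed_flags.append(rev)           # remembered "true" orientation
    return arrays, reversed_flags


def misoriented_fraction(initially_reversed, flipped_by_alignment):
    """Fraction of arrays left in the wrong orientation after alignment.

    An array is wrong when exactly one of its two flags is set. A totally
    inverted map is treated as equivalent, so the smaller count is reported
    (an assumption of this sketch).
    """
    wrong = sum(r != f for r, f in zip(initially_reversed, flipped_by_alignment))
    n = len(initially_reversed)
    return min(wrong, n - wrong) / n
```

For example, `binding_probabilities(100, [0.10, 0.40, 0.67, 0.83], 0.5, 4.0)` uses the site positions and K/L of Fig. 1 with an assumed A = N = 4; the flags returned by `simulate_collection` can then be compared with the orientation changes reported by whatever alignment routine is used.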

Electron microscopy

We used the data of the EM study of the binding of the methylase BspRI (Mw 55 kDa, recognition site GGCC) to the linear plasmid DNA pUC19/EcoRI, which were published by Kurakin et al. (1991) (see Figs 2B and 3B). The complexes were obtained by incubation of DNA (10 µg/ml) and enzyme (10 µg/ml) in a buffer containing 20 mM Tris-HCl, pH 7.5, 50 mM NaCl, 1 mM DTT, 1 mM Na3EDTA, 50 µM S-adenosylhomocysteine at 37°C for 40 min. Heparin was then added to a final concentration of 10 µg/ml for 2 min and finally 20 volumes of 10 mM Tris-HCl, pH 7.5, 50 mM NaCl were added. The complexes were adsorbed onto carbon films activated by glow discharge in pentylamine, according to Dubochet et al. (1971). The samples were stained with a 1% aqueous solution of uranyl acetate and shadowed with Pt/C (95/5). The micrographs of DNA-methylase complexes were digitized and analyzed with the aid of an HP9825B computer.

Fig. 1. Influence of the initial orientations of DNA-arrays on the final map (panels A-C; horizontal axis: length, %). All the maps were obtained for the same collection of DNA-arrays. The parameters were as follows: N = 4, K/L = 0.5, $H_L$ = 0.04; the coordinates of the specific sites were 10%, 40%, 67% and 83%. The fraction of improperly oriented molecules n was 5% (A) and 7% (B); these maps were obtained in 90% of cases of random initial orientations and are assumed to be valid. (C) represents one of the divergent maps obtained in the remaining cases (10%) of random initial orientations; the fraction of improperly oriented molecules was 43%.


Fig. 2. The binding maps of the methylase BspRI on pUC19/EcoRI DNA (panels A-C; horizontal axis: length, %; data were taken from Kurakin et al., 1991, Fig. 3B). Fifty-one DNA molecules carrying 235 bound enzyme molecules were used (A = 4.6). The positions of the specific sites are: 0.3%, 4.1%, 15.2%, 37.0%, 47.0%, 49.9%, 67.0%, 83.1%, 83.8%, 84.2% and 90.7%. (A) and (B) represent the maps obtained in 94% of cases of random initial orientations; they differ from the published map in the orientation of 4 (A) and 3 (B) DNA molecules and are assumed to be valid. (C) represents one of the divergent maps obtained in the remaining cases (6%) of initial orientations; the fraction of improperly oriented molecules was about 50%.

RESULTS

In most cases the main aims of the EM study of the site-specific interaction of any ligand with double-stranded DNA are the mapping of the binding sites and the determination of their number, and the determination of the ligand affinities for the different sites and of the rate constants. A significant problem associated with EM mapping is that the ends of the DNA molecules recorded on the micrographs are indistinguishable. In order to overcome this problem an alignment procedure is used, which is applied to a finite number of DNA molecules. The result of the alignment is a map consisting of a set of peaks of various heights and widths. The peak centers correspond to the positions of the individual specific sites or groups of closely positioned sites, while the peak sizes are assumed to correspond to the ligand affinity for the individual sites or groups of sites. The alignment is complicated by the presence of non-specific binding, variations in ligand affinity for the different sites and experimental errors of measurement.

Let us assume that for a particular EM experiment we have established the true orientation of each measured DNA molecule and, consequently, the true map. If we align these molecules by the alignment procedure we have employed, some molecules will have the wrong orientation owing to the dependence of the first current map on their initial orientations. In a real EM experiment we do not know the true orientations and cannot identify incorrectly oriented molecules. The thorough simulation of EM mapping helped us to overcome this uncertainty. The number of improperly oriented DNA-arrays or DNA molecules (n) was found to be the quantitative measure of the map quality.

Fig. 3. Influence of the number of DNA-arrays on the final map (panels A-C; horizontal axis: length, %). The positions of the specific sites were: 10%, 40%, 67% and 83%. The parameters were the same as described in Fig. 1. (A) The map was obtained for 100 DNA-arrays, n = 5%; (B) 50 DNA-arrays, n = 12%; and (C) 25 DNA-arrays, n = 12%. There is no significant difference between (A) and (B), while the third map (C) has a different pattern. The map in (C) was obtained in 50% of cases of various initial orientations.

Regularities of alignment

It is evident that the fewer molecules in the wrong final orientation, the better the final map. For all the molecules to have the true orientation, strong restrictions must be applied to the experimental parameters. Our computer simulation experiments have shown that the map can be considered sufficiently correct if fewer than 10-20% of DNA-arrays are in the wrong final orientation. When the number of specific sites was fewer than 6, up to 25% of molecules could have the wrong orientation. We decided, therefore, to analyze how the map pattern changes with the variation of the initial orientations of the DNA-arrays.

Common regularities of alterations in the maps

Changing the initial orientations of the DNA-arrays by randomization resulted in two types of map alterations. Alterations of the first type did not change the map pattern significantly: all the peaks were retained at the same positions, though their shapes varied slightly. Only these small alterations were observed when no more than 5-10% of DNA-arrays had the wrong final orientation. When the fraction of improperly oriented DNA-arrays was more than 25%, alterations of the second type were observed. In this case the pattern changed drastically, so that some peaks disappeared and some new ones appeared. We were surprised to find that randomization of the initial orientations of the DNA-arrays resulted in only a few different map patterns. The most unexpected cases were observed when for one round of randomization only 10-25% of the DNA-arrays had the wrong final orientation, and for another round more than 30%. This meant that for a particular collection of DNA-arrays we obtained correct as well as fully incorrect maps. Figure 1 illustrates these observations. Ignoring the small variations in the maps, we considered such maps as identical and analyzed them for reproducibility. The maps with fewer than 10% of improperly oriented molecules were fully reproducible. On the other hand, only once did we observe a wrong map that was reproducible in 70 ± 10% of cases of random initial orientations of DNA-arrays. To clarify the phenomenon of wrong orientation, and to understand better those maps which we considered as identical, we performed a detailed analysis of the final orientations of the DNA-arrays.

Analysis of the final orientations of the DNA-arrays

In the case of small variations of the final maps, we found that the major part of the improperly oriented molecules was the same for different initial orientations. Analysis has shown that most of these DNA-arrays had some non-specifically bound ligand(s) in positions symmetrical to the specific ones. The final orientations of the minor part of the incorrectly oriented molecules depended upon the initial orientations of the whole collection of DNA-arrays. This signified that the total number of incorrectly oriented molecules had small fluctuations, which were responsible for the changes of the maps (Fig. 1). A relationship between the number of incorrectly oriented molecules (n) and its fluctuation ($m_n$) was found:

$$m_n = (0.13 \pm 0.04)\,n. \tag{2}$$

From eqn (2) we can evaluate the number of improperly oriented molecules in a real EM experiment. Clearly we cannot evaluate it directly, because the true orientations are not known. However, by observing small variations of the map or changes in the orientation of DNA molecules we can estimate $m_n$ and, consequently, n.

It was found that the large alterations of the map pattern could be described in another way. All the DNA-arrays could be divided into several large groups. Each group contained 10-30 DNA-arrays, all of which had similar final orientations, right or wrong. The mutual orientations of the groups were random. That is why we observed a finite number of final map patterns. Discrimination of those groups in real EM experiments is possible, but useless.

Criterion of map validity

As mentioned above, variations in the initial orientations of DNA-arrays resulted in small variations of the map, which were caused by fluctuations of the number of wrongly oriented molecules. Let us consider the map, obtained as a sum of DNA-arrays of length m, as an m-dimensional vector. We can now introduce the norm of the map as the sum of the absolute values of the coordinates of the vector:

$$\|M\| = \sum_{i=1}^{m} |M_i|. \tag{3}$$

The norms of all maps obtained for a particular collection of DNA-arrays are equal to each other, because the map is the sum of DNA-arrays with non-negative elements. Consider the norm of the difference of two maps, M and M':

$$\|M - M'\| < 2\|a\| < (4 m_n / n_0)\,\|M\|, \tag{4}$$

where a is equal to the sum of all DNA-arrays which had changed their orientation and $n_0$ is the total number of DNA-arrays. By substituting $m_n/n = 0.13$ (see eqn (2)) and $n/n_0 = 0.2$, because we are seeking 'good' maps, we obtain:

$$\|M - M'\| < 0.1\,\|M\|. \tag{5}$$

It is evident that totally inverted maps are identical. This means that before checking the identity of two maps, we have to find the mutual orientation that maximizes their scalar product. Recalling the results of the analysis of map reproducibility (see Common regularities of alterations in the maps, above), we can postulate the criterion for the validity of the map: maps which are equivalent in terms of eqn (5) are valid if and only if they are obtained in 80 ± 10% of cases of random initial orientations. Thus, in order to verify an experimental map the following procedure is recommended:
- randomize the initial orientations of the DNA molecules at least 5 times and obtain the map for each round of randomization;
- isolate all groups of maps which are equivalent in terms of eqn (5);
- if more than 80% of the maps are similar, then any of them can be regarded as valid.
This procedure allows us to verify any EM experiment regardless of whether the ratio of specific/non-specific binding, the length of the molecules or the errors of measurement are known.
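The verification procedure can be sketched in Python as follows. This is an illustrative reading of the criterion rather than the authors' implementation: the function names are hypothetical, the equivalence test is applied against one reference map at a time (equivalence in the sense of eqn (5) is not strictly transitive, so true grouping may differ), and the threshold is taken as "at least 80%".

```python
def norm(vec):
    """Norm of eqn (3): sum of the absolute values of the coordinates."""
    return sum(abs(x) for x in vec)


def dot(a, b):
    """Scalar product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def equivalent(map_a, map_b, tol=0.1):
    """Test eqn (5) after choosing the mutual orientation with the larger scalar product."""
    rev = list(reversed(map_b))
    if dot(map_a, rev) > dot(map_a, map_b):
        map_b = rev                      # totally inverted maps count as identical
    diff = norm([x - y for x, y in zip(map_a, map_b)])
    return diff < tol * norm(map_a)


def largest_group_fraction(maps, tol=0.1):
    """Fraction of maps equivalent to the most frequently reproduced map."""
    best = 0
    for ref in maps:
        count = sum(equivalent(ref, other, tol) for other in maps)
        best = max(best, count)
    return best / len(maps)


def map_is_valid(maps, threshold=0.8, tol=0.1):
    """Assumed reading of the criterion: the dominant group covers >= 80% of rounds."""
    return largest_group_fraction(maps, tol) >= threshold
```

Here `maps` would hold one final map per round of randomized initial orientations (at least five rounds, following the text). The tolerance 0.1 is the rounded value of 4 x 0.13 x 0.2 = 0.104 obtained from eqns (2) and (4).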

Testing of the array of real DNA-protein complexes

The testing procedure described above was applied to real EM data. We used our previous results obtained for the collection of 51 pUC19/EcoRI plasmid DNA molecules with bound methylase BspRI (Kurakin et al., 1991, Fig. 3B). For that collection we found that in 94% of cases of various initial orientations the maps were similar in terms of eqn (5). This means that the published map was valid. From our results we calculated that $m_n$ = 2.1%; thus, from eqn (2), the percentage of improperly oriented molecules can be estimated as n = 16 ± 5%.

The conditions for obtaining the valid maps

If all the parameters of the EM experiment (K, L, N, A, $H_L$) are known, it is not essential to follow the procedure described above, and another algorithm of verification can be suggested. Below, we describe how the parameters of the EM experiment influence the validity of the map. Before this, however, we substantiate our choice of the number of DNA-arrays and of the length of the array.

How many molecules should be considered?

Our results indicate that at least 50 DNA-arrays should be considered in order to obtain a valid map in terms of eqn (5), though the fraction of improperly oriented DNA-arrays remained the same even for smaller numbers. Decreasing the sample size resulted in a drastic deterioration of the map pattern, which is illustrated in Fig. 3.

What should be the length of the DNA-array?

As expected, the fraction of improperly oriented molecules did not depend on the length of the DNA-array (m), which was varied up to 170. The main consideration in choosing m is the required accuracy of determination of the specific site positions. All the experiments described below were carried out with one hundred DNA-arrays of length 100.

What should be the parameters K, L, N, A and $H_L$?

As mentioned above, the parameters K, N, L, A and $H_L$ do not depend on the experimenter's will. We attempted to determine their limiting values, which still allowed us to obtain a valid map. First we describe their influence qualitatively.

An increase of the error of measurement, $H_L$, results in broadening of the peaks on the map. In turn, this results in deterioration of the map pattern: adjacent peaks become unresolvable. That explains why it is not possible to determine the positions of the specific sites for a random sequence if $H_L > 1/N$. Nevertheless, the fraction of improperly oriented molecules may not be too large, e.g. if all the specific sites are close together. This was the case we encountered when analyzing the specific DNA/methylase BspRI complexes for the 12 kb-long fragment of the human Na+,K+-ATPase gene, which contained at least 10 recognition sites out of 40 within a 400 bp-long segment (data not shown). On the other hand, a wrong map could also be obtained if only one specific ligand was bound to each DNA molecule. It is evident that this might result in the appearance of different maps depending upon the initial orientations of the DNA molecules. These maps would vary in the relative positions of the peaks, though the positions of some peaks would be true.

We assumed that not all the DNA-arrays or DNA molecules would have the correct orientation after the alignment. This was not considered a drawback of the procedure. Moreover, for any alignment procedure a part of the DNA-arrays or DNA molecules would have wrong orientations. The mean fraction of improperly oriented molecules was assumed to be the probability for a molecule not to be oriented properly. Such molecules or DNA-arrays appeared, for instance, when non-specific binding had occurred at sites symmetrical to the specific ones (within the limits of error).

In order to estimate the limiting values of the experimental parameters we looked for the dependence of n on K, L, N, A and $H_L$. Owing to the large alterations in the map pattern for n > 20%, we were able to achieve this correctly only for n < 20%. Even in these cases it was practically impossible to establish an analytical dependence of n upon K, L, N, A and $H_L$. For this reason we looked for their relationship at n equal to 5% and 15%. Figure 4 represents this relationship, which was found empirically.

Fig. 4. Relationship among the parameters of the EM experiment (horizontal axis: (K/L)NA exp(10.8/N)). K, specific/non-specific ratio; N, number of specific sites; A, mean number of bound ligands; L, length of DNA molecule, bp; σ, the relative value of the standard deviation of the DNA molecule lengths, % (σ = 100 $H_L$). The upper line corresponds to n = 15%, the lower to n = 5%, where n is the mean number of improperly oriented molecules. The lines were drawn by the least-squares method. The parameters (N and A) had the following values: 16 and 32; 16 and 16; 16 and 8; 8 and 16; 8 and 8; 8 and 4; 4 and 8; 4 and 4; and 4 and 2, respectively.

If all the parameters are known, the experiment will be successful if the following inequality is fulfilled:

$$2.35\ln(KNA/L) + 25.1/N - \sigma > 12.5, \tag{6}$$

where σ is equal to $H_L$ expressed as a percentage. From the data the following empirical equation can be deduced:

$$n = 77.5 - 11.75\ln(KNA/L) - 125.5/N + 5.0\,\sigma, \tag{7}$$

where n and σ are expressed as percentages and L in bp. All the coefficients were determined with an error of 7%. This led to an error in the determination of n of about 14% for 3 < n < 20%. This formula was established for the K/L ratio varied from 0.25 to 16, N from 4 to 16 and A equal to N, 2N or 0.5N.

A simple explanation for the KNA/L and K/L expressions can be given. When the number of specifically bound ligands per single DNA molecule ($A_s$) is relatively small, KNA/L is approximately equal to $A_s$, while K/L is equal to $A_s/(N A_{ns})$, where $A_{ns}$ is equal to the number of non-specifically bound ligands per single DNA molecule. This means that the K/L ratio corresponds to the 'noise' on the maps due to non-specific binding.

In our calculations we used a fixed value for the specific/non-specific ratio, K. However, site-specific binding proteins have different affinities for their target sequences depending upon the flanking sequences. For example, the affinity of the methylase BspRI for different sites varies about 10-fold (Kurakin et al., 1991), as does its activity (Sagitov and Alexandrov, 1988). To test this case we generated the array with eight specific sites; for three of them the K/L ratio was equal to 0.48; for two, 2; and for the other two, 4. The mean number of bound ligands (A) was chosen as 8. It was found that the dependence of n was the same as that observed for N = 8, A = 8 and K/L = 2 for all the sites. It was concluded that a varying K/L can be replaced by its mean value.
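As a worked illustration, inequality (6) and eqn (7) can be evaluated numerically. The sketch below only transcribes the two empirical formulas; the function and argument names are hypothetical, and the results should be trusted only inside the range for which the formula was established (K/L from 0.25 to 16, N from 4 to 16, A equal to 0.5N, N or 2N, and roughly 3 < n < 20%).

```python
import math


def predicted_misoriented_percent(k, length_bp, n_sites, mean_bound, sigma_percent):
    """Empirical eqn (7): mean percentage n of improperly oriented molecules."""
    return (77.5
            - 11.75 * math.log(k * n_sites * mean_bound / length_bp)
            - 125.5 / n_sites
            + 5.0 * sigma_percent)


def mapping_expected_to_succeed(k, length_bp, n_sites, mean_bound, sigma_percent):
    """Inequality (6); algebraically the same condition as n < 15% in eqn (7)."""
    lhs = (2.35 * math.log(k * n_sites * mean_bound / length_bp)
           + 25.1 / n_sites
           - sigma_percent)
    return lhs > 12.5


# Illustrative, made-up parameter set: K/L = 4, N = 8, A = 8, sigma = 2%.
n_est = predicted_misoriented_percent(16000, 4000, 8, 8, 2.0)
ok = mapping_expected_to_succeed(16000, 4000, 8, 8, 2.0)
```

Dividing eqn (7) with n set to 15 by 5 reproduces inequality (6), which is why the two functions always agree on which side of the n = 15% line a parameter set falls.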

DISCUSSION

An electron microscopic study of a collection of DNA molecules having an identical base sequence and carrying site-specifically bound ligands allows the positions of the specific sites along the DNA molecule to be determined via mapping. We have shown that mapping after alignment of simulated DNA-arrays does not yield a unique map, owing to molecular, statistical and experimental factors inherent to these types of DNA-arrays and, consequently, to the DNA molecules themselves. The multiplicity of the binding maps is a consequence of these factors rather than of a weakness in the alignment algorithm.

The multiplicity of the maps obtained in this way for DNA-arrays with known orientations poses a problem: which map should be considered valid? The results described above show that, in general terms, any map that is reproduced in four cases out of five for various random initial orientations can be considered valid. This is also true for any collection of real DNA molecules with site-specific ligands. We suppose that this criterion can be applied to any useful alignment algorithm, whether or not the algorithm is order-dependent. On the other hand, if most of the molecular, statistical and experimental parameters are known prior to the experiment, the validity of the mapping can be verified empirically (see eqn 6), and the fraction of incorrectly oriented molecules can also be determined.

The procedures described above permit, for instance, the evaluation of the accuracy with which affinity constants are determined from EM mapping data. Moreover, unexpected cases in which the final map depends significantly upon the initial orientations of the DNA molecules can be easily detected. In such cases, the empirical eqn (6) makes it possible to deduce which parameter(s) of the experiment should be changed to improve the result. We believe that electron microscopy can be fruitfully used for the mapping of long pieces of genomic DNA, up to several tens of kb, especially on the basis of novel approaches (Kurakin et al., 1991; Cherny et al., 1993a,b, 1994; Revet et al., 1993), and that for this technique the procedures reported here will help to avoid ambiguity in EM mapping.

Acknowledgements--The authors wish to thank Dr A. A. Alexandrov for stimulating this work. This work was supported by Grant M4J000 from the International Science Foundation and by Grant 11424-a from the Russian Fund for Fundamental Research.
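The reproducibility criterion stated in the Discussion (a map is accepted if it recurs in at least four of five runs started from random initial orientations) can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' procedure: `toy_align` is a hypothetical greedy stand-in for a real alignment algorithm, molecules are represented as lists of ligand positions expressed as fractions of the molecule length, and the bin count, trial count and example data are made up.

```python
import random
from collections import Counter

def flip(mol):
    """Reverse a molecule; ligand positions are fractions of its length in [0, 1]."""
    return [1.0 - x for x in mol]

def histogram(mol, bins):
    """Count bound ligands in each of `bins` equal intervals along the molecule."""
    h = [0] * bins
    for x in mol:
        h[min(int(x * bins), bins - 1)] += 1
    return h

def toy_align(mols, bins=20):
    """Hypothetical greedy stand-in for an alignment algorithm (an assumption).

    Each molecule is added in whichever orientation overlaps better with the
    map accumulated so far; returns the final binding map as a histogram.
    """
    total = histogram(mols[0], bins)
    for mol in mols[1:]:
        fwd = histogram(mol, bins)
        rev = histogram(flip(mol), bins)
        score_fwd = sum(t * f for t, f in zip(total, fwd))
        score_rev = sum(t * r for t, r in zip(total, rev))
        best = rev if score_rev > score_fwd else fwd
        total = [t + b for t, b in zip(total, best)]
    return total

def canonical(binding_map):
    """Treat a map and its mirror image as the same map."""
    return min(tuple(binding_map), tuple(reversed(binding_map)))

def reproducibility(mols, trials=5, seed=0):
    """Fraction of randomized runs that end in the most frequent final map."""
    rng = random.Random(seed)
    maps = Counter()
    for _ in range(trials):
        # Randomize the initial orientation of every molecule independently.
        start = [flip(m) if rng.random() < 0.5 else list(m) for m in mols]
        maps[canonical(toy_align(start))] += 1
    return maps.most_common(1)[0][1] / trials

def is_valid(mols, trials=5, threshold=0.8):
    """Apply the four-out-of-five criterion to the most frequent map."""
    return reproducibility(mols, trials) >= threshold

# Made-up, noise-free example: ten identical molecules with two bound ligands.
molecules = [[0.12, 0.33]] * 10
```

For this noise-free example every randomized run converges to the same map up to mirror symmetry, so the criterion is met. With real data, the hypothetical `toy_align` would be replaced by the alignment algorithm actually in use, and maps would probably need to be compared within a tolerance, not for exact equality.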

REFERENCES

Borovik, A. S., Kalambet, Yu. A., Lyubchenko, Yu. L., Shitov, V. T. and Golovanov, Eu. I., 1980. Equilibrium melting of plasmid ColE1 DNA: an electron-microscopic visualization. Nucl. Acids Res., 8, 4165-4184.
Cherny, D. I. and Alexandrov, A. A., 1982. Electron microscopic mapping of E. coli RNA polymerase tight binding sites on the colicin E1 plasmid. Biochem. Int., 5, 399-402.
Cherny, D. I., Malkov, V. A., Volodin, A. A. and Frank-Kamenetskii, M. D., 1993a. Electron microscopy visualization of oligonucleotide binding to duplex DNA via triplex formation. J. molec. Biol., 230, 379-383.
Cherny, D. I., Belotserkovskii, B. P., Frank-Kamenetskii, M. D., Egholm, M., Buchardt, O., Berg, R. H. and Nielsen, P. E., 1993b. DNA unwinding upon strand displacement binding of thymine-substituted polyamide to double-stranded DNA. Proc. natn. Acad. Sci. U.S.A., 90, 1667-1670.
Cherny, D. I., Kurakin, A. V., Lyamichev, V. N., Frank-Kamenetskii, M. D., Zinkevich, V. E., Firman, K., Egholm, M., Buchardt, O., Berg, R. H. and Nielsen, P. E., 1994. Electron microscopy studies of sequence specific recognition of duplex DNA by different ligands. J. molec. Recogn., in press.
Davis, R. W., Simon, M. and Davidson, N., 1971. Electron microscope heteroduplex method for mapping regions of base sequence homology in nucleic acids. Meth. Enzymol., 21D, 413-428.
Dubochet, J., Ducommun, M., Zollinger, M. and Kellenberger, E., 1971. A new preparation method for dark-field electron microscopy of biomacromolecules. J. Ultrastruct. Res., 35, 147-167.
Giacomoni, P. U., Delain, E. and Le Pecq, J. B., 1977. Electron microscopy analysis of the interaction between E. coli DNA-dependent RNA polymerase and the replicative form of phage fd DNA. Eur. J. Biochem., 78, 205-213.
Hirsh, J. and Schleif, R., 1976. High resolution electron microscopic studies of genetic regulation. J. molec. Biol., 108, 471-490.
Johannssen, J., 1988. Interaction of the restriction endonucleases with DNA as revealed by electron microscopy. Meth. Microbiol., 20, 325-338.
Kadesh, T. R., Williams, R. C. and Chamberlin, M. J., 1980. Electron microscopic studies of the binding of E. coli RNA polymerase to DNA. J. molec. Biol., 136, 79-93.
Kramer, H., Niemoller, M., Amouyal, M., Revet, B., Delain, E. and Milgrom, E., 1987. Lac repressor forms loops with linear DNA carrying two suitably spaced lac operators. EMBO J., 6, 1481-1491.
Kurakin, A. V., Zaritskaya, L. S., Metliskaya, A. Z., Volodin, A. A. and Cherny, D. I., 1991. Methylase BspR1 as electron microscopical marker for physical mapping of DNA. Micron Microscop. Acta, 22, 213-221.
Le Cam, E., Theveny, B., Mignotte, B., Revet, B. and Delain, E., 1991. Quantitative electron microscopic analysis of DNA-protein interactions. J. Electr. Micr. Techn., 18, 375-386.
Lukashin, A. V., Vologodskii, A. V., Frank-Kamenetskii, M. D. and Lyubchenko, Y. L., 1976. Fluctuational opening of the double helix as revealed by theoretical and experimental study of DNA interaction with formaldehyde. J. molec. Biol., 108, 665-682.
Lyamichev, V. I., Panyutin, I. G., Cherny, D. I. and Lyubchenko, Yu. L., 1983. Localisation of low-melting regions on phage T7 DNA. Nucl. Acids Res., 11, 2165-2176.
Mignotte, B., Delain, E., Rickwood, D. and Barat-Gueride, M., 1988. The Xenopus laevis mitochondrial protein mtDBP-C cooperatively folds the DNA in vitro. EMBO J., 7, 3873-3879.
Naumova, G. N., Golovanov, E. I., Cherny, D. I. and Alexandrov, A. A., 1981. Transcription of colicin E1 plasmid: electron-microscopic mapping of promoters. Molec. Gen. Genet., 181, 352-355.
Revet, B. M. J., Sena, E. P. and Zarling, D. A., 1993. Homologous DNA targeting with recA-coated short DNA probes and EM mapping on linear duplex molecules. J. molec. Biol., 232, 779-791.
Sagitov, V. R. and Alexandrov, A. A., 1988. Selectivity of the DNA methyltransferase BspR1. Dokl. Acad. Nauk USSR, 298, 126[?]-1268.
Theveny, B., Bailly, B., Rauch, M., Delain, E. and Milgrom, E., 1987a. Association of DNA-bound progesterone receptors. Nature, 329, 79-81.
Theveny, B. and Revet, B., 1987b. DNA orientation using specific avidin-ferritin biotin end labeling. Nucl. Acids Res., 15, 947-968.
Travers, A. A., 1988. DNA conformation and protein binding. A. Rev. Biochem., 58, 427-452.
van Dijken, M. C. and Coetzee, W. F., 1981. Alignment of partially denatured DNA molecules. Biochim. Biophys. Acta, 654, 102-110.
Vollenweider, H. J. and Szybalski, W., 1978. Electron microscopic mapping of RNA polymerase binding to coliphage lambda DNA. J. molec. Biol., 123, 485-498.
Williams, R. C. and Chamberlin, M. J., 1977. Electron microscopic studies of transient complexes formed between E. coli RNA polymerase holoenzyme and T7 DNA. Proc. natn. Acad. Sci. U.S.A., 74, 3740-3744.
Young, I. T., Levinstone, D., Eden, M., Tye, B.-K. and Botstein, D., 1974. Alignment of partial denaturation maps of circularly permuted DNA by computer. J. molec. Biol., 85, 528-532.
