pastor@inka.mssm.edu
ikbe0@cc.uab.es
hweinstein@inka.mssm.edu
TBP binds to DNA in the minor groove of an AT-rich sequence (consensus TATA T/A A T/A X), causing unwinding, bending and compression of the major groove without disrupting the hydrogen bonds between base pairs. In the crystal structures of the complexes (PDB codes 1YTB (yeast TBP) [1], plant TBP [2], 1CDW (human TBP) [3], 1TGH (human TBP) [4], 1VOL (TBP/TFIIB/DNA) [5], 1YTF (TBP/TFIIA/DNA) [6]), the DNA conformation returns to B-DNA abruptly, immediately outside the eight base pairs in contact with TBP. These complexes seem to be an extreme example of induced fit. The large difference in conformation between the bound DNA and B-DNA suggests that the energy penalty of the conformational change might be a selectivity determinant. Favorable sequences might have average geometries biased towards forming a complex with TBP; alternatively, but not exclusively, favorable sequences might have weaker stacking energies and be more amenable to the type of distortions found in the complexes.
The aim of this work is to gain an understanding of the structural characteristics that TBP exploits in its DNA substrates, and to discern whether there are DNA sequences that are predisposed to attain the particular geometric requirements imposed by TBP binding. In the absence of high-resolution solution structures of DNA oligomers that contain TATA box sequences, we have carried out molecular dynamics simulations and analyzed the conformational properties of seven DNA dodecamers whose sequences (Table 1) include three functional TATA boxes (mlp, 6t and at), two sequences recognized by mutant TBPs [7,8] (2c and 7g), a reversed TATA box (r28) and a negative control (gc).
I. Introduction
The TATA box-binding protein (TBP) is a basal transcription factor absolutely required for transcription by the three nuclear RNA polymerases. In the case of genes transcribed by RNA polymerase II, TBP is responsible for promoter recognition and binds directly to DNA in the minor groove. The structural characteristics of the complexes between TBPs and various TATA sequences elucidated recently from X-ray crystallography (PDB) suggest that both direct readout mechanisms and dynamic determinants, such as the ease for deformation, may be functional in the formation of these constructs.
Table 1
name      sequence
------------------------------------
mlp       C T A T A A A A G G G C
2c        C C A T A A A A G G G C
6t        C T A T A T A A G G G C
7g        C T A T A A G A G G G C
r28       C T T T T A T A G G G C
at        A T A T A T A T A T A T
gc        G C G C G C G C G C G C
------------------------------------
bp step   1 2 3 4 5 6 7
bp        1 2 3 4 5 6 7 8 9 10 11 12
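As a rough illustration, the consensus quoted earlier (TATA T/A A T/A X) can be written as a regular expression and checked against the seven dodecamers of Table 1. This is only a sketch: reading X as "any base" and scanning only the strand as written are assumptions, and the names `TATA_CONSENSUS`, `DODECAMERS` and `matches_consensus` are hypothetical.

```python
import re

# Consensus TATA T/A A T/A X; X is ASSUMED here to mean any base.
TATA_CONSENSUS = re.compile(r"TATA[AT]A[AT][ACGT]")

# Dodecamer sequences as listed in Table 1.
DODECAMERS = {
    "mlp": "CTATAAAAGGGC",
    "2c": "CCATAAAAGGGC",
    "6t": "CTATATAAGGGC",
    "7g": "CTATAAGAGGGC",
    "r28": "CTTTTATAGGGC",
    "at": "ATATATATATAT",
    "gc": "GCGCGCGCGCGC",
}


def matches_consensus(sequence):
    """Return True if the strand as written contains the consensus octamer."""
    return TATA_CONSENSUS.search(sequence) is not None


if __name__ == "__main__":
    for name, seq in DODECAMERS.items():
        print(name, matches_consensus(seq))
```

Under these assumptions only mlp, 6t and at match on the written strand, which are the three sequences described as functional TATA boxes; the complementary strand is deliberately not scanned in this sketch.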
We carried out molecular dynamics simulations using the CHARMM23 [9] potential, with explicit water molecules (TIP3) and sodium ions, using periodic boundary conditions and a spherical cutoff for the nonbonded interactions. To assess the convergence of the results and their independence from the force field used, mlp was also simulated with the AMBER 4.1 [10] potential using Ewald sums for the nonbonded interactions, and with the CHARMM23 potential with a different set of initial velocities.
For the CHARMM simulations, the DNA and the sodium ions were solvated in InsightII (Biosym Technologies 1993). The final simulation system, including the DNA dodecamer, 22 Na+ ions and > 3400 TIP3 water molecules, was enclosed in a hexagonal prism of 72 Å length with a 24 Å side. For the replica of mlp run in AMBER 4.1, the dodecamer and sodium ions were solvated in AMBER 4.1 with an 11 Å water shell (> 4000 TIP3P water molecules). The final simulation system was enclosed in a rectangular prism (63.1 Å × 45.7 Å × 44.7 Å), whose dimensions were adjusted by running at constant pressure and temperature to ensure the right density.
The molecular dynamics simulations were run with the CHARMM program in the NVE ensemble, using the CHARMM23 all-atom potential, the Verlet integrator and periodic boundary conditions. SHAKE was applied to all hydrogen-containing bonds. A cutoff value of 13 Å was used, with the shift and switch functions for the electrostatic and van der Waals interactions, respectively. Keeping the DNA and sodium ions fixed, the water was equilibrated for 36 ps. Subsequently, the whole system was energy minimized and then heated from 0 to 300 K in 10 ps. Equilibration was carried out for 30 ps with a time step of 2 fs. For the production run of 510 ps (1260 ps for mlp(a)), the time step was reduced to 1.5 fs. A 2D r.m.s.d. plot for mlp(a) is shown in Figure 1.

An independent mlp(b) run was started in parallel using a different seed for assigning velocities for the heating of the whole system. Heating (10.5 ps), equilibration (31.5 ps) and production (510 ps) were done with a time step of 1.5 fs, using the Leapfrog Verlet integrator. Structures from the trajectories were saved every 0.075 ps.

The molecular dynamics simulation of mlp run in AMBER 4.1 was carried out in the NVT ensemble, using the AMBER 4.1 all-atom potential, the Verlet integrator and periodic boundary conditions. SHAKE was applied to all hydrogen-containing bonds. A cutoff value of 9 Å was used for the van der Waals interactions, and the electrostatic interactions were treated with the PME algorithm. In this run, water was heated for 15 ps and equilibrated for 85 ps; then the system was energy minimized, heated from 0 to 300 K in 15 ps, and equilibrated for 50 ps. The production run was extended to a total of 1 ns, with all phases carried out with a time step of 2 fs. Structures from this trajectory were saved every 0.1 ps.
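The run lengths, time steps and saving intervals quoted above imply simple bookkeeping figures, which the following sketch works out with exact rational arithmetic. The helper names are hypothetical, and whether a frame at time zero was also stored is not stated, so the counts are production length divided by saving interval.

```python
from fractions import Fraction


def frames_saved(production_ps, save_interval_ps):
    """Number of saving intervals that fit in the production run."""
    return Fraction(production_ps) / Fraction(save_interval_ps)


def steps_between_saves(save_interval_ps, time_step_fs):
    """Integration steps between two saved structures (1 ps = 1000 fs)."""
    return Fraction(save_interval_ps) * 1000 / Fraction(time_step_fs)


if __name__ == "__main__":
    # CHARMM production runs: 510 ps (1260 ps for mlp(a)), saved every 0.075 ps.
    print(frames_saved("510", "0.075"))          # 6800
    print(frames_saved("1260", "0.075"))         # 16800
    print(steps_between_saves("0.075", "1.5"))   # 50
    # AMBER production run: 1 ns, saved every 0.1 ps with a 2 fs step.
    print(frames_saved("1000", "0.1"))           # 10000
    print(steps_between_saves("0.1", "2"))       # 50
```

Both protocols therefore store a structure every 50 integration steps.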
The radial distribution functions for Na+-Na+, P-Na+ and P-P from the mlp runs done with CHARMM and AMBER 4.1 are compared from 0 to 20 Å (Figure 2) and enlarged around the cutoff distance (Figure 3). There is no apparent accumulation of pairs at the cutoff distance of the electrostatic potential (12 Å), suggesting that there are no gross problems with the spherical cutoff and shifting functions used in CHARMM.
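For readers unfamiliar with the quantity being compared, a radial distribution function for one stored frame can be sketched as below. This is not the analysis code used for Figures 2 and 3: it assumes NumPy, a rectangular periodic box (as in the AMBER run; the hexagonal prism of the CHARMM runs would need a different minimum-image rule), and a hypothetical function name. In practice the histogram would be averaged over all saved frames.

```python
import numpy as np


def radial_distribution(pos_a, pos_b, box, r_max, n_bins, same_species=False):
    """Single-frame g(r) between two point sets in a rectangular periodic box."""
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    box = np.asarray(box, dtype=float)
    if r_max > box.min() / 2:
        raise ValueError("r_max must not exceed half the shortest box edge")
    # Minimum-image separation vectors for every (a, b) pair.
    delta = pos_a[:, None, :] - pos_b[None, :, :]
    delta -= box * np.round(delta / box)
    dist = np.linalg.norm(delta, axis=-1)
    if same_species:
        # Count each unordered pair once and skip self-distances.
        dist = dist[np.triu_indices(len(pos_a), k=1)]
    else:
        dist = dist.ravel()
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    # Pairs expected in each spherical shell for a uniform distribution.
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    expected = dist.size * shell_volume / box.prod()
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, counts / expected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    box = (63.1, 45.7, 44.7)              # box edges quoted for the AMBER run, in Å
    points = rng.random((400, 3)) * box   # synthetic, uniformly random positions
    r, g = radial_distribution(points, points, box, 20.0, 10, same_species=True)
    print(np.round(g, 2))                 # close to 1 for uncorrelated points
```

A cutoff artifact of the kind the text rules out would show up as a spurious peak in g(r) at the cutoff distance.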
DNA conformational analysis was carried out with the CURVES algorithm implemented in Dials and Windows of the MD Toolchest. Since this algorithm performs a global fit to the DNA axis, the reported angles and displacements depend on the DNA length. To allow for a comparison of the local base pair step geometry between the different simulations and the NMR and crystal structures, all the DNA oligomers were disassembled into their constitutive base pair steps. Data were collected for each base pair step, except for those at the ends of the oligomers.
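The disassembly into base pair steps can be pictured with a short sketch that lists, for one strand, every step together with its two flanking bases. The lowercase-flank notation (for example cTAt) is inferred from the tables, the function name is hypothetical, and the mapping of strand-equivalent tetrads onto a single label is not attempted here.

```python
def interior_tetrads(sequence):
    """List each base pair step with its flanking bases, e.g. 'cTAt'.

    Steps at either end of the oligomer have no flanking base on one
    side, so they are skipped automatically.
    """
    sequence = sequence.upper()
    tetrads = []
    for i in range(len(sequence) - 3):
        window = sequence[i:i + 4]
        # Flanking bases in lowercase, central step in uppercase.
        tetrads.append(window[0].lower() + window[1:3] + window[3].lower())
    return tetrads


if __name__ == "__main__":
    # mlp dodecamer from Table 1
    print(interior_tetrads("CTATAAAAGGGC"))
```

For a dodecamer this yields nine tetrads, covering the nine interior steps out of eleven.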
Table 2
step   shift (Å)   slide (Å)   rise (Å)   tilt (°)   roll (°)   twist (°)
-------------------------------------------------------------------------
1       0.2±0.4    -1.9±0.4    4.9±0.4    -0.6±2.8   40.0±4.0   19.8±2.1
2      -1.3±0.5    -1.5±0.1    3.4±0.2     1.7±1.7   16.5±2.4   16.4±1.1
3       0.2±0.3     1.3±0.1    3.3±0.1     3.4±1.5    7.6±2.5   25.5±1.4
4       0.3±0.3     1.2±0.3    3.4±0.2    -1.8±0.9   26.2±2.7    8.1±5.3
5      -0.4±0.2     2.0±0.2    3.5±0.3     1.0±2.1   25.9±3.6   22.4±2.8
6       0.4±0.2     1.2±0.4    3.3±0.2     0.8±3.8   24.2±3.5   22.7±3.1
7      -0.1±0.6     0.8±0.8    5.4±0.5     1.7±3.7   44.6±5.3   22.5±5.1
-------------------------------------------------------------------------
Table 3
source    shift (Å)   slide (Å)   rise (Å)   tilt (°)   roll (°)    twist (°)
-------------------------------------------------------------------------
charmm    0.2±0.8     -1.2±0.8    3.3±0.5    2.2±6.5    3.1±13.1    32.2±5.6
mlp(a)    0.2±0.7     -1.4±0.8    3.3±0.5    2.1±6.2    2.4±11.9    31.9±5.1
mlp(b)    0.2±0.7     -1.4±0.7    3.3±0.5    2.5±5.9    3.2±12.0    31.9±5.0
amber     0.3±0.6     -1.5±0.6    3.4±0.4    3.0±5.9    2.7±8.4     30.5±4.5
-------------------------------------------------------------------------
-------------------------------------------------------------------------
NMR       0.7±0.4     -0.9±0.5    3.1±0.3    7.9±3.8    3.9±7.1     33.2±2.6
NDB-A     0.4±0.6     -1.9±0.4    3.4±0.4    3.3±4.1    6.5±6.2     30.4±4.4
NDB-B     0.4±0.5      0.1±0.9    3.3±0.2    4.8±3.3    0.6±6.4     36.2±6.7
-------------------------------------------------------------------------
-------------------------------------------------------------------------
fiber A   -0.3        -2.0        3.2        -2.8       10.5        30.7
fiber B    0.0        -0.6        3.3         0.4       -2.6        35.9
-------------------------------------------------------------------------
Following Olson and coworkers [11], we define for further use the "thermally accessible range of conformations for general sequence DNA" as the interval included between the mean ± one standard deviation in the charmm entry of Table 3.
Table 4
tetrad   shift (Å)  slide (Å)  rise (Å)  tilt (°)  roll (°)  twist (°)
----------------------------------------------------------------
aAAa      0.1       -1.4       3.2        2.8       6.1      31.4
aAAg      0.2       -1.4       3.2        1.6      -2.1      32.7
tAAa      0.3       -1.3       3.3        2.7       5.5      31.6
tAAg      0.1       -1.3       2.9       -3.2      10.0      28.4
----------------------------------------------------------------
aAGa     -0.4       -1.1       3.7        0.7      -8.8      34.2
aAGg      0.0       -1.7       3.3        0.6       2.2      31.7
gAGg *   -0.4  -0.8  3.3  6.2  34.0
tAGg *   -0.5  -0.8  3.4  1.8  34.7
----------------------------------------------------------------
aGAg      0.0       -1.2       3.0        0.6       7.5      30.1
----------------------------------------------------------------
aGGg *   0.2  3.5  3.0  2.1  32.9
gGGc *   0.4  3.3  4.7  -6.8  32.7
----------------------------------------------------------------
----------------------------------------------------------------
cCAt *   -1.2  3.6  4.3  -1.7  32.7
----------------------------------------------------------------
gCGc *   0.3  -0.5  1.8  -4.8  35.1
----------------------------------------------------------------
aTAa      0.0       -0.8       3.8        2.7       2.1      33.9
aTAt      0.8       -0.8       3.5        5.5      -0.7      34.4
cTAt      0.6       -1.0       3.5        3.8       0.7      33.5
----------------------------------------------------------------
----------------------------------------------------------------
cATa *   -0.4  -0.9  -3.1
tATa      0.1       -1.0       3.0        1.9       9.2      30.7
----------------------------------------------------------------
cGCg *   0.0  -0.7  0.1  27.4
----------------------------------------------------------------
* Incomplete row: one or more values are missing, so the remaining values are listed in their original order without column assignment.
Table 4 identifies variations in base pair step geometry that are sequence dependent and are likely to be relevant for TBP binding: YR steps display the highest rise, while RY steps display the lowest rise and twist, and the highest positive roll.
The table was queried for particular tetrads that were the most or the least likely to acquire the conformation found in the crystal complexes with TBP. We used the confidence intervals as a filter in evaluating all the conformations generated in the simulations, counting the number of times that a particular geometrical parameter for each tetrad fell inside the confidence intervals. The frequency with which each tetrad visits the crystal conformations was further rated for significance with a chi-squared test. The procedure identified the tetrads that fell most often within the range of properties corresponding to the crystal structures (best tetrad in Table 5) and those that fell within it least often (worst tetrad in Table 5).
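The counting and rating procedure can be sketched as follows. The exact form of the chi-squared test used is not specified, so a Pearson statistic on an inside/outside by tetrad contingency table is assumed here; the function names and all numbers in the example are hypothetical and purely illustrative.

```python
def count_in_range(values, low, high):
    """Count the frames whose parameter value lies inside [low, high]."""
    return sum(low <= value <= high for value in values)


def chi_squared_statistic(inside, totals):
    """Pearson chi-squared for a 2 x K table (inside/outside x tetrads).

    inside[k] is the number of frames of tetrad k inside the interval,
    totals[k] the number of frames analysed for that tetrad. Assumes at
    least one frame inside and one outside overall.
    """
    grand_inside = sum(inside)
    grand_total = sum(totals)
    statistic = 0.0
    for k_inside, n in zip(inside, totals):
        rows = ((k_inside, grand_inside), (n - k_inside, grand_total - grand_inside))
        for observed, row_total in rows:
            expected = row_total * n / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    # Hypothetical rise values (Å) for one tetrad and a hypothetical interval.
    rise_frames = [3.2, 3.6, 4.6, 5.0, 3.4, 4.7]
    print(count_in_range(rise_frames, 4.5, 5.3))         # 3
    # Hypothetical visit counts for two tetrads, 100 frames each.
    print(chi_squared_statistic([30, 10], [100, 100]))   # 12.5
```

The statistic would then be compared with a chi-squared distribution with K - 1 degrees of freedom to judge whether the tetrads differ in how often they visit the interval.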
Table 5
parameter   bp step   best tetrad   worst tetrad
-------------------------------------------
rise        1         aTAa          tata
roll        1         cgcg          gggc
twist       1         cata          gcgc
-------------------------------------------
shift       2         tagg          atat
twist       2         cATa          aggg
-------------------------------------------
slide       3         aTAt          aggg
-------------------------------------------
slide       4         atat          aggg
roll        4         cgcg          gggc
twist       4         atat          aggg
-------------------------------------------
slide       5         aTAt          tata
roll        5         cata          gggc
twist       5         cata          gcgc
-------------------------------------------
slide       6         atat          tATa
roll        6         cATa          gggc
twist       6         cgcg          gcgc
-------------------------------------------
slide       7         gcgc          aggg
rise        7         aTAa          tata
roll        7         cgcg          gggc
-------------------------------------------
Entries highlighted in red correspond to those that identified as the best tetrad a step that actually has been found in a crystal with TBP, and hence are base pair step properties that are very likely to be used as selectivity determinants.
The entry highlighted in purple corresponds to a property that identified as the worst tetrad a step that has been crystallized with TBP; consequently, such a step property cannot be a selectivity determinant.
We failed to find any difference between the average conformation of a dodecamer with a sequence that is not recognized by TBP (gc) and those of known TBP binding sites (mlp, 6t, at). We next asked whether our simulations were capable of displaying sequence dependent variations in DNA local structure (Table 4), and found differences between YR and RY steps, especially in rise, roll and twist.
Comparing the thermally accessible range of conformations of general sequence DNA (Table 3) with the range of conformations consistent with complexation with TBP (Table 2) we selected those geometrical parameters specific to the bound DNA that are outside the thermal range. These are candidates for being selectivity determinants. To further narrow down the list of candidates, we probed all the conformations generated during the production phase of the simulations for their ability to reach the conformations found in TBP-bound DNA, and graded each tetrad according to the frequency of visits in these ranges. Those properties that selected a step that has been found in a crystal with TBP are very likely to be selectivity determinants (Table 5, red entries). Conversely, the property that selected as worst tetrad a step that has been found in a complex with TBP cannot be a selectivity determinant.
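The first filtering step can be illustrated with the numbers of Tables 2 and 3. The sketch below flags a parameter at a given step when its mean ± one standard deviation interval in the bound DNA does not overlap the thermally accessible range from the charmm entry; treating "outside the thermal range" as non-overlap of the two intervals is an assumption, and the names are hypothetical.

```python
PARAMETERS = ("shift", "slide", "rise", "tilt", "roll", "twist")

# (mean, standard deviation) from the charmm entry of Table 3.
THERMAL = {
    "shift": (0.2, 0.8), "slide": (-1.2, 0.8), "rise": (3.3, 0.5),
    "tilt": (2.2, 6.5), "roll": (3.1, 13.1), "twist": (32.2, 5.6),
}

# (mean, standard deviation) per bp step from Table 2, in PARAMETERS order.
BOUND = {
    1: [(0.2, 0.4), (-1.9, 0.4), (4.9, 0.4), (-0.6, 2.8), (40.0, 4.0), (19.8, 2.1)],
    2: [(-1.3, 0.5), (-1.5, 0.1), (3.4, 0.2), (1.7, 1.7), (16.5, 2.4), (16.4, 1.1)],
    3: [(0.2, 0.3), (1.3, 0.1), (3.3, 0.1), (3.4, 1.5), (7.6, 2.5), (25.5, 1.4)],
    4: [(0.3, 0.3), (1.2, 0.3), (3.4, 0.2), (-1.8, 0.9), (26.2, 2.7), (8.1, 5.3)],
    5: [(-0.4, 0.2), (2.0, 0.2), (3.5, 0.3), (1.0, 2.1), (25.9, 3.6), (22.4, 2.8)],
    6: [(0.4, 0.2), (1.2, 0.4), (3.3, 0.2), (0.8, 3.8), (24.2, 3.5), (22.7, 3.1)],
    7: [(-0.1, 0.6), (0.8, 0.8), (5.4, 0.5), (1.7, 3.7), (44.6, 5.3), (22.5, 5.1)],
}


def interval(mean, sd):
    """Return the mean ± one standard deviation interval."""
    return mean - sd, mean + sd


def outside_thermal_range():
    """List (step, parameter) pairs whose bound interval misses the thermal one."""
    flagged = []
    for step, values in BOUND.items():
        for name, (mean, sd) in zip(PARAMETERS, values):
            low, high = interval(mean, sd)
            thermal_low, thermal_high = interval(*THERMAL[name])
            # ASSUMED criterion: the two intervals do not overlap at all.
            if high < thermal_low or low > thermal_high:
                flagged.append((step, name))
    return flagged


if __name__ == "__main__":
    for step, name in outside_thermal_range():
        print(step, name)
```

Under this assumed criterion the sketch flags 18 step and parameter pairs (for example rise, roll and twist at step 1), which appear to coincide with the rows listed in Table 5.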
The best sequences for TBP binding were identified as alternating YR sequences: YR steps have the highest rise, as needed at the kink sites (steps 1 and 7), and RY steps have the low twist and high positive roll needed throughout the recognition element (Table 2 and Table 4).
The specific structural and dynamic characteristics that predispose a DNA sequence for selective interaction with TBP were shown to be identifiable at the level of base pair steps. They show clear departures from the properties of general sequence DNA. The underlying molecular interactions that produce the special properties of these steps are the subject of continuing investigations.
1. Kim, Y., Geiger, J.H., Hahn, S. and Sigler, P.B. Crystal structure of a yeast TBP/TATA-box complex. Nature 1993, 365, 512-520.
2. Kim, J.L. and Burley, S.K. 1.9Å resolution refined structure of TBP recognizing the minor groove of TATAAAAG. Nature Structural Biology 1994, 1, 638-652.
3. Nikolov, D.B., Chen, H., Halay, E.D., Hoffmann, A., Roeder, R.G. and Burley, S.K. Crystal structure of a human TATA box-binding protein/TATA element complex. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 4862-4867.
4. Juo, Z.S., Chui, T.K., Leiberman, P.M., Baikalov, I., Berk, A.J. and Dickerson, R.E. How proteins recognize the TATA box. J. Mol. Biol. 1996, 261, 239-254.
5. Nikolov, D.B., Chen, H., Halay, E.D., Usheva, A.A., Hisatake, K., Lee, D.K., Roeder, R.G. and Burley, S.K. Crystal structure of a TFIIB-TBP-TATA-element ternary complex. Nature 1995, 377, 119-128.
6. Tan, S., Hunziker, Y., Sargent, D.F. and Richmond, T.J. Crystal structure of a yeast TFIIA/TBP/DNA complex. Nature 1996, 381, 127-134.
7. Arndt, K.M., Ricupero, S.L., Eisenmann, D.M. and Winston, F. Biochemical and genetic characterization of a yeast TFIID mutant that alters transcription in vivo and DNA binding in vitro. Mol. Cell. Biol. 1992, 12, 2372-2382.
8. Arndt, K.M., Wobbe, C.R., Ricupero-Hovasse, S., Struhl, K. and Winston, F. Equivalent mutations in the two repeats of yeast TATA-binding protein confer distinct TATA recognition specificities. Mol. Cell. Biol. 1994, 14, 3719-3728.
9. MacKerell Jr., A.D., Wiórkiewicz-Kuczera, J. and Karplus, M. An all-atom empirical energy function for the simulation of nucleic acids. J. Am. Chem. Soc. 1995, 117, 11946-11975.
10. Cornell, W.D., Cieplak, P., Bayly, C.L., Gould, I.R., Merz Jr., K.M., Ferguson, D.M., Spellmeyer, D.C., Fox, T., Caldwell, J.W. and Kollman, P.A. A second generation force field for the simulation of proteins, nucleic acids and organic molecules. J. Am. Chem. Soc. 1995, 117, 5179-5197.
11. Olson, W.K. Simulating DNA at low resolution. Curr. Opinion in Struct. Biol. 1996, 6, 242-256.