amber

更新时间:2023-10-24 17:07:01 阅读量: 综合文库 文档下载

说明:文章内容仅供预览,部分内容可能不全。下载后的文档,内容与下面显示的完全一致。下载之前请确认下面内容是否您想要的,是否完整无缺。

(Note: These tutorials教程 are meant to provide illustrative 有说服力的examples of how to use the AMBER software suite一套 to carry out simulations that can be run on a simple workstation in a reasonable合理的 period of time. They do not necessarily provide the optimal最佳的 choice of parameters or methods for the particular application area.) Copyright Ross Walker 2010 TUTORIAL B1

Simulating 模拟a DNA polyA-polyT Decamer十倍体 (Updated for Amber 10 & AmberTools 1.2) By Ross Walker Pictured above is the average structure from a 1 nanosecond纳秒 molecular dynamics simulation of a 10 base pair poly(A)-polt(T) DNA duplex二体. The calculation was run in explicit显示的 solvent溶剂 using periodic 周期的boundaries边界 and the particle mesh Ewald method of treating long range electrostatics静电. The average structure was generated using ptraj by RMS fitting all of the DNA atoms in 1,000 snapshots快照 at 1 ps intervals间隔 and then averaging the coordinates坐标. 1) Introduction

The purpose of this tutorial is to provide an initial最初的 introduction to setting up and running simulations using the AMBER software (It is based on AMBER 10 and AmberTools 1.2 but should work reasonably well with AMBER 9 and with newer versions 版本of the AMBER software as they are released释放发布). In this tutorial we run a series of simulations on a poly(A)-poly(T) decamer of DNA. We will first figure out how to generate a starting structure and then use this structure to construct the necessary input files for running sander, the main molecular dynamics engine supplied with AMBER 10.

In order to run a classical molecular dynamics simulation with Sander a number of files文件文档 are required. These are (using their default默认 filenames):

prmtop - a file containing a description of the molecular topology and the necessary force field parameters.

分子力场有时被称为势函数,力场参数就是经验参数。以下是一般分子力场势函数包括的几个部分:

描述分子内成键作用的项

键伸缩能:构成分子的各个化学键在键轴方向上的伸缩运动所引起的能量变化 键角弯曲能:键角变化引起的分子能量变化

二面角扭曲能:单键旋转引起分子骨架扭曲所产生的能量变化 交叉能量项:上述作用之间耦合引起的能量变化 描述分子间作用的项

非键相互作用:包括范德华力、静电相互作用等与能量有关的非键相互作用 橡皮几何学-拓扑学

莫比乌斯带是一种拓扑图形,什么是拓扑呢?拓扑所研究的是几何图形的一些性质,它们在图形被弯曲、拉大、缩小或任意的变形下保持不变,只要在变形过程中不使原来不同的点重合为同一个点,又不产生新点。换句话说,这种变换的条件是:在原来图形的点与变换了图形的点之间存在着一一对应的关系,并且邻近的点还是邻近的点。这样的变换叫做拓扑变换。拓扑有一个形象说法——橡皮几何学。因为如果图形都是用橡皮做成的,就能把许多图形进行拓扑变换。例如一个橡皮圈能变形成一个圆圈或一个方圈。但是一个橡皮圈不能由拓扑变换成为一个阿拉伯数字8。因为不把圈上的两个点重合在一起,圈就不会变成8,“莫比乌斯带”正好满足了上述要求。右下角是三角形麦比乌斯带,左端绿色与右端黄色相连,扭曲的三角形莫比乌斯带可以不断循环:绿--黄---红---绿---黄---…。拓扑变换的不变性、不变量还有很多,这里不再介绍。

inpcrd (or a restrt from a previous run) - a file containing a description of the atom coordinates and optionally velocities随意速度 and current periodic box dimensions大小.

mdin - the sander input file consisting of a series of namelists and control variables that determine the options and type of simulation to be run.

In the first section of this tutorial we shall use the tools provided with AmberTools 1.2 to create prmtop and inpcrd files for both in vacuo 真空and solvated溶剂化物 systems. We will then run sander to perform minimisation followed by molecular dynamics and eventually get to the point where we can reproduce the picture shown above.

Since running these simulations using explicit显性的 solvent can be expensive, we will also use some models that include solvent溶剂 effects implicitly.隐性的 The approximate大致的 order of this tutorial will be as follows:

Create the prmtop and inpcrd files: This is a description of how to generate the initial structure and set up the molecular topology/parameter and coordinate files necessary for performing minimisation or dynamics with sander.

An introduction to minimisation and molecular dynamics. Run short MD simulations in-vacuo. Perform basic analysis such as calculating root-mean-squared deviations (RMSd) and plotting various energy terms as a function of time. Visualising 可视化results with VMD.

Minimisation and molecular dynamics in implicit solvent: Setting up and running equilibration and production minimisation and molecular dynamics simulations for our DNA model using the Born implicit solvent model.

Minimisation and molecular dynamics in explicit solvent: Setting up and running equilibration and production simulations for our DNA model using TIP3P explicit water.

Throughout this tutorial filenames and command line switches开关 will be written in courier or an equivalent monospace font while program names such as sander will be written in the same font but italicised. Input files will be coloured in red and output files in green.

2) Setting up duplex DNA: polyA-polyT

The first stage舞台,阶段 of this tutorial is to create the necessary input files required by sander for performing minimization and molecular dynamics.

There are a number of methods for creating these input files. In this example we will use a program written specifically for this purpose: LEaP. This is a program that reads in force field, topology and coordinate information and produces the files necessary for production calculations (i.e. minimization, molecular dynamics, analysis, ...). There are two versions of this program provided with AMBER 10 & AmberTools 1.2. A graphical version called xleap and a terminal 终端interface接口 called tleap. (There are also completely new versions being produced called gleap and sleap respectively分别 that will ultimately replace xleap and tleap but discussion of this is beyond the scope of this tutorial.) Since we want to \our models we will use xleap in this tutorial.

The approximate大概 ordering of this section is as follows:

Where do I get the coordinates?

What representation should I use and what should I simulate? A discussion of issues to consider before starting...

Building the prmtop and inpcrd files... This will be done using xleap.

2.1) Generating the coordinates of the model structure

The first step in any modeling project is developing the initial model structure. Although in principle, one could use xleap to build a model structure by hand this is only practical for the smallest of systems. The difficulty in both manipulating操作 and predicting the structure of large biomolecular systems means that building a structure by hand is not usually a sensible undertaking事业企业任务. Instead, experimentally determined structures are used. These can be found by searching through databases of crystal or NMR structures such as the Protein Data Bank or the Cambridge Structural Database. With nucleic acids, users can also search the Nucleic Acid Database.

When experimental structures are not available all hope is not lost since there are a variety of programs that facilitate帮助 building model structures using homology modeling and predictive techniques; the list of possible sources is beyond the scope of this tutorial. However, it is worth mentioning that for nucleic acid structure prediction Dave Case and Tom Macke, formerly of The Scripps Research Institute have

developed the NAB molecular manipulation操作 language which facilitates the building of complex nucleic acid structures. NAB is now included as part of the AmberTools package (see http://www.ambermd.org/ for more info). Numerous methods also exist for predicting the structure of proteins but in general such structure prediction is still in its infancy婴儿初期. Thus a good experimental structure is typically preferred. If, however, a predicted protein starting structure is all that is available it should be noted that these typically require more elaborate详细的 minimization and equilibration procedures程序 prior to production dynamics simulations than do structures found by experimental methods.

Hurdle #1: Note: as pointed out in the later tutorials, sometimes dealing with Brookhaven PDB files (as provided by the Protein Data Bank) can be rather tricky 棘手的due to variations in naming conventions习惯会议. The naming, and often formatting 规定issues争议, that \be overcome in order to perform detailed molecular simulation. Generally within AMBER, if you are using standard amino acid or nucleic acid residues, at most all that is necessary is some slight massaging of the PDB format and atom names. If, however, you wish to simulate complex carbohydrates, lipids or non-standard protein residues then you may be required to develop parameters and topology information. This, however, is a more advanced issue that will be deferred until later tutorials.

2.1.1) Creating our DNA duplex using NAB

Pictured above are models of the decamer poly(A)-poly(T) shown end-on (top) and with a view into the minor (top half of the molecule) and major (bottom half of the molecule) groove (bottom). On the left is canonical A-DNA, the middle is the average structure previously discussed, and on the right is canonical B-DNA. The method for creating the canonical structures is discussed below.

Since we don't have an experimentally determined structure for our 10-mer DNA duplex, we are going to have to create one from scratch. A good resource is w3dna.rutgers.edu, which allows one to create a wide variety of nucleic acid structures. Here, we will use the fd_helix() routine in NAB (part of AmberTools)

A note on force fields

A number of different force fields are supplied with AMBER. In AMBER v5.0 and v6.0 the default field was the Cornell et al. (1995) or parm94.dat force field (referred to as FF94 in AMBER v8.0 and later). The AMBER v10.0 force fields recommended for the simulation of proteins and nucleic acids in explicit solvent are either the FF99SB or FF03 force fields which contain several improvements over the FF94 force field. The most notable changes are new partial charges for proteins based on DFT quantum量子 calculations in continuum solvent (FF03), as well as updated

torsion扭力 terms for Phi-Psi angles (FF99SB) which improve the over estimation of alpha helices螺旋 that occurs when using the FF94, FF96 and FF99 force fields. It should be noted, however, that the FF99SB/FF03 force fields do not introduce any new changes for nucleic acids. The charges are still based on HF gas phase ab initio quantum calculations and the bond angle and dihedral二面角 parameters are the same as the FF99 force field hence FF99SB and FF99 can be considered equivalent in this context. In this tutorial we will be using the FF99SB all-atom force field. More details on the available force fields can be found in the AMBER manual手册 and the papers referenced within.

With the FF99SB force field, phosphates are considered to be part of the nucleotide (as is standard). Because the terminal groups of DNA are different (i.e. 3'- and 5'- terminal residues), the names have to be different to associate the appropriate适当的 topology (i.e. list of bonds, atoms, etc.) and parameters. For DNA, the residue names used are DA5, DA and DA3 for a 5' terminal adenine, a non terminal adenine, or a 3' terminal adenine respectively. [A single isolated adenine nucleotide is DAN.]

So, with this in mind we can construct the input file required by NAB to build our 10-mer polyA-polyT DNA duplex in the Arnott B-DNA canonical structure. This program is given below. Basically, this is building two strands of A-T paired DNA. For more specific information about the various options, see the manual.)

molecule m;

m = fd_helix( \putpdb( \

Put the above into a plain-text file called nuc.nab.

NOTE: Before we run any of the programs provided with AMBER, we need to make sure the Unix shell environment variable that specifies where AMBER is installed is set properly. This is necessary for xleap and lets the system know where the executables are located ($AMBERHOME/exe).

If you are running bash, csh or tcsh check the AMBERHOME environment variable is set by typing:

echo $AMBERHOME

If you see:

AMBERHOME: Undefined variable. (csh or tcsh)

or simply a blank line (bash) then the AMBERHOME environment variable was not set properly and you need to initialize初始化 it to point to the AMBER installation 安装directory目录. If AMBER is installed in /usr/local/amber10 then you would

type:

setenv AMBERHOME /usr/local/amber10 (csh or tcsh) or

export AMBERHOME=/usr/local/amber10 (bash)

While you are at it, you may want to update your .cshrc (csh), or .bashrc (bash) file to define this variable and add the AMBER binary directory to your path (or list of directories searched for executables) at login. In your .bash_login file [or the global /etc/bashrc file] (if using the bash shell), add the following (substituting the correct path to AMBER):

export AMBERHOME=/usr/local/amber10 export PATH=$PATH:$AMBERHOME/exe

For further details on installation please refer to the AMBER manual.

Running NAB

To run NAB, you just type this:

nab nuc.nab ./a.out

This should produce a nuc.pdb file, which is the model structure of our DNA duplex. It contains the Cartesian coordinates of all of the atoms in the duplex in locations determined from fibre-diffraction data. The file should consist of 640 lines with a single line for each atom, 638 atoms in total as well as two lines containing the word TER, one after the first 10mer chain and one at the end:

LINE DEFINITION ATOM NO. ATOM NAME RESIDUE NAME RESIDUE NO. X Y Y

ATOM 1 HO5' DA 1 -0.423 -8.150 -2.094 ATOM 2 O5' DA 1 0.427 -7.826 -1.788 ... ... ... ... ... ... ... ...

2.1.2) Loading the structure into Leap

The next step is to take a look at the model structure. It is always a good idea to look at the models before trying to use them. In this way problems can often be identified before running expensive calculations. xleap works fine for displaying models assuming假设 that the appropriate residue definition files are loaded into xleap and the residue names in the pdb file are consistent with what xleap expects. Alternatively或者 a range of freely available and commercial商业的 packages exist for viewing

pdb files. A very good program, although not the only choice, is VMD (http://www.ks.uiuc.edu/Research/vmd/) which is freely available for academic research. For the moment we will stick to 坚持using xleap since this is what we will be using to create the input files for our simulations. Other methods of visualization will be covered in later sections of this tutorial.

In addition to the residue names xleap also requires information about the chain connectivity. Residues can be connected to other residues from both sides, only one side or not at all. In xleap lingo, the residue can have a \

head and tail: the residue has connectivity from both sides such as an internal内部 residue in a protein or nucleic acid. Examples are alanine \tail only: the residue is a terminal 端点residue at the start of a chain. Examples are the N-terminal residues of proteins \\

head only: the residue is a terminal residue at the end of a chain. Examples are the C-terminal residues of proteins \or the 3'-terminal residue of nucleic acids \

no head or tail: the residue doesn't connect to other residues. Examples are an isolated nucleotide \

The various different residues and their names are defined in library files that xleap loads on startup. More on this later. For the moment it is sufficient to say that the names used for the residues in the pdb files must match those defined in the default xleap library files or in user defined library files.

xleap is fairly simple in that it expects that all residues in the pdb file are connected in the order in which they are listed unless they are separated by a \card (i.e. a single line that says \residue is \important to remember this when initially creating the input files for your simulation. In our case NAB automatically自动地 added the TER cards for us.

Let's take a look at our DNA model. The first step is to start up the graphical version 版本of LEaP:

$AMBERHOME/exe/xleap -s -f $AMBERHOME/dat/leap/cmd/leaprc.ff99SB

This should load xleap and you should see something like this appear:

A quick note on the command line used to start xleap: The command line shown above contains a couple of options which are worthy of comment 评论at this point. When xleap loads it initially has to open a series of library and parameter files that

define the force field parameters to be used and the residue maps etc. Since AmberTools ships with a range of different force fields, each suited to different types of simulation, it is important to tell xleap which force field we wish to use. This is what the command line options used above are for. The \switch tells xleap to ignore any user defined defaults which might otherwise override our selection, while the \执行the start-up开始阶段的 script脚本 for the FF99SB force field. This script contains commands that cause xleap to load all of the configuration files required for the AMBER FF99SB force field. If you look in the $AMBERHOME/dat/leap/cmd/ directory you will find a number of different leaprc files such as leaprc.ff03 (FF03 force field), leaprc.ff02ep (FF02 Polarizable force field with lone pairs) etc.

XLEAP MENU BUG: Note if you find that the menu's in xleap are not working please check that your numlock light is turned off. For some reason having numlock on prevents the xleap menus from operating correctly.

To load a pdb into xleap we can use the loadpdb command. This will create a new unit in xleap and load the specified pdb into that unit. We can then subsequently 随后view and edit the new unit using the edit command.

So, to load the pdb file we just created into a new unit called \type the following in the xleap window (make sure the xleap window is highlighted 突出and your mouse cursor光标 is within the window):

dna1=loadpdb \

This should result in output, similar to the following, appearing in the xleap window:

> dna1=loadpdb \Loading PDB file: ./nuc.pdb total atoms in file: 638 >

NOTE: If you receive an error about unrecognized atoms then you are likely using a version of AMBER prior to 10.0 or version 10.0 but without the latest bugfixes. You can either update your version of AMBER or use the following pdb file instead. (nuc_old.pdb)

To look at the structure, in xleap, use the edit command and specify the unit you wish to edit, in this case the one we just created \

》edit dna1

This should open the xleap editor window and display展示 the molecule.

Within this window the LEFT mouse button allows atom selection (drag and drop), the MIDDLE mouse button rotates旋转 the molecule and the RIGHT mouse button translates跳动 it. If the MIDDLE button does not work you can simulate it by holding down the Ctrl key and the LEFT mouse button. To make the image larger or smaller, simultaneously hold down the MIDDLE and RIGHT buttons (or Ctrl and RIGHT) and move the mouse up to zoom in or down to zoom out.

If you played with selecting atoms using the left mouse button you can unselect a region by holding down the shift key while drawing the selection rectangle. To select everything, double click the LEFT button (and to unselect, do the same while holding down the shift key).

Take a look at the structure. You should be able to see the perfect symmetry of canonical B-form geometry DNA. The perfect symmetry of canonical duplexes is based on analysis of long fibers of DNA. Real nucleic acids don't necessarily adopt this perfect symmetry as will become apparent once we start to carry out molecular dynamics on this 10-mer.

2.2) What level of simulation am I going to attempt试图?

Once you have got a suitable model structure the next step is to decide what level of simulation realism is to be used. The complexity复杂度 of the calculation centers on the evaluation of the pairwise nonbonded and Coulombic interactions. Extra complexity is introduced by using periodic boundaries and Ewald methods to treat long ranged electrostatics and/or by evaluating non-additive effects such as induced polarization.

Water is an integral完整的 part of nucleic acid structure and thus some representation of solvent effects is fairly critical关键. Simulations in vacuo have been performed where the screening of solvent is modeled by distance dependent or sigmoidal dielectric functions (the latter of which is not implemented in AMBER 10.0). Additionally tricks have been applied to keep the base pairs from fraying, through the addition of Watson-Crick base pair restraints and the reduction of the charges on the phosphate groups. Newer versions of AMBER (6.0 and above) contain the generalized Born model for implicit solvation which, although more expensive, provides a much better implicit solvent representation than simply using a distance dependent dielectric constant.

However, even with advances in computer power and methodological improvements, such as application of Ewald methods, which allow routine simulations of nucleic acids with explicit solvent and counterions in the nanosecond time range, there is still dependence of the results on the molecular mechanical force field. It is therefore important to understand the inadequacies of the force field being used.

Now, back to the point of this discussion, if you can afford it, include the solvent and the explicit net-neutralizing counterions. Also pay attention to the force field applied and be aware of its limitations. Use methods which properly treat long range electrostatic interactions, such as Ewald methods. However, remember that adding explicit water is expensive. While a nanosecond or so of in vacuo DNA simulation can take only minutes on a 3GHz P4, adding a periodic box of water that surrounds the DNA by roughly大概 10 angstroms, extends 扩大the simulation to several days. Given that it is normally necessary to run the simulation a couple of times (due to errors, sampling issues, etc.), these simulations can get very costly.

2.2.1) The types of simulations to be run in this tutorial

For the purpose of this tutorial we will build 3 different DNA models. The first will be an in vacuo model of the poly(A)-poly(T) structure (named polyAT_vac), an in vacuo model of the poly(A)-poly(T) structure with explicit counterions (named polyAT_cio), and a TIP3P (water) solvated model of the poly(A)-poly(T) structure in a periodic box (named polyAT_wat). The in vacuo model will be applied in simulations to get a feel for MD and then the solvated model will be used for periodic boundary simulations using a particle mesh Ewald treatment. The in vacuo model with the explicit ions will not be used for simulation but it is a good idea to build it in case it is needed for later analysis.

In order simplify post simulation analysis of the trajectories轨迹 it is useful to have all three sets of prmtop files. This is because often in the analysis of the trajectory displaying the solvent is not normally necessary and the visualization packages will run much faster if the solvent is removed from the trajectory file before loading. Obviously the water is necessary for calculating radial distribution functions, analyzing water structure, and other properties, however it isn't necessary for calculating helicoidal parameters, determining average structures, etc. Therefore, to minimize disk space usage, and speed the analysis, we often strip除去 the water and/or counterions. The three separate prmtop files are useful to have around since you need to use a prmtop that matches the structure of your (possibly stripped) trajectory for programs such as ptraj, rdparm, VMD etc.

2.3) Building the prmtop and inpcrd files

Now that we have a starting pdb (nuc.pdb) and an understanding of some of the issues surrounding different types of classical MD simulations we are ready to start building the input files necessary for the MD engine in AMBER 10.0, sander.

The first step is the building of residues. Many proteins contain coenzymes as well as

standard amino acids. These coenzymes are not normally pre-defined in the AMBER database and so are considered to be non-standard residues. It is necessary to provide structural information and force field parameters for all of the non-standard residues that will be present in your simulation before you can create the sander input files. Fortunately, if you are using standard nucleic acid or amino acid residues, as we will be in this tutorial, this step is not necessary since all of the residues are pre-built in the AMBER database. Later tutorials will cover what to do if you have non-standard residues.

2.3.1) LEaP

If xleap is not already running, start it up.

$AMBERHOME/exe/xleap -s -f $AMBERHOME/dat/leap/cmd/leaprc.ff99SB

In so doing, you should have seen that the appropriate libraries were loaded, as specified by the -f flag标志旗帜.

To see if you have the residues we need for this example, you can check to see if the residues have been properly loaded by trying to edit the unit DA5, i.e. in xleap:

edit DA5

If you see something similar to the following, i.e. the unit is not blank, then the residues were loaded correctly.

Close the window by selecting Unit -> Close from the top menu [Do not use the X button as it will quit all of XLEaP]. (If the menu's don't work turn off NumLock). You can also list all the defined residues in LEaP by typing \You should see the following:

> list

ACE ALA ARG ASH ASN ASP CALA CARG

CASN CASP CCYS CCYX CGLN CGLU CGLY CHCL3BOX

CHID CHIE CHIP CHIS CILE CIO CLEU CLYS CMET CPHE CPRO CSER CTHR CTRP CTYR CVAL CYM CYS CYX Cl- Cs+ DA DA3 DA5 DAN DC DC3 DC4 DC5 DCN DG DG3 DG5 DGN DT DT3 DT5 DTN GLH GLN GLU GLY HID HIE HIP HIS HOH IB ILE K+ LEU LYN LYS Li+ MEOHBOX MET

MG2 NALA NARG NASN NASP NCYS NCYX NGLN NGLU NGLY NHE NHID NHIE NHIP NHIS NILE NLEU NLYS NMABOX NME NMET NPHE NPRO NSER NTHR NTRP NTYR NVAL Na+ PHE PL3 POL3BOX PRO RA RA3 RA5 RAN RC RC3 RC5 RCN RG RG3 RG5 RGN RU RU3 RU5

RUN Rb+ SER SPC SPCBOX THR TIP3PBOXTIP4PBOX TP3 TP4 TP5 TRP TYR VAL WAT parm99 >

Note: for a list of available xleap commands type \For help on a specific command type \should get the following on typing \

> help loadpdb

variable = loadPdb filename STRING _filename_

Load a Protein Databank format file with the file name _filename_. The sequence numbers of the RESIDUEs will be determined from the order of residues within the PDB file ATOM records. For each residue in the PDB file, LEaP searches the variables currently defined for variable names that match the residue name. If a match is found, then the contents of the variable are copied into the UNIT created for the PDB structure. If no PDB `TER' card separates the current residue from the previous one, a bond is created between the connect1 ATOM of the previous residue and the connect0 atom of the new one. As atoms are read from the ATOM records, their coordinates are written into the correspondingly相应的 named ATOMs within the residue being built. If the entire residue is read and it is found that ATOM coordinates are missing, then external coordinates are built from the internal coordinates that were defined in the matching UNIT (residue) variable. This allows LEaP to build coordinates for hydrogens and lone pairs which are not specified in PDB files.

Now, let's go back to the beginning and assume everything was set up properly; the distraction above was simply to give you a little insight to what goes on behind the scenes... Remember, using software like it is a black box is dangerous, especially in research.

Following this rather long aside, we can now return to where we were before. Remember that we had loaded our model:

dna1 = loadpdb \

To look at the newly created \

edit dna1

This should cause the unit editor window to pop up as shown below:

To create our first prmtop and inpcrd files, we simply issue问题发出 the following command in the main xleap window:

saveamberparm dna1 polyAT_vac.prmtop polyAT_vac.inpcrd

This should give you the following output (the warning concerns the fact that we did not neutralize our system - more on this later):

> saveamberparm dna1 polyAT_vac.prmtop polyAT_vac.inpcrd Checking Unit.

WARNING: The unperturbed charge of the unit: -18.000000 is not zero.

-- ignoring the warning.

Building topology.

Building atom parameters. Building bond parameters. Building angle parameters.

Building proper torsion parameters. Building improper torsion parameters. total 110 improper torsions applied Building H-Bond parameters.

Not Marking per-residue atom chain types. Marking per-residue atom chain types. (no restraints)

and create the following two files:

polyAT_vac.prmtop polyAT_vac.inpcrd

To remind you about these files:

prmtop: The parameter/topology file. This defines the connectivity and parameters for our current model. This information is static, or in other words, it doesn't change during the simulation. The prmtop we created above is called polyAT_vac.prmtop. inpcrd: The coordinates (and optionally box coordinates and velocities). This is data is not static and changes during the simulations (although the file is unaltered). Above we created an initial set of coordinates called polyAT_vac.inpcrd.

Now we want to create a topology that has explicit net neutralizing counterions. There

are a number of different ways to add ions to a structure. In this example we shall use the addions command implemented in xleap. This method works by constructing a Coulombic potential on a 1.0 angstrom grid and then placing counterions one at a time at the points of lowest/highest electrostatic potential. The command to do this is as follows (the '0' means 'neutralize'):

addions dna1 Na+ 0

This should add a total of 18 sodium anions to counteract the -18 charge of the DNA chain. (Note: The command works on integers and so if your system charge is -17.999 then it will only add 17 counterions. In this case the simple solution is to explicitly specify the number of counter ions you need in place of the zero.)

The output from this command should be similar to the following:

> addions dna1 Na+ 0

18 Na+ ions required to neutralize.

Adding 18 counter ions to \Grid extends from solute vdw + 1.87 to 7.97 Resolution: 1.00 Angstrom. grid build: 0 sec (no solvent present) Calculating grid charges charges: 0 sec

Placed Na+ in dna1 at (-0.73, 10.83, 18.51). Placed Na+ in dna1 at (6.27, -8.17, 18.51). Placed Na+ in dna1 at (10.27, 4.83, 11.51). Placed Na+ in dna1 at (-6.73, -8.17, 11.51). Placed Na+ in dna1 at (-5.73, 3.83, 13.51). Placed Na+ in dna1 at (-9.73, 3.83, 25.51). Placed Na+ in dna1 at (6.27, -8.17, 5.51). Placed Na+ in dna1 at (-5.73, 8.83, 1.51). Placed Na+ in dna1 at (11.27, 1.83, 25.51). Placed Na+ in dna1 at (-10.73, -4.17, 28.51). Placed Na+ in dna1 at (5.27, 3.83, 3.51). Placed Na+ in dna1 at (2.27, -6.17, 27.51). Placed Na+ in dna1 at (-10.73, -3.17, 8.51). Placed Na+ in dna1 at (6.27, 7.83, 15.51). Placed Na+ in dna1 at (10.27, -3.17, 8.51). Placed Na+ in dna1 at (-1.73, -12.17, 16.51). Placed Na+ in dna1 at (-9.73, 3.83, 4.51). Placed Na+ in dna1 at (-7.73, 8.83, 21.51).

Done adding ions.

Note: you should always check that the number of ions you were expecting have actually been added. It is also a good idea to view the new structure to ensure that the charges have been placed as intended. By editing the \we can see where the ions have been added:

edit dna1

Now we are once again ready to write the prmtop and inpcrd files, this time for our neutralized system:

saveamberparm dna1 polyAT_cio.prmtop polyAT_cio.inpcrd

Output files: polyAT_cio.prmtop, polyAT_cio.inpcrd

The final input files to create are for solvated DNA with explicit counterions. We have our %unit already built with counterions so the next step is to solvate it with explicit water. This is done with the command \put an 8 angstrom buffer of TIP3P water around the DNA in each direction. In this way all atoms in the DNA starting structure will be no less than 8 angstroms from the edge of the water box. Before we do this, however, for reasons that will become clear later we should create a copy of our dna1 and call it dna2:

dna2 = copy dna1

The following creates a rectangular box of water around the DNA:

solvatebox dna1 TIP3PBOX 8.0

This results in the following output (exact numbers may be slightly different due to round off differences between different computer architectures), editing the \(edit dna1) should show you the DNA in a water box:

> solvatebox dna1 TIP3PBOX 8.0

Solute vdw bounding box: 25.736 26.736 40.240 Total bounding box for atom centers: 41.736 42.736 56.240 Solvent unit box: 18.774 18.774 18.774

Total vdw box size: 44.537 45.963 58.910 angstroms. Volume: 120593.276 A^3

Total mass 53972.060 amu, Density 0.743 g/cc Added 2638 residues. edit dna1

The above output shows us that xleap added a total of 2638 water molecules to form a rectangular box of 44.5 x 46.0 x 58.9 angstroms (120593.3 angstroms3). This is not cubic since DNA is a cylindrical molecule. An issue here is that the long axis of DNA could rotate (via self diffusion) such that the long axis was along the short box dimension which will, since this box will be infinitely repeated in space by the periodic boundary method, bring the ends of the DNA near their periodic images. One way to get around this would be to make the box cubic, or 58.9 x 58.9 x 58.9 angstroms, by specifying a list of numbers to the solvateBox command to force this to be cubic. However, this will add significantly显著的 more water to the calculation and slow it down tremendously. Alternatively we can use a different shape box of water. While a rectangular box is the obvious choice for tessellating in 3 dimensional space it is not the only shape that can be replicated in 3 dimensions. A more efficient shape to use, in terms of reducing the problem of solute rotation, and the one we will be using for this tutorial, is a truncated octahedron:

To add a truncated octahedral box of water around our DNA we use the solvateoct command. Since in the course of this demonstration we have already solvated our \the following in xleap to create the water box:

solvateoct dna2 TIP3PBOX 8.0

This should give the following output:

> solvateoct dna2 TIP3PBOX 8.0

Scaling up box by a factor of 1.320477 to meet diagonal cut criterion Solute vdw bounding box: 24.918 26.588 39.153 Total bounding box for atom centers: 60.281 60.281 60.281 (box expansion for 'iso' is 65.4%)

Solvent unit box: 18.774 18.774 18.774 Volume: 115065.025 A^3 (oct)

Total mass 59917.340 amu, Density 0.865 g/cc Added 2968 residues.

Editing \

edit dna2

There you have it, a truncated octahedral shaped ice cube...

Once again we save our AMBER prmtop and inpcrd files:

saveamberparm dna2 polyAT_wat.prmtop polyAT_wat.inpcrd

Output files: polyAT_wat.prmtop, polyAT_wat.inpcrd

Now we have our input files we can progress前进 to the next section which introduces running minimization and molecular dynamics..

3) Running Minimization and Molecular Dynamics (in vacuo)

This section will introduce sander and show how it can be used for minimization and molecular dynamics of our previously created DNA models. We will initially run our simulations on the in vacuo model and analyze the results before moving on to running simulations with implicit and explicit solvent. For this section of the tutorial we shall be using the in vacuo prmtop and inpcrd files we created previously (polyAT_vac.prmtop and polyAT_vac.inpcrd).

This section of the tutorial will consist of 3 stages:

Relaxing the system prior to MD

Molecular dynamics at constant temperature Analyzing the results

3.1) Relaxing the System Prior to MD

In the previous section we used NAB to build us a starting structure. Since this %using and may also result in conflicts and overlaps with atoms in other residues, it is always a good idea to minimize the structure before commencing molecular dynamics. Failure to successfully minimize the structure may lead to instabilities when we run MD.

So, given the in vacuo prmtop (polyAT_vac.prmtop) and inpcrd (polyAT_vac.inpcrd) files we created, we will now use sander to conduct a short minimization run. Since we just want to %up\the positions of the atoms in order to remove any bad

contacts that may lead to unstable molecular dynamics we will run a short (500 steps) minimization. This will take us towards the closest local minima. Minimization with sander will only ever take you to the nearest minima, it cannot cross transition states to reach lower minima. This is fine for our purposes, however, since all we want to do is remove the largest strains in the system.

The basic usage for sander is as follows:

sander [-O] -i mdin -o mdout -p prmtop -c inpcrd -r restrt [-ref refc] [-x mdcrd] [-v mdvel] [-e mden] [-inf mdinfo]

Arguments in []'s are optional

If an argument is not specified, the default name will be used.

-O overwrite all output files (the default behavior is to quit if any output files already exist)

-i the name of the input file (which describes the simulation options), mdin by default.

-o the name of the output file, mdout by default. -p the parameter/topology file, prmtop by default.

-c the set of initial coordinates for this run, inpcrd by default.

-r the final set of coordinates from this MD or minimization run, restrt by default.

-ref reference coordinates for positional restraints, if this option is specified in the input file, refc by default.

-x the molecular dynamics trajectory file (if running MD), mdcrd by default. -v the molecular dynamics velocities file (if running MD), mdvel by default. -e a summary file of the energies (if running MD), mden by default.

-inf a summary file written every time energy information is printed in the output file for the current step of the minimization of MD, useful for checking on the progress of a simulation, mdinfo by default.

3.1.1) Building the mdin input file

Now that we have the prmtop and inpcrd files from xleap, all we need to run sander is the mdin file which specifies the myriad of possible options for this run.

The run time input to control sander is via \manual) specified in the mdin file. For example:

Test run 1 &cntrl

IMIN = 1, NCYC = 250, MAXCYC = 500 /

In the absence of a variable specification in the input file, default values are chosen; every specified variable, except the last one, needs to be followed by a comma. The sander manual describes all of these inputs, for each of the possible namelists. Which namelist is used depends on the specification above, such as &cntrl or &ewald. At a minimum the &cntrl namelist must be specified. Also notice the space or empty first column before specification of the namelist control variable; this is necessary. It is also necessary to end each namelist with a forward slash /. Other namelists (such as the &ewald namelist) are optional. After the namelist some other information may be specified, such as \that the input variable and namelists have changed somewhat from earlier versions of AMBER. Refer to the sander section of the manual for more information.

Next we will build a minimal input file for performing minimization of our DNA. In theory it would probably be best to run a dual stage minimization where we initially use position restraints on all the heavy atoms so that in the first stage of minimization only the hydrogens that xleap added are minimized. Then in the second stage we allow minimization of all atoms in the system. Since our system is fairly small and simplistic it should be fine to skip the first stage and just minimize everything. An example of such a two stage minimization approach will be given when we run simulations on our more complex solvated model in the next section.

To create our input file: To turn on minimization, we specify IMIN = 1. We want a fairly short minimization since we don't actually need to reach the minima, just move away from any local maxima, so we select 500 steps of minimization by specifying MAXCYC = 500. Sander supports a number of different minimization algorithms, the most commonly used being steepest descent and conjugate gradient. The steepest descent algorithm is good for quickly removing the largest strains in the system but converges slowly when close to a minima. Here the conjugate gradient method is more efficient. The use of these two algorithms can be controlled using the NCYC flag. If NCYC < MAXCYC sander will use the steepest descent algorithm for the first NCYC steps before switching to the conjugate gradient algorithm for the remaining (MAXCYC - NCYC) steps. In this case we will run an equal number of steps with each algorithm so we set NCYC = 250. Since sander assumes that the system is periodic by default we need to explicitly turn this off (NTB = 0). In this simulation we will be using a constant dielectric and not an implicit (or explicit) solvent model so we set IGB = 0 (no generalized born solvation model), this is the default so we strictly don't need to specify this but I will include it here so that we can see what differences in the input file we have when we switch on implicit solvent later. We also need to choose a value for the non-bonded cut off. A larger cut off introduces less error in the non-bonded force evaluation but increases the computational complexity and thus calculation time. 12 angstroms would seem like a reasonable tradeoff for gas phase so that is what we will initially use (CUT = 12). Here is what the input file looks like:

polyA-polyT 10-mer: initial minimization prior to MD &cntrl

imin = 1, maxcyc = 500, ncyc = 250, ntb = 0, igb = 0, cut = 12 /

So, now to run sander for minimization.

3.1.2) Running sander for the first time

To run sander, we simply execute the following:

$AMBERHOME/exe/sander -O -i polyAT_vac_init_min.in -o polyAT_vac_init_min.out -c polyAT_vac.inpcrd -p polyAT_vac.prmtop -r polyAT_vac_init_min.rst

Input files: polyAT_vac_init_min.in, polyAT_vac.inpcrd, polyAT_vac.prmtop Output files: polyAT_vac_init_min.out, polyAT_vac_init_min.rst

This should run very quickly. On a 3GHz Pentium 4 machine it takes about 4.3 seconds

Take a look at the output file produced during the minimization (polyAT_vac_init_min.out). You will see that the energy dropped considerably between the first and last steps:

NSTEP ENERGY RMS GMAX NAME NUMBER

1 9.2064E+01 4.9542E+01 6.8763E+02 O5' 321

NSTEP ENERGY RMS GMAX NAME NUMBER

500 -2.2929E+03 5.7810E-01 3.4109E+00 N1 362

In spite of this, however, the structure did not change very much. This is because, as already mentioned, minimization will only find the nearest local minima. If you create PDB files for the start (polyAT_vac.inpcrd) and final structures (polyAT_vac_init_min.rst) and compare the root-mean-squared deviations (RMSd),

you will see that the two structures differ by only about 0.5 angstroms (all atom).

Initial structure = Green, Minimized structure = Blue

3.1.3) Creating PDB files from the AMBER coordinate files

You may want to generate a new pdb file so you can look at the structure using the minimized coordinates from polyAT_vac_init_min.rst. A pdb file can be created from the parm topology and coordinates (inpcrd or restrt) using the program ambpdb.

$AMBERHOME/exe/ambpdb -p polyAT_vac.prmtop < polyAT_vac_init_min.rst > polyAT_vac_init_min.pdb

This will take the prmtop and inpcrd file specified (in this case the final structure from the minimization, the .rst file) and create (on stdout) a pdb file. In this case we redirected stdout to the file - polyAT_vac_init_min.pdb.

In general, users should carefully inspect any starting structures and minimized structures; specifically check to make sure the hydrogens were placed where you thought they should be, histidines are in the correct protonation state, the terminal residues are properly terminated, stereochemistry is reasonable, etc. There is nothing worse than finding out after you've run a nanosecond of solvated dynamics that an H1' atom was on the wrong side and that you have simulated some strange anomer of DNA!

3.2) Running MD in-vacuo

In section 3.1 we used sander to minimize our system in order to remove any bad contacts introduced by the hydrogenation step in xleap. This led to the creation of the coordinate file named polyAT_vac_init_min.rst. For the next section of this tutorial we will use this coordinate file as the starting structure for our in-vacuo MD simulation. The following in-vacuo simulation is designed to give a flavor of how MD simulations are run. In general one would not run an in-vacuo simulation unless the system was inherently gas phase. For liquid phase systems, such as our DNA, one would normally include solvent either explicitly or implicitly. This type of simulation

will be covered in section 4. For the time being we will stick to gas phase simulations since they are simple and quick to run.

To make the calculations tractable for the purposes of this tutorial, we can only run short simulations of the order of 100 ps. Although these are \terms of what is likely required to answer specific research questions), you will see that they are still rather costly, depending on what type of machine you using. However, to get a better idea of the issues involved and potential artifacts of the various models that will be used (for the simulation of DNA) it is probably useful to run through these examples, either running them yourself or just looking at the output files and trajectories supplied. You may also want to try some different examples from the two given below, such as a distance dependent dielectric constant, or see what happens if you increase the dielectric constant to say 80.0.

For this tutorial we will run two in-vacuo simulations and compare the results. The simulations we will run will be:

polyAT_vac_md1_12Acut.in: 12.0 angstrom long range cutoff, dielectric = 1 polyAT_vac_md1_nocut.in: no long range cutoff, dielectric = 1

To run a molecular dynamics simulation with sander, we need to turn off minimization (IMIN=0). Since we are running in-vacuo we also need to disable periodicity (NTB=0) and set IGB=0 since we are not using implicit solvent. For these two examples we will write information to the output file and trajectory coordinates file every 100 steps (NTPR=100, NTWX=100). For most simulations writing to the coordinate file every 100 steps is too frequently, it both consumes excess disk space and can impact performance, however, this simulation here will show some interesting behavior over a short time scale that I would like you to see and so we will use a value of 100 to ensure that this is suitably sampled. For multiple nano-second simulations of stable systems a more suitable value for NTWX would be in the 1000 to 2000 range. The CUT variable specifies the cut off range for the long range non-bonded interactions. Here two different values for the cutoff will be used, one run will be with a cut off of 12 angstroms (CUT = 12.0) and one run will be without a cutoff. To run without a cutoff we simply set CUT to be larger than the extent of the system (e.g. CUT = 999). For temperature regulation we will use the Langevin thermostat (NTT=3) to maintain the temperature of our system at 300 K. This temperature control method uses Langevin dynamics with a collision frequency given by GAMMA_LN. This temperature control method is significantly more efficient at equilibrating the system temperature than the Berendsen temperature coupling scheme (NTT=1) that was the recommended method for older versions of AMBER. The biggest problem with the Berendsen method is that the algorithm simply ensures that the kinetic energy is appropriate for the desired temperature; it does nothing to ensure that the temperature is even over all parts of the molecule. This can lead to the phenomenon of hot solvent, cold solute. To avoid this, elaborate temperature scaling techniques for slowly heating the molecule over the course of the simulation were

recommended. The Langevin system is much more efficient, however, at equilibrating the temperature and is now the recommended choice. The efficiency is such that if we have a reasonably good structure, which we do in this case, we can actually start the system at 300 K and avoid the need to slowly heat over say 20 ps from 0 K to room temperature. Use the Langevin temperature regulation scheme with care, however, since while it will allow you to equilibrate the temperature of your system efficiently it will alter the fast dynamics of your system. As such if you are interested in things such as correlation functions then it is often better to equilibrate your system using ntt=3 and then, once equilibrated pure Newtonian dynamics (ntt=0). It should also be noted that the performance of NTT=3 is less than a NVT simulation using NTT=1 or a NVE simulation. Hence it can often be beneficial to heat in NVT with NTT=3, equilibrate in NPT with NTT=3 and then run production as either NVE (NTT=0) or as NVT with NTT=1 and a weak coupling constant of say 10 ps-1. The exact choice though will depend on what you aim to learn from your simulation and so you should consult the literature for previous examples and discussions.

For simplicity we will run this entire simulation with NTT=3 and GAMMA_LN=1. We shall also set the initial and final temperatures to 300 K (TEMPI=300.0, TEMP0=300.0) which will mean our system's temperature should remain around 300 K. In these two examples we will run a total of 100,000 steps each (NSTLIM=100000) with a 1 fs time step (DT=0.001) giving simulation lengths of 100 ps (100,000 x 1fs). The two input files are shown below:

polyAT_vac_md1_12Acut.in polyAT_vac_md1_nocut.in 10-mer DNA MD in-vacuo, 12 angstrom cut off &cntrl

imin = 0, ntb = 0,

igb = 0, ntpr = 100, ntwx = 100, ntt = 3, gamma_ln = 1.0, tempi = 300.0, temp0 = 300.0 nstlim = 100000, dt = 0.001, cut = 12.0 /

10-mer DNA MD in-vacuo, infinite cut off &cntrl

imin = 0, ntb = 0,

igb = 0, ntpr = 100, ntwx = 100, ntt = 3, gamma_ln = 1.0, tempi = 300.0, temp0 = 300.0 nstlim = 100000, dt = 0.001, cut = 999 /

Caution: In the examples in this tutorial we do not change the value of the random seed used for the random number generator. This is controlled by the namelist variable ig. This is largely for issues of reproducibility of the results within a tutorial setting. However, when running production simulations, especially when using ntt=2 or 3 (Anderson or Langevin thermostats) it is essential that you change the random number seed from the default on EVERY MD restart. If you are using AMBER 10 (bugfix.26 or later) or AMBER 11 or later you can do this automatically by setting ig=-1 in the cntrl namelist. Otherwise you can specify a positive random number of your choosing for ig each time you restart a calculation. For more details on the pitfalls of not doing this you should refer to the following publication:

Cerutti DS, Duke, B., et al., \Vulnerability in Popular Molecular Dynamics Packages Concerning Langevin and Andersen Dynamics\

So, to run the two jobs we issue the following two commands. Note, if you have a dual (or more) processor machine then you can run both jobs simultaneously, otherwise you should run them sequentially (note we use the restrt file from the minimization as our starting structure):

$AMBERHOME/exe/sander -O -i polyAT_vac_md1_12Acut.in -o polyAT_vac_md1_12Acut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_12Acut.rst -x polyAT_vac_md1_12Acut.mdcrd

$AMBERHOME/exe/sander -O -i polyAT_vac_md1_nocut.in -o polyAT_vac_md1_nocut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_nocut.rst -x polyAT_vac_md1_nocut.mdcrd

These will take considerably longer to run than the minimization, so it is probably a good time to go off and get a cup of coffee. The 12 angstrom cut off run takes about 900s (15 mins) on a single P4 3GHz machine, if it is taking too long on your machine you can always reduce the simulation length by lowering NSTLIM to say 10,000 (10 ps). You can follow the progress of the job by following the output file with the following command:

tail -f polyAT_vac_md1_12Acut.out

Input files: polyAT_vac_md1_12Acut.in, polyAT_vac_md1_nocut.in, polyAT_vac_init_min.rst, polyAT_vac.prmtop

Output files: polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out, polyAT_vac_md1_12Acut.rst, polyAT_vac_md1_nocut.rst,

polyAT_vac_md1_12Acut.mdcrd, polyAT_vac_md1_nocut.mdcrd

Warning: the mdcrd files are over 15MB in size - you can download gzip compressed (5.3MB) versions here (polyAT_vac_md1_12Acut.mdcrd.gz, polyAT_vac_md1_nocut.mdcrd.gz). You will need to unzip them before use. e.g gunzip polyAT_vac_md1_12Acut.mdcrd.gz.

Note that we now have an additional option, the -x option which specifies the name of the output file for the MD trajectory. This file will contain a snapshot of the entire system's Cartesian coordinates every NTWX (100) steps.

Please be aware that the nocut simulation will likely stop after around 22,900 steps. This is not a bug, it is a problem with simulating DNA in vacuo which will become clear when we visualize the trajectory files. More on this later.

3.3) Analyzing the results

So, now we've run the simulations what do we want to look at? First off, we may want to look through the mdout files (polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out). It is also a good idea to extract the various energies as a function of time to plot total, kinetic, potential energies etc. We might also want to calculate RMSd vs. time, etc. To look at movies (which is much more fun :-) ) we can use a visualization program such as vmd to load up the \prmtop and MD trajectory) and use the animation controls to view it, more on this later.

3.3.1) Extracting the energies, etc. from the mdout file

One can write an awk or sed script to pull out the energies, etc. These can be pulled out of the mden file if energy information was written out (which we didn't do) or directly from the mdout file. The following is a perl script (process_mdout.perl) that has been designed to process mdout files and create a series of files with the various different pieces of information in them. The program uses a default output filename so it is best to create a sub directory for each of your output files and move to there before running the script e.g.

mkdir polyAT_vac_md1_12Acut cd polyAT_vac_md1_12Acut

perl process_mdout.perl ../polyAT_vac_md1_12Acut.out

You should repeat this process for the nocut output file, even though the simulation did not complete all 100,000 steps:

mkdir polyAT_vac_md1_nocut cd polyAT_vac_md1_nocut

process_mdout.perl ../polyAT_vac_md1_nocut.out

This script takes a series of mdout files and will create a series, leading off with the prefix \such as \of output files. These files are just columns of the time vs. the value for each of the energy components. Tar archives of the summary files for each of the two MD runs above are available here (polyAT_vac_md1_12Acut.summary.tar.gz, polyAT_vac_md1_nocut.summary.tar.gz).

You can plot the summary files with your favorite graphing program. A useful program for just quickly looking at a plot of a file is xmgr which was installed on older versions of Linux or xmgrace which should be on newer installations (if it isn't then just simply using a plotting package of your preference). xmgrace is a very simple plotting program, but ideal for our purpose. To plot the total potential energy, of both runs, as a function of time we run, assuming we are in our master simulation directory:

xmgrace ./polyAT_vac_md1_12Acut/summary.EPTOT ./polyAT_vac_md1_nocut/summary.EPTOT

This should give you a plot similar to the following:

The black line represents the potential energy for the 12 angstrom cutoff simulation vs. time while the red line represents the potential energy for the simulation without a cutoff. As we can see the 12 angstrom cutoff simulation seems to be fairly stable, the potential energy is fluctuating around a constant mean value. The simulation without a cutoff, however, is significantly different. To begin with the potential energy is some 3,000 kcal/mol higher than the 12 angstrom cutoff simulation. The potential energy is so large in fact that it is actually positive. The potential energy also decreases rapidly over the first 10 ps of simulation suggesting that a large structural change is occurring. The simulation then ends abruptly after 21.9ps with the following error:

Frac coord min, max: -5.132892457340335E-005 0.962793346753183 The system has extended beyond the extent of the virtual box. Restarting sander will recalculate

a new virtual box with 30 Angstroms extra on each side, if there is a restart file for this configuration.

SANDER BOMB in subroutine Routine: map_coords (ew_force.f) Atom out of bounds. If a restart has been written, restarting should resolve the error

Why the difference in potential energy? Well, by looking at our two output files this can be tracked down to the difference in the electrostatic energy. For the first step we have:

12 angstom cutoff

NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 298.54 PRESS = 0.0

Etot = -1726.9350 EKtot = 565.9592 EPtot = -2292.8942

BOND = 16.0918 ANGLE = 93.9850 DIHED = 392.9956

1-4 NB = 162.1781 1-4 EEL = -345.0298 VDWAALS = -346.6892

EELEC = -2266.4257 EHBOND = 0.0000 RESTRAINT = 0.0000 no cutoff

NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 298.54 PRESS = 0.0

Etot = 1885.9795 EKtot = 565.9592 EPtot = 1320.0203

BOND = 16.0918 ANGLE = 93.9850 DIHED = 392.9956

1-4 NB = 162.1781 1-4 EEL = -345.0298 VDWAALS = -349.3700

EELEC = 1349.1697 EHBOND = 0.0000 RESTRAINT = 0.0000

Notice how the value of EELEC is so much larger in the no cutoff simulation.

So, what has happened and how can we learn more? Well, first of all let's look at another type of analysis we can perform on the results, in this case analysis of the MD trajectory files (polyAT_vac_md1_12Acut.mdcrd, polyAT_vac_md1_nocut.mdcrd). From this we can hopefully understand what is happening.

3.3.2) Calculating the RMSd vs. time

The next step in analyzing our results is to calculate the RMSd as a function of time using ptraj (an analysis program provided with AmberTools). The precise parameter we will be calculating in this example is the mass weighted RMSd fit between each successive structure and the first structure of our trajectory. This is done by providing an input file to ptraj containing a list of commands describing the relevant files and what we want it to do etc.:

trajin polyAT_vac_md1_12Acut.mdcrd

rms first mass out polyAT_vac_md1_12Acut.rms time 0.1

where trajin specifies the name of the trajectory file to process, rms first mass, tells ptraj that we want it to calculate a MASS weighted, RMS fit to the FIRST structure. out specifies the name of the file to write the results to and time 0.1 tells ptraj that each frame of the coordinate file represents 0.1 ps. Since we have two simulations we have two input files, with different trajectory files and output files in each, polyAT_vac_md1_12Acut.calc_rms, polyAT_vac_md1_nocut.calc_rms.

Run the two ptraj jobs with the following commands:

$AMBERHOME/exe/ptraj polyAT_vac.prmtop < polyAT_vac_md1_12Acut.calc_rms

$AMBERHOME/exe/ptraj polyAT_vac.prmtop < polyAT_vac_md1_nocut.calc_rms

The output from each job will look something like this:

...read in prmtop file

Read in control variables Read in atom names... Read in charges... Read in masses...

Read in IAC (atoms involved in L-J)...

Read in NUMEX (index to excl atom list)... Read in NNO (index for nonbond of @type)... Read in residue labels...

Read in the residue to atom pointer list... Read in bond parameters RK and REQ... Read in angle parameters TK and TEQ...

Read in dihedral parameters PK, PN and PHASE... Read in SOLTY...

Read in L-J parameters CN1 and CN2... Read in info for bonds w/ hydrogen... Read in info for bonds w/out hydrogen... Read in info for angles w/ hydrogen... Read in info for angles w/out hydrogen... Read in info for dihedrals w/ hydrogen... Read in info for dihedrals w/out hydrogen... Read in excluded atom list...

Read in h-bond parameters: AG, BG, and HBCUT... Read in atomic symbols (types)... Read in tree information... Read in the JOIN info... Read in the IROTAT info...

Checking coordinates: polyAT_vac_md1_12Acut.mdcrd Print summary of the residues:

Amber9 Module: ptraj

DA5 DA DA DA DA DA DA DA DA DA3 DT5 DT DT DT DT DT DT DT DT DT3 Successfully completed readParm.

Process the input file and state what is to be done: PTRAJ: Processing input file...

Input is from standard input

PTRAJ: trajin polyAT_vac_md1_12Acut.mdcrd

PTRAJ: rms first mass out polyAT_vac_md1_12Acut.rms time 0.1 Mask [*] represents 638 atoms

FYI: No output trajectory specified (trajout), none will be saved.

PTRAJ: Successfully read the input file.

Coordinate processing will occur on 1000 frames. Summary of I/O and actions follows:

Print out a summary of what the input and output files are and the list of actions to be performed on every coordinate set. In this case, we only have a single input coordinate file and a single action, i.e. rms: INPUT COORDINATE FILES

File (polyAT_vac_md1_12Acut.mdcrd) is an AMBER trajectory with 1000 sets

OUTPUT COORDINATE FILE NULL entry

ACTIONS

1> RMS to first frame using mass weighting

Dumping RMSd vs. time (with time interval 0.10) to a file named polyAT_vac_md1_12Acut.rms

Atom selection follows * (All atoms are selected)

As the program runs it prints out a dot for every frame of the trajectory file that has been processed:

Processing AMBER trajectory file polyAT_vac_md1_12Acut.mdcrd Set 1 ................................................. Set 50 ................................................. Set 100 ................................................. Set 150 ................................................. Set 200 ................................................. Set 250 ................................................. Set 300 ................................................. Set 350 ................................................. Set 400 ................................................. Set 450 ................................................. Set 500 ................................................. Set 550 ................................................. Set 600 ................................................. Set 650 ................................................. Set 700 ................................................. Set 750 ................................................. Set 800 ................................................. Set 850 ................................................. Set 900 ................................................. Set 950 ................................................. Set 1000

ERROR in readAmberTrajectory(): Set #1001 is corrupted (

This is not really an error as this comment is printed out at the end of the trajectory file (1000 frames for the 12A cutoff simulation, ~229 frames for the no cutoff simulation).

PTRAJ: Successfully read in 1000 sets and processed 1000 sets. Dumping accumulated results (if any)

PTRAJ RMS: dumping RMSd vs time data

We should get two output files, one for each simulation (polyAT_vac_md1_12Acut.rms, polyAT_vac_md1_nocut.rms).

We can now plot the RMSd's vs time. Note, RMSd is in angstroms and time is in ps.

xmgrace polyAT_vac_md1_12Acut.rms polyAT_vac_md1_nocut.rms

This should give a plot similar to the following:

本文来源:https://www.bwwdw.com/article/d992.html

Top