diff --git a/README.md b/README.md index 2ed3cc0..5c89e5e 100644 --- a/README.md +++ b/README.md @@ -15,13 +15,13 @@ Contents field for systems containing molecules, ions or extended materials. It requires the [Packmol](http://www.ime.unicamp.br/~martinez/packmol/) software - to create coordinates. The output are files in formats suitable - for the [LAMMPS](http://lammps.sandia.gov/), + to generate coordinates in the box. The output are files in formats + suitable for the [LAMMPS](http://lammps.sandia.gov/), [DL_POLY](http://www.stfc.ac.uk/CSE/randd/ccg/software/DL_POLY/25526.aspx) or [GROMACS](http://www.gromacs.org) molecular dynamics packages. -* `tools/`: several utility scripts. +* `tools/`: utility scripts. * `examples/`: examples of molecule files and force field databases. @@ -41,7 +41,7 @@ Requirements Obtaining --------- -Download the files or else clone the repository (easier to stay updated): +Download the files or clone the repository: git clone https://github.com/agiliopadua/fftool.git @@ -52,198 +52,186 @@ Tutorial These are instructions on how to build an initial configuration for a system composed of molecules, ions or materials. -1. For each molecule, ion or fragment of a material prepare a file - with atomic coordinates and/or connectivity (covalent bonds). The - formats accepted by this tool are `.zmat`, `.mol`, `.pdb` or `.xyz`. - - A `.zmat` file has the molecule name in the first line, followed - by one empy line, then the z-matrix. See the `examples` directory - and the Wikipedia entry for "Z-matrix (chemistry)". Variables can - be used in place of distances, angles and dihedrals. Connectivity - is inferred from the z-matrix by default. In this case cyclic - molecules require additional `connect` records to close - rings. Improper dihedrals can be indicated by `improper` - records. If a `reconnect` record is present, then connectivity - will be guessed based on bond distances from the force - field. After the z-matrix and the informations above, the name of a - file with force field parameters can be supplied. - - The MDL Molfile `.mol` file format contains a table with coordinates and - also bonds. The name of a file with force field parameters can be given in - the first line after the molecule name or in the third line. If the keyword - `reconnect` is present after the force field filename, then connectivity - will be deduced based on bond distances from the force field. - - The PDB file format `.pdb` is a standard for atomic coordinates of - molecules, widely used for proteins. The name of a file with force field - parameters can be given on a `COMPND` record after the molecule name. - Connectivity is deduced from the bond lengths in the force field (`CONNECT` - records are not read). - - The XYZ file format `.xyz` contains atomic coordinates only. The name of a - file with force field parameters can be given in the second line after the - molecule name and in this case connectivity is deduced from the bond lengths - in the force field. - - There are many free tools to create MDL mol files, xyz files or - z-matrices, which are common formats in computational chemistry - ([Open Babel](http://openbabel.org/), - [Avogadro](http://avogadro.cc/)). Manual editing of the files is - usually necessary in order to match the atom names with those of - the force field. - -2. Use the `fftool` script to create `.xyz` files with atomic - coordinates for the components of your system, plus an input file - for `packmol`. For help type `fftool -h`. For example, to build - a simulation box with 40 ethanol and 300 water molecules and a - density of 38.0 mol/L do: +1. For each molecule, ion or fragment of a material prepare a file with atomic + coordinates and eventually connectivity (covalent bonds). The formats + accepted by this tool are `.zmat`, `.mol`, `.pdb` or `.xyz`, which are + common formats in computational chemistry. + + A `.zmat` file has the molecule name in the first line, followed by one + empy line, then the z-matrix. See the `examples` directory and the + Wikipedia entry for "Z-matrix (chemistry)". Variables can be used in place + of distances, angles and dihedrals. Connectivity is inferred from the + z-matrix by default. In this case cyclic molecules require additional + `connect` records to close rings. Improper dihedrals can be indicated by + `improper` records. If a `reconnect` record is present, then connectivity + will be guessed based on bond distances from the force field (see + below). After the z-matrix and the informations above, the name of a file + with force field parameters can be supplied. + + The MDL Molfile `.mol` file format contains a table with coordinates and + also bonds. The name of a file with force field parameters can be given in + the first line after the molecule name or in the third line. If the keyword + `reconnect` is present after the force field filename, then connectivity + will be deduced based on bond distances from the force field. + + The PDB file format `.pdb` is widely used for proteins. The name of a file + with force field parameters can be given on a `COMPND` record after the + molecule name. Connectivity is deduced from the bond lengths in the force + field (`CONNECT` records are not read). + + The XYZ file format `.xyz` contains atomic coordinates only. The name of a + file with force field parameters can be given in the second line after the + molecule name and in this case connectivity is deduced from the bond + lengths in the force field. + + There are many tools ([Open Babel](http://openbabel.org/), + [Avogadro](http://avogadro.cc/), [VESTA](http://jp-minerals.org/vesta/en/)) + to create MDL mol files, xyz files or z-matrices. Manual editing of the + files is usually necessary in order to match the atom names with those of + the force field. + +2. Use the `fftool` script to create an input file for `packmol`, which will + use new `_pack.xyz` files with atomic coordinates for the components of + your system. For help type `fftool -h`. For example, to build a simulation + box with 40 ethanol and 300 water molecules and a density of 38.0 mol/L do: fftool 40 ethanol.zmat 300 spce.zmat -r 38.0 - Alternatively the side length of the the simulation box (cubic) in - angstroms can be supplied: + Alternatively the side length of the the simulation box (here cubic) can + be supplied in angstroms: fftool 40 ethanol.zmat 300 spce.zmat -b 20.0 3. Use `packmol` with the `pack.inp` file just created to generate the - atomic coordinates in the simulation box (adjust the density/box - size if necessary): + atomic coordinates in the simulation box: packmol < pack.inp - For more complex spatial arrangements of molecules and materials - you can modify the `pack.inp` file to suit your needs (see the - [Packmol](http://www.ime.unicamp.br/~martinez/packmol/) - documentation). Atomic coordinates for the full system will be - written to `simbox.xyz`. + Difficult convergence may indicate that density is too high, so adjust + density or box size if necessary. For more complex spatial arrangements of + molecules and materials you can modify the `pack.inp` to suit your needs + (see the [Packmol](http://www.ime.unicamp.br/~martinez/packmol/) + documentation). Atomic coordinates for the full system will be written to + `simbox.xyz`. -4. Use `fftool` to build the input files for LAMMPS (-l), DL_POLY (-d) - or GROMACS (-g) containing the force field parameters - and the coordinates of all the atoms (from `simbox.xyz`): +4. Use `fftool` to build the input files for LAMMPS (-l), DL_POLY (-d) or + GROMACS (-g) containing the force field parameters and the coordinates of + all the atoms (from `simbox.xyz`): fftool 40 ethanol.zmat 300 spce.zmat -r 38.0 -l - If no force field information was given explicitly in the molecule - files, a default LJ potential with parameters zeroed will be - assigned to atoms. No terms for bonds, angles or torsions will be - created. This is suitable when working with non-additive, - bond-order or other potentials often used for materials. The input - files for MD simulations will have to be edited manually to - include an interaction potential for the material. + If no force field information was given explicitly in the molecule files, + a default LJ potential with parameters zeroed will be assigned to + atoms. No terms for bonds, angles or torsions will be created. This is + suitable when working with non-additive, bond-order or other potentials + often used for materials. The input files for MD simulations will have to + be edited manually to include an interaction potential for the material. Deducing Bonds and Angles ------------------------- -When inferring connectivity from interatomic distances, a tolerance -of 0.25 angstrom is used to compare distances in the coordinates file -with equilibrium distances specified for bonds in the force field and -decide if a bond should be present or not. So, the bond lengths in the -conformation used as input must be sufficiently close to those in the -force field specification for those bonds to be included in the -potential energy fonction of the system. - -Angles will be assigned to groups of three atoms i-j-k, with i-j and -j-k bonded, if the value of the angle in the conformation used as -input is within +/-15 degrees of the equilibrium angle in the force -field specification. If not, even if the atoms i-j-k are bonded, their -angle will not be present in the final potential energy function, -although topologically the angle is there. When running `fftool` to -create a force field file (with `-l` or `-d` option) a warning message -will show which such topological angles have been "removed" because -they deviate too much from the equilibrium angles in the force -field. This removal of angles avoids problems with atoms that have -more than four ligands, such as S or P atoms with five or six -ligands. Around these centers there are topological angles of 180 -degrees to which no potential energy of bending is attributed in force -fields. For example, in the octahedral PF6- anion there are two -different values of F-P-F angles: twelve 90 degree angles between -adjacent F atoms, and three 180 degree angles between opposite F -atoms; only the twelve 90 degree angles contribute with a harmonic -potential energy function in the OPLS-AA force field. - -The tolerances for bond distances and angle values, 0.25 angstrom and -15 degrees, respectively, were chosen based on heuristic -judgement. They can be set by editing the `fftool` source, namely the -global variables `BondTol` and `AngleTol`. Use with care because -spurious bonds and angles may be created if the tolerances are too -wide. +When inferring connectivity from interatomic distances, distances in the +coordinates file are compared with equilibrium distances specified for bonds +in the force field and a tolerance of +/-0.25 angstrom is used to decide if a +bond should be present or not. So, the bond lengths in the conformation used +as input must be sufficiently close to those in the force field specification +for those bonds to be included in the potential energy fonction of the system. + +Angles will be assigned to groups of three atoms i-j-k, with i-j and j-k +bonded, if the value of the angle in the conformation used as input is within ++/-15 degrees of the equilibrium angle in the force field specification. If +not, even if the atoms i-j-k are bonded, their angle will not be present in +the final potential energy function, although topologically the angle is +there. When running `fftool` to create a force field file (with `-l`, `-d` or +` -g` option) a warning message will show which such topological angles have +been "removed" because they deviate too much from the equilibrium angles in +the force field. This removal of angles avoids problems with atoms that have +more than four ligands, such as S or P atoms with five or six ligands. Around +these centers there are topological angles of 180 degrees to which no +potential energy of bending is attributed in force fields. For example, in the +octahedral PF6- anion there are two different values of F-P-F angles: twelve +90 degree angles between adjacent F atoms, and three 180 degree angles between +opposite F atoms; only the twelve 90 degree angles contribute with a harmonic +potential energy function in most force fields. + +The tolerances for bond distances and angle values, 0.25 angstrom and 15 +degrees, respectively, were chosen based on judgement. They can be set by +editing the `fftool` source, namely the global variables `BondTol` and +`AngleTol`. Use with care because spurious bonds and angles may be created if +the tolerances are too large. Improper Dihedrals ------------------ -Improper dihedrals are often used to increase the rigidity of planar -atoms (sp2) and differ from proper dihedrals in how they are -defined. A proper dihedral i-j-k-l is defined between bonded atoms -i-j, j-k, and k-l and corresponds to torsion around bond j-k, the -dihedral being the angle between planes i-j-k and j-k-l. An improper -dihedral i-j-k-l is defined between bonded atoms i-k, j-k and k-l, -therefore k is a central atom bonded to the three others. This central -atom of the improper dihedral is assumed to be the third in the -list. Often in the force field the same functional form is used both -for proper and improper torsions. +Improper dihedrals are often used to increase the rigidity of planar atoms +(sp2) and differ from proper dihedrals in how they are defined. A proper +dihedral i-j-k-l is defined between bonded atoms i-j, j-k, and k-l and +corresponds to torsion around bond j-k, the dihedral being the angle between +planes i-j-k and j-k-l. An improper dihedral i-j-k-l is defined between bonded +atoms i-k, j-k and k-l, therefore k is a central atom bonded to the other +three. The central atom of the improper dihedral is assumed to be the third in +the list. Often in force fields the same potential energy function is used +both for proper and improper torsions. If `improper` records are supplied in a molecule file (in `.zmat` format) then those improper dihedrals are assumed by `fftool`. Otherwise, the script will -search for candidate improper dihedrals on all atoms with three bonds, with any -of `.zmat`, `.mol`, `.pdb` or `.xyz` input formats. A number of warning messages -will be printed if there are atoms with three bonds, which can be ignored if the -atoms in question are not centers of improper torsions. The number and order of -the atoms in the true improper dihedrals should be verified in the files -created. +search for candidate improper dihedrals on all atoms with three bonds, with +any of `.zmat`, `.mol`, `.pdb` or `.xyz` input formats. A number of warning +messages will be printed if there are atoms with three bonds, which can be +ignored if the atoms in question are not centers of improper torsions. The +number and order of the atoms in the true improper dihedrals should be +checked in the files created. Periodic Boundary Conditions ---------------------------- -In molecular systems the initial configuration will generaly not -contain molecules crossing boundaries of the simulation box. A -buffer distance of 1.5 A is reserved at the box boundaries to -avoid overlap of molecules from periodic images in the initial -configuration, as explained in the `packmol` documentation (this empty -space is added only for orthogonal boxes). So the user should allow -for this empy volume when supplying the size of the box. - -For simulations with extended materials it is possible to create -chemical bonds across boundaries. The option `-p` allows specification -of periodic conditions along x, y, z or combinations thereof. It is -important in this case to supply dimensions for the simulation box -using the option `-b l` for a cubic box, or `-b lx,ly,lz` for a -general orthogonal box, or `-b a,b,c,alpha,beta,gamma` for a general -parallelepipedic (triclinic) box. An energy minimization step prior to -the start of the MD simulation is highly recommended because no extra -space is left near the boundaries and certain molecules may overlap -with those of neighboring images. - -The coordinates of the atoms of the material have to be supplied in -`.xyz` format and prepared carefully so that distances across periodic -boundaries are within the tolerance to identify bonds. The number of -bonds in the output files created should be verified. - -It is important that only the material for which bonds are to be -established across boudaries is supplied in `.xyz` format. The initial -files for other molecules in the system should be in `.zmat` or `.mol` -formats, which contain connectivity information. This is to avoid -spurious bonds between atoms of the molecular species that happen to -be positioned too close to boudaries. - -The `pack.inp` file will likely need manual editing in order to -position the atoms of the material precisely. - +In molecular systems the initial configuration will generaly not contain +molecules crossing boundaries of the simulation box. A buffer distance of 1.5 +angstrom is reserved at the box boundaries to avoid overlap of molecules from +periodic images in the initial configuration, as explained in the `packmol` +documentation (this empty space is added by `fftool` only for orthogonal +boxes). So the user should allow for this empty volume when supplying the size +of the box. + +For simulations with extended materials it is possible to create chemical +bonds across boundaries. The option `-p` allows specification of periodic +conditions along x, y, z or combinations thereof. It is important in this case +to supply dimensions for the simulation box using the option `-b ` for a +cubic box, or `-b ` for a general orthogonal box, or `-b +` for a general parallelepiped (triclinic box). An +energy minimization step prior to the start of the MD simulation is highly +recommended because no extra space is left near the boundaries and certain +molecules may overlap with those of neighboring images. + +The coordinates of the atoms of the material have to be supplied in `.xyz` +format and prepared carefully so that distances across periodic boundaries are +within the tolerance to identify bonds. The number of bonds in the output +files created should be checked. + +It is important that only the material for which bonds are to be established +across boudaries is supplied in `.xyz` format. The initial files for other +molecules in the system should be in `.zmat` or `.mol` formats, which contain +connectivity information. This is to avoid spurious bonds between atoms of the +molecular species that happen to be positioned too close to boundaries. + +The `pack.inp` file will likely need manual editing in order to position the +atoms of the material precisely. Force Field File Format ----------------------- -The `fftool` script reads a database of molecular force field terms -in the format described below. See the `examples` directory. +The `fftool` script reads a database of molecular force field terms in the +format described below. See the `examples` directory. Blank lines and lines starting with `#` are ignored. -There are five sections, with headings `ATOMS`, `BONDS`, `ANGLES`, -`DIHEDRALS` and `IMPROPER`. Under each section heading, registers -concerning the different types of term in the force field are given. +There are five sections, with headings `ATOMS`, `BONDS`, `ANGLES`, `DIHEDRALS` +and `IMPROPER`. Under each section heading, registers concerning the different +types of term in the force field are given. `ATOMS` records describe, for each type of atom: - the non-bonded atom type used for intermolecular interactions (these