A Starter's tutorial
In this tutorial we will build a hydrogen molecule atom by atom, load another hydrogen molecule from file. We will recognize the bond structure and dissect the system into two separate molecules. We will tesselate one molecule. And eventually, we will do a correlation analysis. Note that this tutorial will not go into the details of the code, it will just explain what to do and what happens using molecuilder on the command line.
We assume throughout this tutorial that your main folder is molecuilder, that you have created a subfolder called build wherein the compilation took place and the executable's path is then molecuilder/build/src/molecuilder. We assume that you have entered the build directory before starting this tutorial, hence molecuilder can be launched by entering ./src/molecuilder.
It is helpful to visually inspect the outputs of molecuilder at each step, http://jmol.sourceforge.net/ Jmol or http://www.ks.uiuc.edu/Research/vmd/ vmd may be helpful for this purpose.
Building hydrogen
Execute:
./src/molecuilder -i test.xyz -o xyz -a 6 --position "0, 0, 0"
This will add an "oxygen" atom (atomic number is actually 8 but we correct the error further below) at (0,0,0). Note that we have called the input file test, which will also be the root of all output files (besides the suffix). We create a xyz output file. It is a very common format with many molecule viewers such as listed exemplarily above. Furtheron, we don't have to specify xyz anymore, as test.xyz that we parse and add stuff on top is of xyz format and will get updated automatically.
Instead of -a we could have written --add-atom as well. Almost all actions have this kind of short-form, options such as --position however never have a short-form. This makes them distinguishable from the actions.
We add the two remaining hydrogen atoms:
./src/molecuilder -i test.xyz -a 1 --position "0.758602, 0., 0.504284" ./src/molecuilder -i test.xyz -a 1 --position "0.758602, 0., -0.504284"
As of now, "equal" actions cannot be stacked but this will come very soon. Different actions can however be sequenced on the command line. This is also the reason, why -o has to be placed in front of -a as otherwise the output files will be present but empty.
However, you will have noticed that the second command fails with the following error:
ERROR: Vector (0.758602,0,-0.504284) would be outside of box domain.
This is because the default simulation domain is defined by the 3x3 matrix (20,0,0,0,20,0,0,0,20). Our given position is negative in the z component, is outside of the box and thus justly admonished. Let's shift the system by 5 angstroem and try again:
./src/molecuilder -i test.xyz --select-all-atoms -t "0., 0., 5." ./src/molecuilder -i test.xyz -a 1 --position "0.758602, 0., 4.495716"
You notice that we had to select-all-atoms to tell molecuilder which atoms to translate. You can specify very selectively by id, by element, ... and also molecules as you'll see lateron.
Let's briefly fix atom 0's element, i.e. convert it from carbon to oxygen.
./src/molecuilder -i test.xyz --select-atom-by-id 0 -E 8
Here, we have again first selected the desired atom by its id, then specified the action on it. This will change the element of atom with index 0 to oxygen(8), i.e. make it an oxygen.
Let's assume we would like to change it back to carbon, but actually we don't ... let's undo :)
./src/molecuilder -i test.xyz --select-atom-by-id 0 -E 6 --undo
Here, we change the atom with index 0 to element carbon(6) but undo this action right afterwards. (Almost) every action has an Undo and a Redo (if you twist your decision once more).
Changing the simulation domain
The simulation domain has been mentioned before. It is not encoded in the xyz format. In order to add atoms beyond components values of the interval [0,20), we would have the specify a new box in front of every command. Hence, we rather switch to pcp that stores the simulation domain.
There are two possibilities to do this switching, either we parse the file test.xyz in or we specify an additional output format (-o pcp). We pick the former here:
./src/molecuilder -i test.xyz -o pcp
Note that from now on we will always add the -o xyz switch such that test.xyz is updated, too. Now, we have created a pcp config file called test.conf (.conf is the suffix of the pcp format), containing all information of test.xyz along with information specific to this type of quantum mechanical solver (plane wave).
Let's continue and define the simulation box:
./src/molecuilder -i test.conf -o xyz -B "25, 0, 25, 0, 0, 25"
Note that the BoxLength entry in test.conf has changed. Note also that the sequence of the atoms has changed. This is because we parse in a pcp file and the sequence of atoms according to the one therein, not according to the xyz structure. In pcp they are always written sorted by element. This sorting has then also been used in storing of the xyz file. So, our oxygen atom that had an index of 0 above, has now an index of 2.
Regarding the positions and our domain box, we have set the water molecule a bit right on the corner. Let's shift the molecule a bit
./src/molecuilder -i test.conf -o xyz --select-molecules-atoms 0 -t "-5, 5, 0" --periodic 1
Note that we explicitely stated here to have periodic boundary conditions active. Hence, the coordinates get wrapped around, which is exactly what happened to the x component of all three atoms. Note that if we parse in a config file, all atoms are automatically put into a single molecule, here referenced by its index 0.
A better approach for making sure all atoms are inside the domain is to use the center on edge and add empty boundary action:
./src/molecuilder -i test.conf -o xyz -O -c "5, 5, 5"
This action will always produce a rectangular domain, i.e. the extension of the water molecule with an additional boundary of 5 angstroem in every direction. Notice that the BoxLength entry has changed.
Parsing in another water molecule
Adding atoms singly is now very tedious. Often one already has an xyz description of the molecule. Let's parse in another water molecule from the file water.xyz.
./src/molecuilder -i test.conf -o xyz -l water.xyz
If we would like to place it right-away somewhere else, we would use this call:
./src/molecuilder -i test.conf -o xyz -l water.xyz --select-molecules-atoms 1 -t "15, 15, 15" --periodic 1
which would translate the newly parsed-in molecule, getting index 1, by (15,15,15), adhering to periodic boundary conditions.
Use the bond structure
Now, we have fooled around a bit with the molecular system. We should briefly start from scratch and this may also serve as a test for you, measuring how well you have paid attention so far. I.e. perform the following steps
- delete test.conf
- parse in water.xyz into a now empty test.conf (hence, don't forget to extend -o xyz to -o pcp xyz for this step)
- change the box to 25,25,25
- shift the present molecule by (5,5,5)
- parse in another while shifting it by 15,15,15 at the same time.
If we then parse in the config files again, we will not have two separate molecules, addressable by indices 0 and 1, but a single one. This is bad. Let's do a graph analysis of the system
./src/molecuilder -i test.conf -v 3 -I
Note that we have specified here a verbosity level (-v) of 3 such that one sees a bit of what's going on. Somewhere along the output, you will notice Disconnected subgraph ranges from ... appears twice. These are two water molecules that have been recognized as separated: I scanned 2 molecules.
Also, the sequence of the atoms has changed again. Their affiliation with a molecule supercedes the order-by-element of the pcp type, i.e. first all atoms of molecule 1, then of molecule 2.
Tesselate the second molecule
The last step is especially needed for the following tesselation of the second molecule only:
./src/molecuilder -i test.conf -I --select-molecule-by-id 2 -N 3. --nonconvex-file NonConvexEnvelope
This will write a http://www.tecplot.com/ TecPlot conforming file NonConvexEnvelope.dat that contains two triangles right on top of each other. Simply because the water molecule is quite flat and has exactly three atoms, i.e. nodes for the boundary graph.
Pair correlation analysis
We will end this tutorial by doing some analysis on these two water molecules, namely a pair correlation
./src/molecuilder -i test.conf -v 3 --select-all-molecules --pair-correlation --elements 1 8 --output-file output-point.csv --bin-output-file bin_output-point.csv --bin-start 0 --bin-end 20
and a surface correlation analysis.
./src/molecuilder -i test.conf -v 3 -I --select-all-molecules --surface-correlation --elements 1 --output-file output-surface.csv --bin-output-file bin_output-surface.csv --bin-start 0 --bin-width 1. --bin-end 20 --molecule-by-id 2
In the respective bin_output...csv files you find the binned distances between elements hydrogen and oxygen on the one hand and between all hydrogen atoms and the surface of the second water molecule on the other hand.
Creating a graphene block with several sheets
This part will assume that you already have a file graphene.xyz with a single graphene sheet, all atoms labeled as carbon (C). (You may produce this file by using the vmd molecular viewer extension "Nanotube builder") Furthermore we assume that you wish to create a five-layered graphene block. Each of the sheets will be made of carbon, but distinguished by an index (C1, C2, ...) In order to separate intra- and interlayer interactions.
First convert the xyz file to a tremolo data file:
??? /opt/bin/molecuilder --parse-tremolo-potentials elec.potentials -i graphene_layer.data -o xyz
Then create 5 copies of the created data file:
for i in 1 2 3 4 5; do cp graphene.data graphene_C${i}.data; done
Within each file rename the carbon atoms to C1, C2, C3, ...
for i in 1 2 3 4 5; do sed -i -e "s#\(\tC\)\t#\1${i}\t#g" graphene_C${i}.data; done
Get the boundaries of the current sheet by using this script:
~/local/bin/GetBoxBounds.pl --file graphene_C1.data --offset 3
The offset is the index of the first column containing coordinate values. Remember that the script extracts the extreme positions of the particles, so it is a good idea to add some margin for bonds between the periodic images.
Finally combine all 5 files into the final data file.
for i in 1 2 3 4 5; do dist=`units "3.4 * $i" | awk -F": " {'print $2'}`; distx=`units "1.22 * $i" | awk -F": " {'print $2'}`;/opt/bin/molecuilder --parse-tremolo-potentials elec.potentials -i graphene_layer.data -o tremolo xyz -B "61.393,0,63.612,0,0,20" -l graphene_C${i}.data --select-molecule-by-order -1 --select-molecules-atoms --translate-atoms "$distx,0,$dist" --periodic 1; done
This little script performs two computations, a shift in x and in z direction, e.g. computing the variable dist: dist=units "3.4 * $i" | awk -F": " {'print $2'}
, which later can be accessed by $dist. Furthermore, since C1, C2, etc are not known as elements, a tremolo.potentials file is needed to map those shorthands to elements for internal use in molecuilder. Outputs will be created in tremolo and xyz format (-o), periodic shifts (--periodic 1) are computed with correct boxsize (-B), data is read from input files (-i), and molecules from additional loaded files (-l), the last one is selected as one molecule (-select-molecule-by-order -1), all atoms in the molcule are selected --select-molecules-atoms, they are translated x,y,z and output into (-i).
Now the file graphene_layere.data has been created and can be used in Tremolo-X Simulations.
This ends this tutorial. I hope you have got an idea of how molecuilder is put into action and what can be done.
Attachments (1)
-
water.xyz
(154 bytes
) - added by 14 years ago.
water molecule in xyz format
Download all attachments as: .zip