<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--**************************************************************************** * * * Viewmol * * * * N O D E 3 2 . H T M L * * * * Copyright (c) Joerg-R. Hill, October 2003 * * * ******************************************************************************** *-->
<html>
<head>
<title>14 Programming Your Own Input Filter</title>
<META NAME="description" CONTENT="14 Programming Your Own Input Filter">
<META NAME="keywords" CONTENT="viewmol">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="STYLESHEET" href="viewmol.css">
<link rel="first" href="viewmol.html">
</head>
<body>
<H1><A NAME="SECTION0001400000000000000000"> 14 Programming Your Own Input Filter</A> </H1>
V<SMALL>IEWMOL</SMALL> can easily be adapted to read the output of other programs or other file formats. All you have to do is write a new input filter which extracts the data from the corresponding file. These input filters are stand-alone programs and can be written in any programming language. Examples in C and awk are included.
<P>
The input filter has to read the following data from the output file and write it to its standard output in the format described below. This format follows the file format of T<SMALL>URBOMOLE</SMALL> very closely. A few sections had to be extended to allow data which is currently not supported by T<SMALL>URBOMOLE</SMALL> (e.g. unit cells).
<UL>
<LI>the Cartesian coordinates and atom symbols (required) <BR> Write to standard output in the following format:
<dl><dd><pre class="verbatim">
$coord factor
x1 y1 z1 symbol1 xyz
x2 y2 z2 symbol2 xyz
...
</pre></dl>
<code>factor</code> is the conversion factor by which the coordinates have to be multiplied to convert them to Ångströms.
An optional combination of x, y, and z at the end of a line indicates that the corresponding atom was kept fixed in that direction during a geometry optimization. Consequently, V<SMALL>IEWMOL</SMALL> will not draw the forces acting on this atom in the fixed direction. </LI>
<LI>the title (optional) <BR> Write to standard output in the following format:
<dl><dd><pre class="verbatim">
$title
title
</pre></dl>
</LI>
<LI>the wave numbers and intensities (optional) <BR> Write to standard output in the following format:
<dl><dd><pre class="verbatim">
$vibrational spectrum
symmetry1 wavenumber1 IR-intensity1 Raman-intensity1
symmetry2 wavenumber2 IR-intensity2 Raman-intensity2
...
</pre></dl>
<code>symmetry</code> is the symmetry label of the vibrational mode, <code>wavenumber</code> is its wave number, and <code>IR-intensity</code> and <code>Raman-intensity</code> are its IR and Raman intensities, respectively. If the symmetry labels of the vibrational modes are unknown, they should be set to a default (e.g. A1). </LI>
<LI>normal coordinates (optional) <BR> Write to standard output in the following format:
<dl><dd><pre class="verbatim">
$vibrational normal modes
i1 i2 nm(1,1) nm(2,1) nm(3,1) nm(4,1) nm(5,1)
i1 i2 nm(6,1) ... nm(3*natom,1)
i1 i2 nm(1,2) nm(2,2) nm(3,2) nm(4,2) nm(5,2)
i1 i2 nm(6,2) ... nm(3*natom,2)
...
i1 i2 nm(1,nmodes) ... nm(5,nmodes)
i1 i2 nm(6,nmodes) ... nm(3*natom,nmodes)
</pre></dl>
<code>i1</code> and <code>i2</code> are integers which are skipped during reading. <code>nm(i,j)</code> are the normal mode coefficients. They have to be ordered by Cartesian coordinates (all x components of the first atom first, then all y components of the first atom, etc.).
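<P>
The layout of the <code>$vibrational normal modes</code> group above can be generated with a short helper. The following Python sketch is only an illustration and not part of V<SMALL>IEWMOL</SMALL>: the function name <code>format_normal_modes</code>, the field width of the coefficients, and the values chosen for the two skipped integers are assumptions.
<dl><dd><pre class="verbatim">
```python
def format_normal_modes(vectors, per_line=5):
    """Return the text of a $vibrational normal modes group.

    vectors is a list of coefficient sequences; sequence j holds
    nm(1,j), nm(2,j), ... in the order used by the template above.
    """
    lines = ["$vibrational normal modes"]
    for j, vector in enumerate(vectors, start=1):
        for start in range(0, len(vector), per_line):
            chunk = vector[start:start + per_line]
            # The two leading integers are skipped by the reader; as an
            # assumption they carry the sequence and line numbers here.
            row = start // per_line + 1
            # Field width and precision are assumptions, not a requirement.
            values = " ".join("%15.10f" % value for value in chunk)
            lines.append("%d %d %s" % (j, row, values))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    import sys

    # Hypothetical coefficients for a two-atom system (3*natom = 6).
    demo = [
        [0.0, 0.0, 1.0, 0.0, 0.0, -1.0],
        [0.5, 0.0, 0.0, -0.5, 0.0, 0.0],
    ]
    sys.stdout.write(format_normal_modes(demo))
```
</pre></dl>
Each inner sequence is written as one block of lines with five coefficients per line, so a filter only has to collect the coefficients in the order shown in the template and write the returned text to standard output.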
</LI>
<LI>optimization history or MD trajectory (optional) <BR> Write to standard output in the following format:
<dl><dd><pre class="verbatim">
$grad factor
cycle = nc SCF energy = E_nc |dE/dxyz| = gradnorm_nc
[unitcell a b c alpha beta gamma]
[unitcell vectors xa ya za xb yb zb xc yc zc]
x1 y1 z1 symbol1
x2 y2 z2 symbol2
...
xn yn zn symboln
gx1 gy1 gz1
gx2 gy2 gz2
...
gxn gyn gzn
cycle = nc+1 SCF energy = E_nc+1 |dE/dxyz| = gradnorm_nc+1
...
</pre></dl>
<code>factor</code> is the conversion factor by which the coordinates have to be multiplied to convert them to Ångströms. <code>nc</code> is a counter for the cycle, <code>E_nc</code> is the energy of the configuration of cycle nc, and <code>gradnorm_nc</code> is the gradient norm of cycle nc. The line starting with <code>unitcell</code> is optional and can be used to specify the current unit cell, e.g. during a constant-pressure MD run. Unit cells can be specified either by the lengths of the edges and the angles between them or by the three vectors which span the unit cell. The <code>x</code>, <code>y</code>, and <code>z</code> are the Cartesian coordinates of each atom, and <code>symbol</code> is the atomic symbol. The <code>gx</code>, <code>gy</code>, and <code>gz</code> are the gradients for each atom. This structure can be repeated for as many cycles as necessary. </LI>
<LI>MO energies and coefficients (optional) <BR> Write to standard output in the following format for closed-shell systems:
<dl><dd><pre class="verbatim">
$scfmo [symmetrized] [gaussian]
n symmetry_label_n eigenvalue=MO_E_n nsaos=norb
moc(n,1) moc(n,2) moc(n,3) moc(n,4) moc(n,5) ... moc(n,norb)
n+1 symmetry_label_n+1 eigenvalue=MO_E_n+1 nsaos=norb
...
</pre></dl>
or for open-shell systems:
<dl><dd><pre class="verbatim">
$uhfmo_alpha [symmetrized] [gaussian]
n symmetry_label_n eigenvalue=MO_E_n nsaos=norb
moc(n,1) moc(n,2) moc(n,3) moc(n,4) moc(n,5) ... moc(n,norb)
n+1 symmetry_label_n+1 eigenvalue=MO_E_n+1 nsaos=norb
...
</pre></dl>
<dl><dd><pre class="verbatim">
$uhfmo_beta [symmetrized] [gaussian]
n symmetry_label_n eigenvalue=MO_E_n nsaos=norb
moc(n,1) moc(n,2) moc(n,3) moc(n,4) moc(n,5) ... moc(n,norb)
n+1 symmetry_label_n+1 eigenvalue=MO_E_n+1 nsaos=norb
...
</pre></dl>
The string <code>symmetrized</code> is optional and tells V<SMALL>IEWMOL</SMALL> that the MO coefficients refer to symmetrized AOs rather than to plain AOs. V<SMALL>IEWMOL</SMALL> needs moloch from the T<SMALL>URBOMOLE</SMALL> package to handle symmetrized AOs. If moloch is not installed and symmetrized AOs are input, MOs and electron densities cannot be drawn. The string <code>gaussian</code> is also optional and tells V<SMALL>IEWMOL</SMALL> that the MO coefficients are normalized and ordered G<SMALL>AUSSIAN</SMALL>-style. <code>n</code> is the running index of the MO, <code>symmetry_label_n</code> is the symmetry label of MO n, <code>MO_E_n</code> is the energy of MO n, and <code>norb</code> is the total number of orbitals. The <code>moc(n,i)</code> are the MO coefficients of MO n. </LI>
<LI>basis functions and occupation numbers (optional) <BR> Write to standard output in the following format:
<dl><dd><pre class="verbatim">
$atoms
atom_symbol1 list_of_indices1 \
   basis=basis_set_name1
atom_symbol2 list_of_indices2 \
   basis=basis_set_name2
...
$basis
*
basis_set_name1
*
number_of_primitives angular_momentum
exponent1 coefficient1
exponent2 coefficient2
...
exponentn coefficientn
number_of_primitives angular_momentum
...
*
basis_set_name2
*
...
*
$closed shells
symmetry_label list_of_indices (2)
$alpha shells
symmetry_label list_of_indices (1)
$beta shells
symmetry_label list_of_indices (1)
$pople [6d/10f/15g]
</pre></dl>
<code>atom_symbol</code> is the atom symbol of an element and <code>list_of_indices</code> contains the indices of all atoms of that element according to the list of coordinates read in under <code>$coord</code>.
The list can be comma-separated and may contain hyphens to indicate ranges (e.g. c 1,3,7-10 is a valid descriptor). <code>basis_set_name</code> can be an arbitrary string describing a particular basis set. It is only used to find the corresponding basis set in the list read under <code>$basis</code>. This list simply states the name of a basis set and then lists the primitive functions which make up each contracted Gaussian, starting with the number of primitives in that contracted Gaussian and its angular momentum (s, p, d, f, ...). Then the exponents and contraction coefficients are listed line by line. This is repeated for all contracted Gaussians of that particular basis set. <code>$closed shells</code>, <code>$alpha shells</code>, and <code>$beta shells</code> tell V<SMALL>IEWMOL</SMALL> which MOs are occupied and by how many electrons. <code>symmetry_label</code> is the symmetry label for a number of MOs and <code>list_of_indices</code> is a list of integers stating which of the MOs of that particular symmetry are occupied by either one or two electron(s). This list can be comma-separated and may contain hyphens to indicate ranges of MOs. <B>Note:</B> <code>$closed shells</code>, <code>$alpha shells</code>, and <code>$beta shells</code> have to appear after <code>$scfmo</code> in the output written by the input filter. <code>$pople</code> indicates that d, f, or g functions have 6, 10, or 15 components instead of 5, 7, or 9. <B>Note:</B> This data group has to appear after <code>$coord</code> or <code>$grad</code> in the output. Otherwise V<SMALL>IEWMOL</SMALL> will fail. </LI>
<LI>grid files <BR> Write to standard output in the following form:
<dl><dd><pre class="verbatim">
$grid #n
origin x y z
vector1 x y z
vector2 x y z
vector3 x y z
grid1 start s delta d points np
grid2 start s delta d points np
grid3 start s delta d points np
type ty
title for this grid
t
plotdata
d(1,1,1) d(1,1,2) d(1,1,n) ...
d(1,2,1) ...
d(1,n,n) d(2,1,1) ... d(n,n,n)
</pre></dl>
Here <code>n</code> is an integer identifying the grid. <code>origin</code> specifies the x, y, and z coordinates of the origin of the grid. <code>vector1</code>, <code>vector2</code>, and <code>vector3</code> specify the three vectors spanning the grid. <code>grid1</code>, <code>grid2</code>, and <code>grid3</code> specify the starting point, <code>s</code>, the step size, <code>d</code>, and the number of points, <code>np</code>, along each of the three vectors spanning the grid. <code>ty</code> can be either <code>mo</code> or <code>density</code>, specifying whether the data represents a molecular orbital or a density. <code>t</code> is a string giving the grid a title, which is used in the wave function dialog to let the user select the grid. Finally, <code>d(i,j,k)</code> are the values of the property at each grid point. </LI>
<LI>the unit cell (optional) <BR> Write to standard output in one of the following forms:
<dl><dd><pre class="verbatim">
$unitcell a b c alpha beta gamma
</pre></dl>
or
<dl><dd><pre class="verbatim">
$unitcell vectors
xa ya za
xb yb zb
xc yc zc
</pre></dl>
where each row contains the components of one of the three vectors spanning the unit cell (this is also known as the Bravais matrix). </LI>
<LI>errors occurring during file processing (optional) <BR> Write to standard output in the following form:
<dl><dd><pre class="verbatim">
$error errorLabel severity additionalInformation
</pre></dl>
<code>errorLabel</code> is an arbitrary one-word label which refers to an error message in the resources. <code>severity</code> is a label for the severity of the error. Set it to 0 if the program can continue despite this error. Set it to 1 if the program must stop. <code>additionalInformation</code> is any additional information you want to be displayed in the error message (e.g. the name of a file which was not found).
Currently, the following errorLabels are in use: <code>noFile</code>, <code>notConverged</code>, <code>unsupportedVersion</code>, <code>wrongFiletype</code>, <code>noCoordinates</code>, <code>noEnergy</code>, and <code>unknownErrorMessage</code>. If your input filter wants to return an error because coordinates are missing from the input file "dummy.inp", you can have it write the following line to standard output:
<dl><dd><pre class="verbatim">
$error missingCoordinates 1 dummy.inp
</pre></dl>
Then you have to specify a resource for the error message in <BR><code>$HOME/.Xdefaults</code>:
<dl><dd><pre class="verbatim">
Viewmol.missingCoordinates: The file %s does not contain any coordinates.
</pre></dl>
With these two lines in place, any input file without coordinates will bring up the error dialog shown in <A HREF="node32.html#errorExample">Figure 17</A>. There is no need to recompile V<SMALL>IEWMOL</SMALL> to achieve this. </LI>
</UL>
The last line of the data written to standard output by the input filter must be <code>$end</code>.
<P>
<DIV ALIGN="CENTER">
<TABLE>
<A NAME="errorExample"></A>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 17:</STRONG> The error dialog produced by the sample error message</CAPTION>
<TR><TD>
<DIV ALIGN="CENTER">
<IMG WIDTH="397" HEIGHT="156" ALIGN="BOTTOM" BORDER="0" SRC="error.png" ALT="Error dialog produced by the sample error message">
</DIV></TD></TR>
</TABLE>
</DIV>
<P>
The input filter can be installed by adding a line to the <code>viewmolrc</code> file.
<P>
<p><hr>
<ADDRESS>
<a href="mailto:joehill@users.sourceforge.net"><i>Jörg-Rüdiger Hill</i></a> Fri Oct 31 14:19:21 CET 2003
</ADDRESS>
</BODY>
</HTML>