\chapter{Preface}

\label{chap:preface} 

This manual is about the \pxm\ mass spectrometric software suite, a
computing framework that aims at predicting/analyzing mass
spectrometric data on (bio)polymers. As such, this manual is intended
for people willing to learn how to install and use this multi-modular
software suite.

Mass spectrometry has gained popularity over the past five years or
so. Indeed, developments in polymer mass spectrometry have made this
technique suitable for accurately measuring the masses of polymers as
heavy as many hundreds of kDa. 

There are a number of utilities (sold by mass spectrometer
manufacturers with their machines, usually as a marketing ``plus'')
that allow predicting/analyzing mass spectrometric data of
polymers. These programs usually differ from one manufacturer to
another. Also, there are as many mass spectrometric data
prediction/analysis computer programs as there are different polymer
types: one program for oligonucleotides, another for proteins, maybe
yet another for saccharides, and so on. Thus, the biochemist or mass
spectrometrist who happens to work on different biopolymer types has
to learn to use a number of different software packages. Also, a user
who does not own a mass spectrometer will most likely need to buy all
these software packages. 

The \pxm\ mass spectrometric computing framework is designed to
provide \emph{free} solutions to all these problems. And it does this
by:
\begin{itemize} 
\item Allowing \textit{ex nihilo} polymer chemistry definitions (in
  the \pxd\ module that sits in the \pxm\ program);
\item Allowing simple yet powerful mass computations to be made in a
  desktop mass calculator that is both polymer chemistry
  definition-aware and fully programmable (that's the \pxc\ module,
  also sitting in the \pxm\ program);
\item Allowing highly sophisticated editing of polymer sequences on a
  polymer chemistry definition-specific basis, along with chemical
  reaction simulations, finely configured mass spectrometric
  computations\dots\ (all taking place in the \pxe\ module, which is
  the main module of the \pxm\ program);
\item Allowing customization of the way each monomer shows up
  graphically during program operation (in the \pxe\ module);
\item Allowing polymer sequence editing with immediate visualization
  of the mass changes elicited by the editing activity (in the
  \pxe\ module);
\item Allowing an unlimited number of polymer sequences, of any
  polymer chemistry definition type, to be open at any given time (in
  the \pxe\ module).
\end{itemize} 

This manual will progressively introduce all these functionalities in
a timely and clear fashion.

\renewcommand{\sectitle}{\OSname{UNIX} and \LinuX\ Histories}
\section*{\sectitle}
\addcontentsline{toc}{section}{\numberline{}\sectitle}

Thanks to the \acronym{GNU} Free Documentation License, I borrowed
(and cosmetically modified) the material in this section from a
remarkable document by David A. Wheeler: \textit{Secure Programming
  for \LinuX\ and \OSname{UNIX} HOWTO}.\footnote{Get this paper and
  others at \url{http://www.dwheeler.com}.} I think that it is
important to provide some background on the choice of a development
platform when the time comes to document the software that one has
taken so much time to code\dots

\subsection*{\OSname{UNIX}} 

In 1969--1970, Kenneth Thompson, Dennis Ritchie, and others at
\corpname{AT\&T Bell Labs} began developing a small operating system
on a little-used \OSname{PDP-7}. The operating system was soon
christened \OSname{UNIX}, a pun on an earlier operating system project
called \OSname{MULTICS}. In 1972--1973 the system was rewritten in the
programming language C, an unusual step that was visionary: due to
this decision, \OSname{UNIX} was the first widely-used operating
system that could switch from and outlive its original hardware. Other
innovations were added to \OSname{UNIX} as well, in part due to
synergies between \corpname{Bell Labs} and the academic community. In
1979, the ``seventh edition'' (V7) version of \OSname{UNIX} was
released, the grandfather of all extant \OSname{UNIX} systems.

After this point, the history of \OSname{UNIX} becomes somewhat
convoluted. The academic community, led by Berkeley, developed a
variant called the Berkeley Software Distribution (\OSname{BSD}),
while \corpname{AT\&T} continued developing \OSname{UNIX} under the
names ``\OSname{System III}'' and later ``\OSname{System V}''. From the
late 1980s through the early 1990s the ``wars'' between these two major
strains raged. After many years each variant adopted many of the key
features of the other. Commercially, \OSname{System V} won the
``standards wars'' (getting most of its interfaces into the formal
standards), and most hardware vendors switched to \corpname{AT\&T}'s
\OSname{System V}. However, \OSname{System V} ended up incorporating
many \OSname{BSD} innovations, so the resulting system was more a
merger of the two branches. The \OSname{BSD} branch did not die, but
instead became widely used for research, for PC hardware, and for
single-purpose servers (e.g., many web sites use a \OSname{BSD}
derivative). 

The result was many different versions of \OSname{UNIX}, all based on
the original seventh edition. Most versions of \OSname{UNIX} were
proprietary and maintained by their respective hardware vendors; for
example, \corpname{Sun} \OSname{Solaris} is a variant of
\OSname{System V}. Three versions of the \OSname{BSD} branch of
\OSname{UNIX} ended up as open source: \OSname{FreeBSD} (concentrating
on ease-of-installation for PC-type hardware), \OSname{NetBSD}
(concentrating on many different CPU architectures), and a variant of
\OSname{NetBSD}, \OSname{OpenBSD} (concentrating on security). More
general information about \OSname{UNIX} history can be found at
\url{http://www.levenez.com/unix/}.

\subsection*{Free Software Foundation} 

In 1984 Richard Stallman's \corpname{Free Software Foundation}
(\acronym{FSF}) began the \acronym{GNU} project, a project to create a
free version of the \OSname{UNIX} operating system. By free, Stallman
meant software that could be freely used, read, modified, and
redistributed. The \acronym{FSF} successfully built a vast number of
useful components, including the \software{GNU compiler collection}
(\software{gcc}), an impressive text editor (\software{GNU Emacs}),
and a host of fundamental tools. However, in the 1990s the
\acronym{FSF} was having trouble developing the operating system
kernel; without a kernel the rest of their software would not work.

\subsection*{\LinuX} 

In 1991 Linus Torvalds began developing an operating system kernel,
which he named ``Linux''. This kernel could be combined with the
\acronym{FSF} material and other components (in particular some of the
\OSname{BSD} components and Massachusetts Institute of Technology's
(\acronym{MIT}) \software{X Window} software) to produce a
freely-modifiable and very useful operating system. This book calls
the kernel itself the ``Linux'' kernel and an entire combination
``\LinuX''.

In the \LinuX\ community, different organizations have combined the
available components differently. Each combination is called a
``distribution'', and the organizations that develop distributions are
called ``distributors''. Common distributions include \corpname{Red
  Hat}, \corpname{Mandrake}, \corpname{SuSE} and
\corpname{Debian}. There are differences between the various
distributions, but all distributions are based on the same foundation:
the Linux kernel and the \acronym{GNU} \libname{glibc}
libraries. Since both are covered by ``copyleft'' style licenses,
changes to these foundations generally must be made available to all,
a unifying force between the \LinuX\ distributions at their foundation
that does not exist between the \OSname{BSD} and
\corpname{AT\&T}-derived \OSname{UNIX} systems. 

\subsection*{Open Source vs Free Software} 

Increased interest in software that is freely shared has made it
increasingly necessary to define and explain it. A widely used term is
``open source software''. Eric Raymond wrote several seminal articles
examining its various development processes. Another widely-used term
is ``free software'', where the ``free'' is short for ``freedom'': the
usual explanation is ``free speech, not free beer''. Neither phrase is
perfect. The term ``free software'' is often confused with programs
whose executables are given away at no charge, but whose source code
cannot be viewed, modified, or redistributed. Conversely, the term
``open source'' is sometimes (ab)used to mean software whose source
code is visible, but for which there are limitations on use,
modification, or redistribution. This book uses the term ``open
source'' for its usual meaning, that is, software which has its source
code freely available for use, viewing, modification, and
redistribution; a more detailed definition is contained in the Open
Source Definition. Information on the definition of free software, and
the motivations behind it, can be found at \url{http://www.fsf.org}.

Those interested in reading advocacy pieces for open source software
and free software should see \url{http://www.opensource.org} and
\url{http://www.fsf.org}. Other documents on the Internet
examine such software; for example, some authors have found that
open source software was noticeably more reliable than proprietary
software (using their measurement technique, which measured resistance
to crashing due to random input). 

\renewcommand{\sectitle}{Typographical conventions}
\section*{\sectitle}
\addcontentsline{toc}{section}{\numberline{}\sectitle}


Throughout the book the following typographical
conventions are used:
\begin{itemize} 
\item \emph{emphasized text} {\footnotesize is used each time a new
    term or concept is introduced} 
\item \prompt\ {\footnotesize shows the prompt at which a command
    should be entered as non-root} 
\item \promptsu\ {\footnotesize shows the prompt at which a command
    should be entered as root} 
\item \command{this typography} {\footnotesize applies to commands
    that the user enters at the shell prompt, along with any options} 
\item \kbdEnterKey\ {\footnotesize symbolizes pressing the
    \emph{Enter} key} 
\item \verb#this typography# {\footnotesize applies to the output
    resulting from entering a command at the shell prompt} 
\item \progname{emacs} or \libname{libglib} {\footnotesize are names of a
    program or of a library} 
\item \software{GNOME}, \software{The Gimp} {\footnotesize are names
    of software packages (not of specific executable files)} 
\item \filename{/usr/local/share}, \filename{/usr/bin/polyxmass}
  {\footnotesize are names of a directory or of a file} 
\item \url{http://www.gnu.org} {\footnotesize is a URL (Uniform
    Resource Locator)} 
\end{itemize}

\renewcommand{\sectitle}{Program Availability, Technicalities}
\section*{\sectitle}
\addcontentsline{toc}{section}{\numberline{}\sectitle}

\label{sect:prog-availability-technicalities}

\pxm\ was initially developed on a \LinuX\ system
(\corpname{RedHat} distribution, successively versions 6.0, 7.0, 7.2,
7.3, 8.0 and 9.0) using software from the \corpname{Free Software
  Foundation} (\acronym{FSF}\footnote{For in-depth coverage of the
  philosophy behind the \acronym{FSF}, specifically creating a
  \emph{free operating system}, you may wish to visit
  \url{http://www.gnu.org}.}). Since mid-2002, development has been
performed on a \DebianGNUL\ system (\url{http://www.debian.org}), which I
find to be the most configurable and easy-to-use distribution on
earth.

Developing for \LinuX\ has been utterly exciting and extremely
efficient. My warm thanks go to all the people who have committed
their energy and time to \emph{Free Software/true Open Source}
by coding, documenting, reviewing\dots\ software. The development was
mainly centered around the following programs and utilities:

\begin{itemize}
\item \acronym{GNU} software is central to my development system: 
  \begin{itemize}
  \item \software{GNU Emacs}, a text editor that is an environment
  \textit{per se};
  \item \software{Autotools}, an integrated set of programs to make
    software development easy and portable. Includes
    \software{Autoconf}, \software{Automake} and others\dots\\ 
    (\url{http://www.gnu.org}, home of the \emph{Free Software
    Movement}); 
  \item \software{GDK/GTK+}, two libraries for windowing in the X
    Window graphic environment\\ 
    (\url{http://www.gtk.org});
  \item \software{The Gimp}, a wonderful program for doing graphical
    illustrations in pixel mode (raster images). Think of it as an
    excellent free replacement for the \software{Photoshop}
    program. The ``icons'' representing each single monomer in the
    sequence editor were made using \software{The Gimp}. It saves in
    \fileformat{xpm}, \fileformat{png}, \fileformat{jpg} and many
    other graphic formats\\ 
    (\url{http://www.gimp.org}); 
  \item \software{GNOME}, a graphical environment for the \LinuX\
    desktop. I used the \software{GNOME} canvas widget to tailor the
    sequence editor\\ 
    (\url{http://www.gnome.org}); 
  \end{itemize}

\item Thomas Esser has made a \TeX/\LaTeX\ environment of exceptional
  quality. I used it every day, and typeset this manual using it. Of
  course, Prof. Donald Knuth is the grand-daddy of all this, having
  invented \TeX, and Leslie Lamport is the father of \LaTeX!\\
  (\url{http://www.tug.org}; search for \software{teTeX});
\item \software{Glade} is a wonderful graphical interface builder (by
  Damon Chaplin) that I used to design the graphical interface of the
  program. I used it in conjunction with the \libname{libglade}
  library (by James Henstridge)\\
  (\url{http://glade.gnome.org} and\\
  \url{http://www.daa.com.au/~james/software/libglade});
\item \corpname{RedHat} is undoubtedly committed to the success of the
  \emph{Free Software Movement} and happens to be the maker of a
  popular (my) \LinuX\ distribution\\ 
  (\url{http://www.redhat.com}); 
\item Bernhard Herzog has written a vector drawing package that I used
  for some illustrations in the \pxm\ package. It is called
  \software{Sketch}\\ 
  (\url{http://sketch.sourceforge.net}); 
\item Lauris Kaplinski and co-workers have crafted a very powerful
  program to create and handle scalable vector graphics. This program is
  called \software{Sodipodi}\\
  (\url{http://sodipodi.sourceforge.net});
\item Owen Taylor has written a memory profiling tool that I used
  (during the \corpname{RedHat} \LinuX-based development) to
  detect memory leaks. It is called \software{memprof}\\
  (\url{otaylor{@}redhat.com}, remove the curly brackets);
\item Of course, I am forgetting many software packages that I used for
  this work. Thanks to their authors and to their maintainers: without
  their hard work my \LinuX\ box would not exist!
\end{itemize}
    
\renewcommand{\sectitle}{Organization Of This Manual}
\section*{\sectitle}
\addcontentsline{toc}{section}{\numberline{}\sectitle}

After briefly explaining the general procedure for
installing each of the modules that make up the \pxm\ software suite,
this manual aims at providing the conceptual toolset required for
understanding what to expect from a software project like
\pxm. Thus, the general organization of this book is:
\begin{itemize} 
\item Installation of the \pxm\ modules; 
\item The basics of polymer chemistry;
\item The basics of mass spectrometry;
\item Generalities about the \pxm\ software;
\item The \pxd\ chapter (definition of atoms and of new polymer
  chemistries);
\item The \pxc\ chapter (polymer chemistry-aware programmable
  calculator);
\item The \pxe\ chapter (sequence editor, biochemical/mass
  spectrometric simulations);
\item The \pxmcommon\ chapter describing the fundamental
  configuration/data files that are required to run the \pxm\ software;
\item The \pxmdata\ chapter describing the \pxm' complex chemical
  configuration hierarchy;
\item Appendices.
\end{itemize} 

\renewcommand{\sectitle}{\pxm' Licensing Philosophy}
\section*{\sectitle}
\addcontentsline{toc}{section}{\numberline{}\sectitle}

The front matter of this manual contains a Copyright
statement. I wish to retain the copyright to \pxm\ and all related
writings (source and configuration files, programmer's documentation,
user manual\dots). However, I do not deny others the right to make
copies of the work, to redistribute it freely, to modify it according
to the \acronym{GNU} General Public License for the \pxm\ computer
program, and according to the \acronym{GNU} Free Documentation
License.

The aim of this licensing is to favor the spread of knowledge to the
widest possible public. Also, it encourages interested
hackers\footnote{\emph{Hacker} is a specialized term designating a
  programmer who writes programs; this term should \emph{not} be
  confused with \emph{cracker}, a person who uses computer science
  knowledge to break the security barriers of information systems.}
to change the code, to improve it and to send patches to the author so
that their improvements get into the program to the benefit of the
widest possible public. For an in-depth study of the \emph{free
  software} philosophy I kindly urge the reader to visit
\url{http://www.gnu.org/philosophy}.
 
\renewcommand{\sectitle}{Contacting The Author}
\section*{\sectitle}
\addcontentsline{toc}{section}{\numberline{}\sectitle}

The \pxm\ program is the fruit of months of work on my part. While I've
put a lot of energy into making this program as stable and reliable a
piece of software as possible, \pxm\ comes with no warranty of any
kind. I hope that \pxm\ will help numerous researchers with their mass
spectrometric data prediction/analysis work, which will hopefully ease
the creation of \emph{scientific knowledge}.

The general policy for directing questions, comments, feature
requests, and bug reports about the \pxm\ program or its documentation
should be self-explanatory from the addresses below:
\begin {center}
  \includegraphics [scale=0.5]
  {figures/raster/polyxmass-all-mail-adresses-300x130.png}
\end {center}

\vfill
\clearpage

\noindent To direct any comment(s) to the author through snail mail,
use the following address:

\begin{center}
\noindent D$\mathrm{^r}$~Filippo \textsc{Rusconi}\\
\vspace{2mm}%
Chargé de recherches au CNRS\\
\textsc{Centre national de} \\
\textsc{la Recherche scientifique} \vspace{2mm} \\
UMR CNRS 5153 - UR INSERM 565 - USM MNHN 0503\\
Muséum national d'Histoire naturelle\\
43, rue Cuvier \\
F-75231 Paris \textsc{Cedex} 05 \\
France
\end{center}

\cleardoublepage