<html><head><link rev="made" href="mailto:january@bioinformatics.org">
</head>
<body bgcolor="#FFFFFF" link="#FF0000">

<hr>

<h2>GP</h2>
<h2>2000</h2>


    
<h2>NAME</h2>
    gp_acc - Compute the auto-cross correlation values for a sequence
<p><h2>SYNOPSIS</h2>
    
    <strong>gp_acc</strong> [-e] [-l value] [-p value] [-q] [-v] [-d] [-h] [inputfile] [outputfile]
<p><h2>OPTIONS</h2>
    
<p><dl>
<p><p></p><dt><strong>-e</strong><dd> Only encode the sequence; do not compute the ACC.
<p><p></p><dt><strong>-l value</strong><dd> Precede the output with a header containing the descriptions
		of the variables, assuming a sequence length of <strong>value</strong>.
<p><p></p><dt><strong>-p value</strong><dd> Set the maximal lag to <strong>value</strong>. If this option is not used, the
		program sets the maximal lag to 1/3 of the length of the current sequence.
<p><p></p><dt><strong>-v</strong><dd> Prints the version information.
<p><p></p><dt><strong>-d</strong><dd> Prints lots of debugging information.
<p><p></p><dt><strong>-h</strong><dd> Shows usage information.
<p><p></p><dt><strong>inputfile</strong><dd> File to process; if not given, standard input is used.
<p><p></p><dt><strong>outputfile</strong><dd> File to write the data to; if not given,
    standard output is used.
<p></dl>
<p><h2>DESCRIPTION</h2>
    
<p>Note: currently only DNA/RNA sequences are supported.
<p>Auto-cross correlation (ACC) is a way of converting a sequence into a set of
	variables that carry information useful for, for example, statistical
	analysis. In ACC, the sequence is first encoded into numerical values. Currently,
	this encoding assigns each nucleotide three values (-1,-1,1 for A,
	1,-1,-1 for C, -1,1,-1 for G, and 1,1,1 for T/U), one for each of the three
	so-called descriptor variables. Next, the sequence is aligned with itself with a lag
	(shift) equal to one, and covariance coefficients are computed for each pair
	of the three descriptor variables, nine coefficients in total. Then the lag
	is increased by one and the procedure is repeated until the lag reaches the
	maximal lag value (1/3 of the sequence length by default, or an optional value
	chosen by the user).
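<p>The procedure described above can be sketched in a few lines of Python. This is
	only an illustration, not the code used by <strong>gp_acc</strong>: the function names
	<strong>encode</strong> and <strong>acc</strong> are hypothetical, and the mean-centring and the
	division by the number of aligned positions are assumptions about how the
	covariance coefficients are normalised. The lag runs from 1 to maximal lag - 1,
	which matches the number of output values stated in this section.

```python
# Encoding table taken from the description: three descriptor values per nucleotide.
ENCODING = {
    "A": (-1, -1, 1),
    "C": (1, -1, -1),
    "G": (-1, 1, -1),
    "T": (1, 1, 1),
    "U": (1, 1, 1),
}
N_DESCRIPTORS = 3


def encode(sequence):
    """Turn a DNA/RNA string into a list of descriptor triples."""
    return [ENCODING[base] for base in sequence.upper()]


def acc(sequence, max_lag=None):
    """Return one row of lagged covariance coefficients (illustrative only)."""
    z = encode(sequence)
    n = len(z)
    if max_lag is None:
        max_lag = n // 3  # default described in the text: 1/3 of the length
    if max_lag > n:
        raise ValueError("maximal lag must not exceed the sequence length")

    # Assumption: each descriptor is mean-centred over the whole sequence.
    means = [sum(row[j] for row in z) / n for j in range(N_DESCRIPTORS)]
    centred = [[row[j] - means[j] for j in range(N_DESCRIPTORS)] for row in z]

    values = []
    for lag in range(1, max_lag):  # lags 1 .. max_lag - 1
        pairs = n - lag  # number of positions that overlap at this lag
        for j in range(N_DESCRIPTORS):
            for k in range(N_DESCRIPTORS):
                total = sum(
                    centred[i][j] * centred[i + lag][k] for i in range(pairs)
                )
                # Assumption: normalise by the number of overlapping positions.
                values.append(total / pairs)
    return values
```

<p>Calling <strong>acc</strong> on a 60-nucleotide sequence without an explicit maximal lag
	uses a lag of 20 and returns one row of coefficients, nine for each lag.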
<p>For each sequence processed, a row of data is produced, containing <strong>(maximal
	lag - 1) * number of descriptor variables * number of descriptor variables</strong> values.
	For example, for a nucleotide sequence with three descriptor variables (as
	described above) and a maximal lag of 20, you will get 19 * 3 * 3 = 171 values.
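<p>As a quick check of the formula above, the row length can be computed directly.
	The helper name <strong>acc_row_length</strong> is hypothetical and is not part of Genpak.

```python
def acc_row_length(max_lag, n_descriptors=3):
    """Number of values per output row: (max_lag - 1) * d * d."""
    return (max_lag - 1) * n_descriptors * n_descriptors


# Three descriptor variables and a maximal lag of 20, as in the example.
print(acc_row_length(20))
```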
<p>This probably won't make much sense to you unless you are familiar with such
	terms as PCA and PLS. You can find more information on this subject in S. Wold
	and M. Sj&ouml;str&ouml;m, 1998, "Chemometrics, present and future success",
	Chemometrics and Intelligent Laboratory Systems 44:3-14, and, by the same
	authors, 1985, "A multivariate study of the relationship between the genetic
	code and the physical-chemical properties of amino-acids", J. Mol. Evol.
	22:272-7.
<p><h2>SEE ALSO</h2>
    
<a href="index.html">Genpak(1)</a> 
<a href="gp_adjust.html">gp_adjust(1)</a> 
<a href="gp_cdndev.html">gp_cdndev(1)</a> 
<a href="gp_cusage.html">gp_cusage(1)</a> 
<a href="gp_digest.html">gp_digest(1)</a> 
<a href="gp_dimer.html">gp_dimer(1)</a> 
<a href="gp_findorf.html">gp_findorf(1)</a> 
<a href="gp_gc.html">gp_gc(1)</a> 
<a href="gp_getseq.html">gp_getseq(1)</a> 
<a href="gp_map.html">gp_map(1)</a> 
<a href="gp_matrix.html">gp_matrix(1)</a> 
<a href="gp_mkmtx.html">gp_mkmtx(1)</a> 
<a href="gp_pattern.html">gp_pattern(1)</a> 
<a href="gp_primer.html">gp_primer(1)</a> 
<a href="gp_qs.html">gp_qs(1)</a> 
<a href="gp_randseq.html">gp_randseq(1)</a> 
<a href="gp_seq2prot.html">gp_seq2prot(1)</a> 
<a href="gp_slen.html">gp_slen(1)</a> 
<a href="gp_tm.html">gp_tm(1)</a> 
<a href="gp_trimer.html">gp_trimer(1)</a> 
<p><h2>DIAGNOSTICS</h2>
    
<p>All <strong>Genpak</strong> programs complain in situations where you would also complain,
such as when they cannot find a sequence you gave them or the sequence is not
valid. 
<p>The <strong>Genpak</strong> programs do not write over existing files. I have found this
feature very useful :-)
<p><h2>BUGS</h2>
    
<p>I'm sure there are plenty left, so please mail me if you find any. I tried
to clean up every bug I could find.
<p><h2>AUTHOR</h2>
    
<p>January Weiner III
		<a href="mailto:january@bioinformatics.org">&lt;january@bioinformatics.org&gt;</a>    
</body>
</html>