=item sub_mat_string SUB_MAT

Return a string containing the S<substitution / score> matrix
held in SUB_MAT.

=item sub_mat_read (FILENAME)

Read the substitution / score matrix from the file FILENAME and
return it.

=item sub_mat_shift SUBST_MATRIX, BOTTOM

Given a substitution matrix (an object of type Sub_mat), shift
the whole matrix so that its smallest (most negative) value is
equal to BOTTOM.  This does not return anything. It acts on the
SUBST_MATRIX argument directly.
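
The arithmetic of the shift can be sketched in plain Perl on an
ordinary array of arrays. This is only an illustration; the helper
C<shift_matrix> is hypothetical and uses neither the Sub_mat type
nor Wurst itself.

  use strict;
  use warnings;
  use List::Util qw(min);

  # Hypothetical helper: shift every element so the minimum equals
  # $bottom. Works in place and returns nothing, like sub_mat_shift.
  sub shift_matrix {
      my ($mat, $bottom) = @_;
      my $delta = $bottom - min (map { @$_ } @$mat);
      for my $row (@$mat) {
          $_ += $delta for @$row;
      }
      return;
  }

  my @m = ([4, -2], [-2, 5]);
  shift_matrix (\@m, 0);
  # @m is now ([6, 0], [0, 7])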

=item sub_mat_scale SUBST_MATRIX, BOTTOM, TOP

Given a substitution matrix, SUBST_MATRIX, scale and shift it so
that its minimum and maximum values are BOTTOM and TOP respectively.
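
One way to read "scale and shift" is as a linear map that sends the
smallest element to BOTTOM and the largest to TOP. The sketch below
assumes that interpretation. The helper C<scale_matrix> is
hypothetical, works on a plain array of arrays, and is not the Wurst
implementation.

  use strict;
  use warnings;
  use List::Util qw(min max);

  # Hypothetical helper: map the range [min, max] of the matrix
  # linearly onto [$bottom, $top], in place.
  sub scale_matrix {
      my ($mat, $bottom, $top) = @_;
      my @all = map { @$_ } @$mat;
      my ($lo, $hi) = (min (@all), max (@all));
      die "matrix is constant, cannot scale\n" if $hi == $lo;
      my $factor = ($top - $bottom) / ($hi - $lo);
      for my $row (@$mat) {
          $_ = $bottom + ($_ - $lo) * $factor for @$row;
      }
      return;
  }

  my @m = ([4, -2], [-2, 6]);
  scale_matrix (\@m, 0, 1);
  # @m is now ([0.75, 0], [0, 1])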

=item score_mat_sum_smpl NEW_MAT SCORE_MAT PGAP_OPEN PGAP_WIDEN QGAP_OPEN QGAP_WIDEN ALIGNMENT_TYPE

We have a score matrix which could be from sequence/sequence,
sequence/structure or whatever. Now, do the dynamic
programming work. Sum the score matrix and return a set of
pairs.

The parameters are

=over

=item NEW_MAT

This is a fresh matrix with the traced back scores in it.

=item SCORE_MAT

This is the score matrix.

=item PGAP_OPEN, QGAP_OPEN

Gap opening penalties.

=item PGAP_WIDEN, QGAP_WIDEN

Gap widening (extension) penalties.

=item ALIGNMENT_TYPE

There are only two values allowed, either

  $N_AND_W

or

  $S_AND_W

These stand for "Needleman and Wunsch" and "Smith and
Waterman" respectively.  Any other value will cause an error.

=back
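
The difference between the two alignment types shows up in how the
matrix is summed. The sketch below is a deliberately simplified
illustration in plain Perl, not the Wurst code. It assumes a single
linear gap penalty instead of separate open and widen terms, and it
returns only the best summed score rather than a set of pairs. The
name C<sum_score_mat> and its string flags are hypothetical.

  use strict;
  use warnings;
  use List::Util qw(max);

  # Hypothetical helper. $score is an array of arrays, $gap a positive
  # penalty per gap position, $type is 'n_and_w' or 's_and_w'.
  sub sum_score_mat {
      my ($score, $gap, $type) = @_;
      my $local = $type eq 's_and_w';
      my ($n, $m) = (scalar @$score, scalar @{ $score->[0] });
      my @sum;
      # Border: a global alignment pays for leading gaps, a local
      # alignment does not.
      $sum[$_][0] = $local ? 0 : -$gap * $_ for 0 .. $n;
      $sum[0][$_] = $local ? 0 : -$gap * $_ for 0 .. $m;
      my $best = 0;
      for my $i (1 .. $n) {
          for my $j (1 .. $m) {
              my $v = max ($sum[$i-1][$j-1] + $score->[$i-1][$j-1],
                           $sum[$i-1][$j] - $gap,
                           $sum[$i][$j-1] - $gap);
              $v = 0 if $local && $v < 0;   # Smith and Waterman floor
              $sum[$i][$j] = $v;
              $best = $v if $local && $v > $best;
          }
      }
      return $local ? $best : $sum[$n][$m];
  }

  my $score  = [[2, -1, -1], [-1, 2, -1], [-1, -1, -3]];
  my $global = sum_score_mat ($score, 1, 'n_and_w');   # 2
  my $local  = sum_score_mat ($score, 1, 's_and_w');   # 4
  print "global $global, local $local\n";

The local score is higher here because the poorly scoring last
position can simply be left out of the alignment.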

=item svm_rs_cdata MODEL NATIVE SCOR_SET RS_PARAM CVTYPE

B<EXPERIMENTAL!>

The function returns an array of training vectors suitable
for use in training a support vector machine (libSVM.pm) or
some other machine learning procedure. Its form is :

  [ [label_class, [(feature vector)]], .. ]

The scheme for calculating the training vectors is given
in CVTYPE, and the data is formed from the local sequence
to structure scores as given by SCOR_SET, the TANH forcefield
based pairwise interaction terms (calculated via RS_PARAM),
and the local model consistency (based on the difference of
distance matrices computed between MODEL and NATIVE).

Scheme 0 is not yet described here. For the moment, see
scoranlys.c:get_svmdata for details.
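
Since the result is a plain nested array, it can be written out in
the sparse text format read by the libsvm command line tools (one
line per vector, C<label index:value ...>, with indices starting
at 1). The sketch below uses made-up data in place of a real
svm_rs_cdata result, and C<to_libsvm_line> is a hypothetical helper.

  use strict;
  use warnings;

  # Hypothetical helper: turn one [label, [features]] entry into a
  # line of libsvm text format. Zero-valued features are omitted.
  sub to_libsvm_line {
      my ($entry) = @_;
      my ($label, $feat) = @$entry;
      my @terms = map  { ($_ + 1) . ':' . $feat->[$_] }
                  grep { $feat->[$_] != 0 } 0 .. $#$feat;
      return join (' ', $label, @terms);
  }

  # Made-up training set in the documented form.
  my @train = ([1, [0.5, 0, -1.25]], [-1, [0, 2, 0.75]]);
  my @lines = map { to_libsvm_line ($_) } @train;
  print "$_\n" for @lines;
  # prints
  #   1 1:0.5 3:-1.25
  #   -1 2:2 3:0.75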

=item svm_rsfeat MODEL SCOR_SET RS_PARAMS CVTYPE

This returns a set of feature vectors for each position in
MODEL, calculated from local sequence-structure fitness and
residue-specific interaction terms according to the CVTYPE
scheme (see svm_rs_cdata or scoranlys.c for details).  The
form is as follows :

  my @m_fvset = svm_rsfeat (MODEL, SCOR_SET, RS_PARAMS, 0);
  # @m_fvset is of the form
  #   [ (undef), [feature vector], .., .., (undef) ]
  # and (scalar @m_fvset) == coord_size (MODEL)

undefs are given for positions in the model where a full
feature vector cannot be computed (at the ends, for instance).
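
Because undefined entries mark positions without a feature vector,
code that walks the result should test each element before
dereferencing it. A minimal sketch, with made-up data standing in
for the svm_rsfeat return value:

  use strict;
  use warnings;

  # Made-up stand-in for the return value of svm_rsfeat.
  my @m_fvset = (undef, [0.1, 0.7], [0.3, 0.2], undef);

  # Collect the positions that carry a usable feature vector.
  my @usable = grep { defined $m_fvset[$_] } 0 .. $#m_fvset;
  for my $pos (@usable) {
      print "position $pos: @{ $m_fvset[$pos] }\n";
  }
  # prints
  #   position 1: 0.1 0.7
  #   position 2: 0.3 0.2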


=back

=head1 BUILD AND INSTALL

Wurst has mostly been migrated to the standard perl build system.
This means you should go to the top level directory and try

   perl Makefile.PL
   make
   make install

This would try to install into the system directories, so a
better tactic might be

   perl Makefile.PL PREFIX=~/myperl LIB=~/myperl/lib
   make

This usually builds without problem. Check with scripts from the
F<t> directory. Then install into your own account with

  make install

Your scripts would then have to start with lines like

   use lib "$ENV{HOME}/myperl/lib";

If this looks OK, you might

  cd scripts
  perl hello.pl

This will check if Wurst pieces appear to be in place. If that
looks OK, then edit F<wurst/src/Wurst/Makefile.PL> to set the
installation destination.  Then

  make install

from the top level directory. Then go back to the scripts
directory and try a different file like

  cd scripts
  perl hello2.pl

=head1 FILE FORMATS

=over

=item PHD secondary structure files