Algorithm-Cluster

	Copyright Holder, but only to the computing community at large
	as a market that must bear the fee.)

	"Freely Available" means that no fee is charged for the item
	itself, though there may be fees involved in handling the item.
	It also means that recipients of the item may redistribute it
	under the same conditions they received it.

1. You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.

2. You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain or from the Copyright Holder.  A Package
modified in such a way shall still be considered the Standard Version.

3. You may otherwise modify your copy of this Package in any way, provided
that you insert a prominent notice in each changed file stating how and
when you changed that file, and provided that you do at least ONE of the
following:

perl/Cluster.pm view on Meta::CPAN


=head1 DESCRIPTION

This module is an interface to the C Clustering Library,
a general-purpose library implementing functions for hierarchical
clustering (pairwise single, complete, average, and centroid linkage),
along with k-means and k-medians clustering, and 2D self-organizing 
maps.  This library was developed at the Human Genome Center of the
University of Tokyo. The C Clustering Library is distributed along 
with Cluster 3.0, an enhanced version of the famous 
Cluster program originally written by Michael Eisen 
while at Stanford University.

=head1 EXAMPLES

See the scripts in the examples subdirectory of the package.

=head1 CHANGES

=over 4

src/cluster.c view on Meta::CPAN

 *    correct singular values should be correct.
 *
 *
 * Questions and comments should be directed to B. S. Garbow,
 * Applied Mathematics division, Argonne National Laboratory
 *
 * Modified to eliminate machep
 *
 * Translated to C by Michiel de Hoon, Human Genome Center,
 * University of Tokyo, for inclusion in the C Clustering Library.
 * This routine is less general than the original svd routine, as
 * it focuses on the singular value decomposition as needed for
 * clustering. In particular,
 *  - We calculate both u and v in all cases
 *  - We pass the input array A via u; this array is subsequently overwritten.
 *  - We allocate the working array rv1 internally in this routine instead
 *    of taking it as an argument.
 *    If the allocation fails, svd returns -1.
 * 2003.06.05
 */
{

src/cluster.c view on Meta::CPAN


Arguments
=========

nelements    (input) int
The number of elements that were clustered.

tree         (input) Node[nelements-1]
The clustering solution. Each node in the array describes one linking event,
with tree[i].left and tree[i].right representing the elements that were joined.
The original elements are numbered 0..nelements-1, nodes are numbered
-1..-(nelements-1).

nclusters    (input) int
The number of clusters to be formed.

clusterid    (output) int[nelements]
The number of the cluster to which each element was assigned. Clusters are
numbered 0..nclusters-1 in the left-to-right order in which they appear in the
hierarchical clustering tree. Space for the clusterid array should be allocated
before calling the cuttree routine.
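
The argument description above can be illustrated with a small stand-alone
sketch. It is not the library's cuttree implementation: cut_tree_sketch,
cut_below and label_subtree are hypothetical names, the Node typedef is
repeated locally so the block compiles on its own, and the code assumes that
every child node appears earlier in the tree array than its parent (so the
last nclusters-1 nodes are the ones to cut). The 1/0 return value is this
sketch's own convention.

```c
#include <assert.h>
#include <stddef.h>

/* Same field layout as the Node struct described in cluster.h. */
typedef struct {int left; int right; double distance;} Node;

/* Give every original element below `ref` the cluster number `id`.
 * ref >= 0 is an original element; ref < 0 refers to tree[-ref-1]. */
static void label_subtree(const Node* tree, int ref, int id, int clusterid[])
{
    if (ref >= 0) { clusterid[ref] = id; return; }
    label_subtree(tree, tree[-ref-1].left, id, clusterid);
    label_subtree(tree, tree[-ref-1].right, id, clusterid);
}

/* Descend left-to-right. Nodes whose array index is >= first_cut are split;
 * anything else (a lower node or a single element) becomes one cluster. */
static void cut_below(const Node* tree, int ref, int first_cut,
                      int* next_id, int clusterid[])
{
    if (ref < 0 && -ref-1 >= first_cut) {
        cut_below(tree, tree[-ref-1].left, first_cut, next_id, clusterid);
        cut_below(tree, tree[-ref-1].right, first_cut, next_id, clusterid);
    } else {
        label_subtree(tree, ref, (*next_id)++, clusterid);
    }
}

/* Hypothetical stand-in for cuttree: returns 1 on success, 0 on bad input.
 * The caller allocates clusterid[nelements] beforehand. */
static int cut_tree_sketch(int nelements, const Node* tree, int nclusters,
                           int clusterid[])
{
    int next_id = 0;
    if (nelements < 1 || nclusters < 1 || nclusters > nelements) return 0;
    if (nelements == 1) { clusterid[0] = 0; return 1; }
    /* The root is the last node, referred to as -(nelements-1). */
    cut_below(tree, -(nelements-1), nelements - nclusters, &next_id, clusterid);
    assert(next_id == nclusters);
    return 1;
}
```

For a four-element tree whose nodes join (0,1), then (2,3), then the two
resulting nodes, asking for two clusters labels elements 0 and 1 with
cluster 0 and elements 2 and 3 with cluster 1.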

src/cluster.c view on Meta::CPAN

dist == 's': Spearman's rank correlation
dist == 'k': Kendall's tau
For other values of dist, the default (Euclidean distance) is used.

distmatrix (input) double**
The distance matrix. If the distance matrix is passed by the calling routine
treecluster, it is used by pslcluster to speed up the clustering calculation.
The pslcluster routine does not modify the contents of distmatrix, and does
not deallocate it. If distmatrix is NULL, the pairwise distances are calculated
by the pslcluster routine from the gene expression data (the data and mask
arrays) and stored in temporary arrays. If distmatrix is passed, the original
gene expression data (specified by the data and mask arguments) are not needed
and are therefore ignored.


Return value
============

A pointer to a newly allocated array of Node structs, describing the
hierarchical clustering solution consisting of nelements-1 nodes. Depending
on whether genes (rows) or samples (columns) were clustered, nelements is

src/cluster.h view on Meta::CPAN

void kmedoids(int nclusters, int nelements, double** distance,
  int npass, int clusterid[], double* error, int* ifound);

/* Chapter 4 */
typedef struct {int left; int right; double distance;} Node;
/*
 * A Node struct describes a single node in a tree created by hierarchical
 * clustering. The tree can be represented by an array of n Node structs,
 * where n is the number of elements minus one. The integers left and right
 * in each Node struct refer to the two elements or subnodes that are joined
 * in this node. The original elements are numbered 0..nelements-1, and the
 * nodes -1..-(nelements-1). For each node, distance contains the distance
 * between the two subnodes that were joined.
 */

Node* treecluster(int nrows, int ncolumns, double** data, int** mask,
  double weight[], int transpose, char dist, char method, double** distmatrix);
int sorttree(const int nnodes, Node* tree, const double order[], int indices[]);
int cuttree(int nelements, const Node* tree, int nclusters, int clusterid[]);
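
The numbering convention in the Node comment (non-negative values are
original elements, negative value -k refers to entry k-1 of the node array)
can be decoded as in the sketch below. count_elements is a hypothetical
helper, not a function exported by cluster.h, and the typedef is repeated
only so that the block compiles without the library.

```c
#include <assert.h>
#include <stddef.h>

/* Same field layout as the Node typedef above. */
typedef struct {int left; int right; double distance;} Node;

/* Hypothetical helper: count the original elements below the item `ref`.
 * ref >= 0 names an original element; ref < 0 names tree[-ref-1]. */
static int count_elements(const Node* tree, int nnodes, int ref)
{
    int j;
    if (ref >= 0) return 1;          /* an original element counts once */
    assert(tree != NULL);
    j = -ref - 1;                    /* map -1, -2, ... to index 0, 1, ... */
    assert(j < nnodes);
    return count_elements(tree, nnodes, tree[j].left)
         + count_elements(tree, nnodes, tree[j].right);
}
```

Calling count_elements(tree, nelements-1, -(nelements-1)) starts at the last
node in the array, which is the root when the tree is complete, so the result
should equal nelements.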

/* Chapter 5 */
