List-Vectorize


lib/List/Vectorize.pm

=item C<sign($value)>

Returns the sign of a value: 1, 0, or -1.

=item C<sum(ARRAY_REF)>

Summation of a list of numbers.

=item C<mean(ARRAY_REF)>

Mean value of a list of numbers.

=item C<geometric_mean(ARRAY_REF)>

Geometric mean value of a list of numbers.

=item C<sd(ARRAY_REF, SCALAR)>

Standard deviation of a list of numbers. The optional second argument is the mean value.

=item C<var(ARRAY_REF, SCALAR)>

Variance of a list of numbers. The optional second argument is the mean value.

=item C<cov(ARRAY_REF, ARRAY_REF)>

Covariance of two vectors.

=item C<cor(ARRAY_REF, ARRAY_REF, SCALAR)>

Correlation coefficient of two vectors. The third argument selects the method: "pearson" (default) or "spearman".

=item C<dist(ARRAY_REF, ARRAY_REF, SCALAR)>

Distance between two vectors. Several distance definitions are provided:

  euclidean  Euclidean distance (default)
  pearson    Pearson correlation coefficient
  spearman   Spearman correlation coefficient
  logical    defined as 1/(1+k), where k is the number of positions at which both vectors are true
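The "euclidean" and "logical" definitions above can be sketched in plain Perl. This is an illustration only, not the module's implementation: the helper names C<euclidean_dist> and C<logical_dist> are hypothetical, and the equal-length check is an assumption.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical helper: Euclidean distance, the square root of the
# summed squared differences between paired elements.
sub euclidean_dist {
    my ($x, $y) = @_;
    die "vectors must have the same length\n" unless @$x == @$y;
    return sqrt(sum0(map { ($x->[$_] - $y->[$_]) ** 2 } 0 .. $#$x));
}

# Hypothetical helper: "logical" distance as described above, 1/(1+k),
# where k counts the positions at which both vectors hold a true value.
sub logical_dist {
    my ($x, $y) = @_;
    die "vectors must have the same length\n" unless @$x == @$y;
    my $k = grep { $x->[$_] && $y->[$_] } 0 .. $#$x;   # count in scalar context
    return 1 / (1 + $k);
}

printf "%.4f\n", euclidean_dist([0, 0], [3, 4]);            # 5.0000
printf "%.4f\n", logical_dist([1, 0, 1, 1], [1, 1, 0, 1]);  # 0.3333
```

With this definition the logical distance is 1 when the vectors share no true positions and shrinks toward 0 as the overlap grows.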

=item C<freq(ARRAY_REF, ARRAY_REF, ...)>

Frequency of the items in an array or arrays. Returns a hash reference. When several
arrays are given, the categorical strings from each array are separated by "|" in the keys.

  my $a = ["a", "a", "a", "a", "b", "b", "b", "b"];
  my $b = ["1", "2", "1", "2", "1", "2", "1", "2"];
  print_ref freq($a);
  print_ref freq($a, $b);
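To illustrate how "|"-separated keys arise when several arrays are passed, here is a minimal plain-Perl sketch. The helper C<freq_sketch> is hypothetical and only mimics the documented behaviour; it assumes all arrays have the same length.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): count how often each
# combination of categories occurs across parallel arrays.
sub freq_sketch {
    my @arrays = @_;
    my %count;
    for my $i (0 .. $#{ $arrays[0] }) {
        # Join the i-th item of every array with "|" to form the key.
        my $key = join "|", map { $_->[$i] } @arrays;
        $count{$key}++;
    }
    return \%count;
}

my $letters = ["a", "a", "a", "a", "b", "b", "b", "b"];
my $digits  = ["1", "2", "1", "2", "1", "2", "1", "2"];

my $single = freq_sketch($letters);
print "$_ => $single->{$_}\n" for sort keys %$single;   # a => 4, b => 4

my $joint = freq_sketch($letters, $digits);
print "$_ => $joint->{$_}\n" for sort keys %$joint;     # a|1, a|2, b|1, b|2 => 2 each
```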

=item C<table(ARRAY_REF, ARRAY_REF, ...)>

The same as C<freq>, provided for consistency with R.

=item C<scale(ARRAY_REF, SCALAR)>

Scale the vector based on some criterion.

  zvalue        vector has a mean of 0 and a variance of 1 (default)
                formula: (x-mean)/sd
  percentage    values in the vector lie between 0 and 1
                formula: (x-min)/(max-min)
  sphere        projects the n-dimensional point onto the surface of the unit hypersphere
                formula: x/radius
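The three formulas above can be written out directly in plain Perl. This is a sketch under stated assumptions rather than the module's code: the helper names are hypothetical, C<sd> is taken to be the sample standard deviation (n-1 denominator), and C<radius> is taken to be the Euclidean norm of the vector.

```perl
use strict;
use warnings;
use List::Util qw(sum min max);

# zvalue: (x - mean) / sd.
# Assumption: sd is the sample standard deviation (n - 1 denominator).
sub scale_zvalue {
    my ($x) = @_;
    my $n    = @$x;
    my $mean = sum(@$x) / $n;
    my $sd   = sqrt(sum(map { ($_ - $mean) ** 2 } @$x) / ($n - 1));
    return [ map { ($_ - $mean) / $sd } @$x ];
}

# percentage: (x - min) / (max - min).
sub scale_percentage {
    my ($x) = @_;
    my ($lo, $hi) = (min(@$x), max(@$x));
    return [ map { ($_ - $lo) / ($hi - $lo) } @$x ];
}

# sphere: x / radius.
# Assumption: radius is the Euclidean norm of the vector.
sub scale_sphere {
    my ($x) = @_;
    my $radius = sqrt(sum(map { $_ ** 2 } @$x));
    return [ map { $_ / $radius } @$x ];
}

print join(" ", @{ scale_zvalue([1, 2, 3]) }), "\n";      # -1 0 1
print join(" ", @{ scale_percentage([2, 4, 6]) }), "\n";  # 0 0.5 1
print join(" ", @{ scale_sphere([3, 4]) }), "\n";         # 0.6 0.8
```

The sketch does not guard against degenerate input: a constant vector gives a zero denominator in the first two helpers, and an all-zero vector does so in the third.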

=item C<sample(ARRAY_REF, SCALAR (size), HASH)>

Random sampling and permutation. The third argument is a set of options:

  p         probability weight for each item; values are scaled into [0, 1]
  replace   whether to sample with replacement (1 or 0)

  my $x = ["a".."g"];
  # sample without replacement
  sample($x, 5);
  # permutation
  sample($x, len($x));
  # sample with replacement
  sample($x, 5, "replace" => 1);
  # sample with unequal probability
  # normalization of the p-values will be done automatically
  sample($x, 5, "p" => [10, 1, 1, 1, 1, 1, 1]);

=item C<rnorm(SCALAR (size), SCALAR (mean), SCALAR (sd))>

Generate random numbers from a normal distribution. The default mean is 0 and the default
standard deviation is 1.

  my $x = rnorm(10);
  $x = rnorm(10, 1, 2);

=item C<rbinom(SCALAR (size), SCALAR (p-value for success))>

Generate random numbers from a binomial distribution. The probability of success is 0.5 by default.

  my $x = rbinom(10, 0.1);

=item C<max(ARRAY_REF)>

Maximum value in a vector.

=item C<min(ARRAY_REF)>

Minimum value in a vector.

=item C<which_max(ARRAY_REF)>

Find the index of the maximum value in the array. If the maximum occurs more than once,
only the index of the first occurrence is returned.

=item C<which_min(ARRAY_REF)>

Find the index of the minimum value in the array. If the minimum occurs more than once,
only the index of the first occurrence is returned.

=item C<median(ARRAY_REF)>

Median value in a vector.

=item C<quantile(ARRAY_REF, ARRAY_REF | SCALAR_REF )>

Quantiles of a vector. The second argument can be a single percentage or a list of percentages stored in an array reference.
The return value has the same type as the second argument. If the second argument is not
specified, it defaults to [0, 0.25, 0.5, 0.75, 1].
