Bio-KBase
lib/Bio/KBase/CDMI/CDMI_APIImpl.pm



=head2 fids_to_co_occurring_fids

  $return = $obj->fids_to_co_occurring_fids($fids)

=over 4

=item Parameter and return types

=begin html

<pre>
$fids is a fids
$return is a reference to a hash where the key is a fid and the value is a scored_fids
fids is a reference to a list where each element is a fid
fid is a string
scored_fids is a reference to a list where each element is a scored_fid
scored_fid is a reference to a list containing 2 items:
	0: a fid
	1: a float

</pre>

=end html

=begin text

$fids is a fids
$return is a reference to a hash where the key is a fid and the value is a scored_fids
fids is a reference to a list where each element is a fid
fid is a string
scored_fids is a reference to a list where each element is a scored_fid
scored_fid is a reference to a list containing 2 items:
	0: a fid
	1: a float


=end text
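
As an illustration of the types listed above, the following sketch builds a value with the same shape as C<$return> and walks it. The fids and scores are invented placeholders, not data from the CS, and C<scored_pairs_as_lines> is a hypothetical helper that is not part of this API.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like $return: fid => scored_fids.
# The fids and scores are invented for illustration only.
my $return = {
    'fid.A' => [ [ 'fid.B', 12 ], [ 'fid.C', 3 ] ],
    'fid.D' => [ [ 'fid.E', 7 ] ],
};

# Hypothetical helper: flatten the hash into tab-separated lines of
# "query fid, co-occurring fid, score".
sub scored_pairs_as_lines {
    my ($result) = @_;
    my @lines;
    for my $fid (sort keys %$result) {
        for my $scored_fid (@{ $result->{$fid} }) {
            my ($other, $score) = @$scored_fid;    # 0: a fid, 1: a float
            push @lines, join("\t", $fid, $other, $score);
        }
    }
    return @lines;
}

print "$_\n" for scored_pairs_as_lines($return);
```

Each scored_fid is unpacked by position, matching the two-item list described in the type listing.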



=item Description

One of the most powerful clues to function relates to conserved clusters of genes on
the chromosome (in prokaryotic genomes).  We have attempted to record pairs of genes
that tend to occur close to one another on the chromosome.  To do this meaningfully,
we need to construct similarity-based mappings between genes in distinct genomes.
We have constructed such mappings for many (but not all) genomes maintained in the
KBase CS.  The prokaryotic genomes in the CS are grouped into OTUs by ribosomal
RNA (genomes within a single OTU have SSU rRNA that is greater than 97% identical).
If two genes occur close to one another, and their corresponding genes in other
genomes also occur close to one another, then we assign the pair a score, which is
the number of distinct OTUs in which such clustering is detected.  Counting OTUs
rather than genomes normalizes for situations in which hundreds of corresponding
gene pairs are detected, but they all come from very closely related genomes.

A fid for which no pairs are recorded does not appear as a key in the returned hash.

The significance of a score depends on the number of genomes in the database.
We recommend that you take the time to look at a set of scored pairs and determine
approximately what percentage appear to be actually related at a few cutoff values.
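
The advice about trying a few cutoff values could be explored with a sketch like the one below. It assumes a result hash with the shape documented for C<$return>. The helper C<count_pairs_at_cutoffs> is hypothetical (not part of this API), and the sample fids and scores are invented. It only counts how many pairs survive each cutoff. Deciding whether the surviving pairs are actually related still requires inspecting them.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the CDMI API): for each cutoff, count
# the scored pairs whose score is at least that cutoff.
sub count_pairs_at_cutoffs {
    my ($result, @cutoffs) = @_;
    my %count = map { $_ => 0 } @cutoffs;
    for my $scored_fids (values %$result) {
        for my $scored_fid (@$scored_fids) {
            my $score = $scored_fid->[1];    # item 1 of a scored_fid is the score
            for my $cutoff (@cutoffs) {
                $count{$cutoff}++ if $score >= $cutoff;
            }
        }
    }
    return \%count;
}

# Invented sample data with the documented shape: fid => scored_fids.
my $sample = {
    'fid.A' => [ [ 'fid.B', 12 ], [ 'fid.C', 3 ] ],
    'fid.D' => [ [ 'fid.E', 7 ] ],
};

my $counts = count_pairs_at_cutoffs($sample, 1, 5, 10);
for my $cutoff (sort { $a <=> $b } keys %$counts) {
    printf "score >= %d: %d pair(s)\n", $cutoff, $counts->{$cutoff};
}
```

In real use, the sample hash would be replaced by the value returned from C<fids_to_co_occurring_fids>, and the cutoffs chosen to suit the number of genomes in the database.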

=back

=cut

sub fids_to_co_occurring_fids
{
    my $self = shift;
    my($fids) = @_;

    my @_bad_arguments;
    (ref($fids) eq 'ARRAY') or push(@_bad_arguments, "Invalid type for argument \"fids\" (value was \"$fids\")");
    if (@_bad_arguments) {
	my $msg = "Invalid arguments passed to fids_to_co_occurring_fids:\n" . join("", map { "\t$_\n" } @_bad_arguments);
	Bio::KBase::Exceptions::ArgumentValidationError->throw(error => $msg,
							       method_name => 'fids_to_co_occurring_fids');
    }

    my $ctx = $Bio::KBase::CDMI::Service::CallContext;
    my($return);
    #BEGIN fids_to_co_occurring_fids
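    my $kb = $self->{db};
    $return = {};
    for my $id (@$fids) {
        # Fetch every pairing that involves this fid, together with the
        # score of the pair set the pairing belongs to.
        my @resultRows = $kb->GetAll("IsInPair Pairing Determines PairSet",
                                     "IsInPair(from_link) = ?", [$id],
                                     [qw(IsInPair(to_link) PairSet(score))]);

        # Fids with no pairings are left out of the result hash.
        if (@resultRows != 0) {
            my @scoredFids;
            for my $resultRow (@resultRows) {
                my ($pair, $score) = @$resultRow;
                # The pairing ID is split on ':'; keep the component that
                # is not the query fid.
                my ($fid) = grep { $_ ne $id } split /:/, $pair;
                push @scoredFids, [$fid, $score];
            }
            $return->{$id} = \@scoredFids;
        }
    }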

    #END fids_to_co_occurring_fids
    my @_bad_returns;
    (ref($return) eq 'HASH') or push(@_bad_returns, "Invalid type for return variable \"return\" (value was \"$return\")");
    if (@_bad_returns) {
	my $msg = "Invalid returns passed to fids_to_co_occurring_fids:\n" . join("", map { "\t$_\n" } @_bad_returns);
	Bio::KBase::Exceptions::ArgumentValidationError->throw(error => $msg,
							       method_name => 'fids_to_co_occurring_fids');
    }
    return($return);
}




=head2 fids_to_locations

  $return = $obj->fids_to_locations($fids)

=over 4