AI-TensorFlow-Libtensorflow

 view release on metacpan or  search on metacpan

lib/AI/TensorFlow/Libtensorflow/Manual/Notebook/InferenceUsingTFHubEnformerGeneExprPredModel.pod  view on Meta::CPAN

# PODNAME: AI::TensorFlow::Libtensorflow::Manual::Notebook::InferenceUsingTFHubEnformerGeneExprPredModel


## DO NOT EDIT. Generated from notebook/InferenceUsingTFHubEnformerGeneExprPredModel.ipynb using ./maint/process-notebook.pl.

use strict;
use warnings;
use utf8;
use constant IN_IPERL => !! $ENV{PERL_IPERL_RUNNING};
no if IN_IPERL, warnings => 'redefine'; # fewer messages when re-running cells

use feature qw(say);
use Syntax::Construct qw( // );

use lib::projectroot qw(lib);

BEGIN {
    if( IN_IPERL ) {
        $ENV{TF_CPP_MIN_LOG_LEVEL} = 3;
    }
    require AI::TensorFlow::Libtensorflow;
}

use URI ();
use HTTP::Tiny ();
use Path::Tiny qw(path);

use File::Which ();

use List::Util ();

use Data::Printer ( output => 'stderr', return_value => 'void', filters => ['PDL'] );
use Data::Printer::Filter::PDL ();
use Text::Table::Tiny qw(generate_table);

my $s = AI::TensorFlow::Libtensorflow::Status->New;
sub AssertOK {
    die "Status $_[0]: " . $_[0]->Message
        unless $_[0]->GetCode == AI::TensorFlow::Libtensorflow::Status::OK;
    return;
}
AssertOK($s);

use PDL;
use AI::TensorFlow::Libtensorflow::DataType qw(FLOAT);

use FFI::Platypus::Memory qw(memcpy);
use FFI::Platypus::Buffer qw(scalar_to_pointer);

sub FloatPDLTOTFTensor {
    my ($p) = @_;
    return AI::TensorFlow::Libtensorflow::Tensor->New(
        FLOAT, [ reverse $p->dims ], $p->get_dataref, sub { undef $p }
    );
}

sub FloatTFTensorToPDL {
    my ($t) = @_;

    my $pdl = zeros(float,reverse( map $t->Dim($_), 0..$t->NumDims-1 ) );

    memcpy scalar_to_pointer( ${$pdl->get_dataref} ),
        scalar_to_pointer( ${$t->Data} ),
        $t->ByteSize;
    $pdl->upd_data;

    $pdl;
}

lib/AI/TensorFlow/Libtensorflow/Manual/Notebook/InferenceUsingTFHubEnformerGeneExprPredModel.pod  view on Meta::CPAN

=pod

=encoding UTF-8

=head1 NAME

AI::TensorFlow::Libtensorflow::Manual::Notebook::InferenceUsingTFHubEnformerGeneExprPredModel - Using TensorFlow to do gene expression prediction using a pre-trained model

=head1 SYNOPSIS

The following tutorial is based on the L<Enformer usage notebook|https://github.com/deepmind/deepmind-research/blob/master/enformer/enformer-usage.ipynb>. It uses a pre-trained model based on a transformer architecture trained as described in Avsec e...

Running the code requires an Internet connection to download the model (from Google servers) and datasets (from GitHub, UCSC, and NIH).

Some of this code is identical to that of C<InferenceUsingTFHubMobileNetV2Model> notebook. Please look there for explanation for that code. As stated there, this will later be wrapped up into a high-level library to hide the details behind an API.

B<NOTE>: If running this model, please be aware that

=over

=item *

the Docker image takes 3 GB or more of disk space;

=item *

the model and data takes 5 GB or more of disk space.

=back

meaning that you will need a total of B<8 GB> of disk space. You may need at least B<4 GB> of free memory to run the model.

=head1 COLOPHON

The following document is either a POD file which can additionally be run as a Perl script or a Jupyter Notebook which can be run in L<IPerl|https://p3rl.org/Devel::IPerl> (viewable online at L<nbviewer|https://nbviewer.org/github/EntropyOrg/perl-AI-...

You will also need the executables C<gunzip>, C<bgzip>, and C<samtools>. Furthermore,

=over

=item *

C<Bio::DB::HTS> requires C<libhts> and

=item *

C<PDL::Graphics::Gnuplot> requires C<gnuplot>.

=back

If you are running the code, you may optionally install the L<C<tensorflow> Python package|https://www.tensorflow.org/install/pip> in order to access the C<saved_model_cli> command, but this is only used for informational purposes.

=head1 TUTORIAL

=head2 Load the library

First, we need to load the C<AI::TensorFlow::Libtensorflow> library and more helpers. We then create an C<AI::TensorFlow::Libtensorflow::Status> object and helper function to make sure that the calls to the C<libtensorflow> C library are working prop...

  use strict;
  use warnings;
  use utf8;
  use constant IN_IPERL => !! $ENV{PERL_IPERL_RUNNING};
  no if IN_IPERL, warnings => 'redefine'; # fewer messages when re-running cells
  
  use feature qw(say);
  use Syntax::Construct qw( // );
  
  use lib::projectroot qw(lib);
  
  BEGIN {
      if( IN_IPERL ) {
          $ENV{TF_CPP_MIN_LOG_LEVEL} = 3;
      }
      require AI::TensorFlow::Libtensorflow;
  }
  
  use URI ();
  use HTTP::Tiny ();
  use Path::Tiny qw(path);
  
  use File::Which ();
  
  use List::Util ();
  
  use Data::Printer ( output => 'stderr', return_value => 'void', filters => ['PDL'] );
  use Data::Printer::Filter::PDL ();
  use Text::Table::Tiny qw(generate_table);
  
  my $s = AI::TensorFlow::Libtensorflow::Status->New;
  sub AssertOK {
      die "Status $_[0]: " . $_[0]->Message
          unless $_[0]->GetCode == AI::TensorFlow::Libtensorflow::Status::OK;
      return;
  }
  AssertOK($s);

And create helpers for converting between C<PDL> ndarrays and C<TFTensor> ndarrays.

  use PDL;
  use AI::TensorFlow::Libtensorflow::DataType qw(FLOAT);
  
  use FFI::Platypus::Memory qw(memcpy);
  use FFI::Platypus::Buffer qw(scalar_to_pointer);
  
  sub FloatPDLTOTFTensor {
      my ($p) = @_;
      return AI::TensorFlow::Libtensorflow::Tensor->New(
          FLOAT, [ reverse $p->dims ], $p->get_dataref, sub { undef $p }
      );
  }
  
  sub FloatTFTensorToPDL {
      my ($t) = @_;
  
      my $pdl = zeros(float,reverse( map $t->Dim($_), 0..$t->NumDims-1 ) );
  
      memcpy scalar_to_pointer( ${$pdl->get_dataref} ),
          scalar_to_pointer( ${$t->Data} ),
          $t->ByteSize;
      $pdl->upd_data;
  

lib/AI/TensorFlow/Libtensorflow/Manual/Notebook/InferenceUsingTFHubEnformerGeneExprPredModel.pod  view on Meta::CPAN

  
  COMMENT
  
  undef;

=head1 RESOURCE USAGE

  use Filesys::DiskUsage qw/du/;
  
  my $total = du( { 'human-readable' => 1, dereference => 1 },
      $model_archive_path, $model_base, $new_model_base,
  
      $targets_path,
  
      $hg_gz_path,
      $hg_bgz_path, $hg_bgz_fai_path,
  
      $clinvar_path,
  
      $plot_output_path,
  );
  
  say "Disk space usage: $total"; undef;

B<STREAM (STDOUT)>:

  Disk space usage: 4.66G

=head1 CPANFILE

  requires 'AI::TensorFlow::Libtensorflow';
  requires 'AI::TensorFlow::Libtensorflow::DataType';
  requires 'Archive::Extract';
  requires 'Bio::DB::HTS::Faidx';
  requires 'Bio::Location::Simple';
  requires 'Bio::Tools::Run::Samtools';
  requires 'Data::Frame';
  requires 'Data::Printer';
  requires 'Data::Printer::Filter::PDL';
  requires 'Devel::Timer';
  requires 'Digest::file';
  requires 'FFI::Platypus::Buffer';
  requires 'FFI::Platypus::Memory';
  requires 'File::Which';
  requires 'Filesys::DiskUsage';
  requires 'HTTP::Tiny';
  requires 'IPC::Run';
  requires 'List::Util';
  requires 'PDL';
  requires 'PDL::Graphics::Gnuplot';
  requires 'Path::Tiny';
  requires 'Syntax::Construct';
  requires 'Text::Table::Tiny';
  requires 'URI';
  requires 'constant';
  requires 'feature';
  requires 'lib::projectroot';
  requires 'overload';
  requires 'parent';
  requires 'strict';
  requires 'utf8';
  requires 'warnings';

=head1 AUTHOR

Zakariyya Mughal <zmughal@cpan.org>

=head1 COPYRIGHT AND LICENSE

This software is Copyright (c) 2022-2023 by Auto-Parallel Technologies, Inc.

This is free software, licensed under:

  The Apache License, Version 2.0, January 2004

=cut



( run in 0.983 second using v1.01-cache-2.11-cpan-39bf76dae61 )