App-Pipeline-Simple

lib/App/Pipeline/Simple.pm

	foreach my $arg (@{$step->{arg}}) {
	    if ($arg->{key} eq 'out') {
		for ($step->each_next) {
		    push @res, "\n\t", "WARNING: Output file [".
			$arg->{value}."] is read by [",
			$outputs->{$arg->{value}}, "] and [$_]"
		    if  $outputs->{$arg->{value}};

		    $outputs->{$arg->{value}} = $_;
		}
	    }
	    elsif ($arg->{key} eq 'in' and $arg->{type} ne 'redir') {
		my $prev_step_id = $outputs->{$arg->{value}} || '';
		push @res, "\n\t". "ERROR: Output from the previous step is not [".
		    ($arg->{value} || ''). "]"
		    if $prev_step_id ne $step->id and $prev_step_id eq $self->id;
	    }
	    # test for steps not referenced by other steps (missing next tag)
	}
	push @res, "\n";
    }
    return join '', @res;
}


sub graphviz {
    my $self = shift;
    my $function = shift;

    $self->logger->info("Graphing started. Redirect to a dot file" );

    require GraphViz;
    my $g= GraphViz->new;

    my $end;
    $g->add_node($self->id,
		 label => $self->id.
		 $self->render('4display'), rank => 'top');
    $g->add_edge('s0' => $_) for $self->each_next;
    if ($self->description) {
	$g->add_node('desc', label => $self->description,
		     shape => 'box', rank => 'top');
	$g->add_edge('s0' => 'desc');
    }

    foreach my $step ($self->each_step) {
	$g->add_node($step->id, label => $step->id );
	if ($step->each_next) {
	    $g->add_edge($step->id => $_, label => " ". $step->render('display') )
		for $step->each_next;
	} else {
	    $end++;
	    $g->add_node($end, label => ' ');
	    $g->add_edge($step->id => $end, label => " ". $step->render('display') );
	}

    }
    # Log completion before returning; code after the return never runs.
    $self->logger->info("Graphing done. Process the dot ".
			 "file (e.g. dot -Tpng p.dot | display)" );

    return $g->as_dot;
}

1;

__END__

=pod

=head1 NAME

App::Pipeline::Simple - Simple workflow manager

=head1 VERSION

version 0.9.1

=head1 SYNOPSIS

  # called from a script

=head1 DESCRIPTION

Unless you want to change or extend the module, you probably do not
need to read this documentation. Runtime usage is documented in the
L<spipe> application.

Workflow management in the computational (biological) sciences is a
hard problem. This module is based on the assumption that the UNIX
pipe and redirect system is close to an optimal solution, given these
improvements:

* Enforce the storing of all intermediate steps in a file.

  This is for clarity, accountability and to enable arbitrarily big
  data sets. A pipeline can contain independent steps that remove
  intermediate files if so required.

* Naming of each step.

  This is to make it possible to stop, restart, and restart at any
  intermediate step after adjusting pipeline parameters.

* Detailed logging.

  To keep track of all runs of the pipeline.
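
The first two improvements above can be illustrated with a minimal sketch. It is not the module's implementation: the step list, its keys (C<id>, C<cmd>, C<out>) and the C<plan> function are hypothetical names chosen for illustration. Each step redirects its output to a file, and because each step has a name, execution can resume from any step.

```perl
use strict;
use warnings;

# Hypothetical step list: each step has a name, a command and an output file.
my @steps = (
    { id => 's1', cmd => 'cat input.txt',  out => 's1.txt' },
    { id => 's2', cmd => 'sort s1.txt',    out => 's2.txt' },
    { id => 's3', cmd => 'uniq -c s2.txt', out => 's3.txt' },
);

# Return the shell commands to run, optionally starting at a named step.
# With an undefined start, every step is included.
sub plan {
    my ($start, @steps) = @_;
    my @plan;
    my $running = defined $start ? 0 : 1;
    for my $step (@steps) {
        $running = 1 if defined $start and $step->{id} eq $start;
        next unless $running;
        # Every step writes to a file instead of piping to the next step
        push @plan, "$step->{cmd} > $step->{out}";
    }
    return @plan;
}

# Restart at step s2: s1.txt already exists from an earlier run
print "$_\n" for plan('s2', @steps);
```

Because every intermediate result is a file, restarting at C<s2> only requires that C<s1.txt> is still on disk.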

A pipeline is a collection of steps, each of which is functionally
equivalent to a pipeline. In other words, executing a pipeline is the
same as executing each of its steps in order. It follows that the
pipeline object model needs only one class, which can recursively
represent the whole pipeline as well as individual steps.
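
The one-class idea can be sketched as follows. This is an illustration under stated assumptions, not the module's own code: the package name C<My::Step> and its C<steps> attribute are hypothetical. A node with sub-steps runs them in order, and a node without sub-steps is a leaf that reports its own id.

```perl
use strict;
use warnings;

package My::Step;    # hypothetical class, not App::Pipeline::Simple itself

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id}, steps => $args{steps} || [] }, $class;
}

sub id { $_[0]{id} }

# Running a node runs its sub-steps in order; a leaf just reports its id.
sub run {
    my $self = shift;
    my @ran;
    if (@{ $self->{steps} }) {
        push @ran, $_->run for @{ $self->{steps} };
    } else {
        push @ran, $self->id;
    }
    return @ran;
}

package main;

# The whole pipeline and every step are instances of the same class
my $pipeline = My::Step->new(
    id    => 's0',
    steps => [
        My::Step->new(id => 's1'),
        My::Step->new(id => 's2', steps => [
            My::Step->new(id => 's2a'),
            My::Step->new(id => 's2b'),
        ]),
    ],
);

print join(' ', $pipeline->run), "\n";
```

Calling C<run> on the top-level object and calling it on any nested step go through the same method, which is what lets a single class stand for both.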

=head1 METHODS

=head2 new

Constructor for the class. One instance represents the whole pipeline,
and other instances are created for each step in the pipeline.


