Algorithm-ConstructDFA2

 view release on metacpan or  search on metacpan

lib/Algorithm/ConstructDFA2.pm  view on Meta::CPAN

      LEFT JOIN match m
        ON (m.vertex = c.vertex AND m.input = i.value)
      LEFT JOIN edge
        ON (m.vertex = edge.src)
      LEFT JOIN closure
        ON (edge.dst = closure.root)
      LEFT JOIN transition t
        ON (t.src = s.state_id AND t.input = i.value)
    WHERE
      t.dst IS NULL
    GROUP BY
      s.state_id, i.rowid
    ORDER BY
      s.state_id, i.rowid
    LIMIT ?
  });

  my @new = $self->_dbh->selectall_array($sth, {}, $limit);

  my $find_or_create = memoize(sub {
    _find_or_create_state_from_vertex_str($self, @_);
  });

  my $sth2 = $self->_dbh->prepare(q{
    INSERT INTO Transition(src, input, dst) VALUES (?, ?, ?)
  });

  my @transitions;

  for my $t (@new) {
    push @transitions, [(
      $t->[0],
      $t->[1],
      $find_or_create->($t->[2]),
    )];
  }

  $self->_dbh->begin_work();
  $sth2->execute(@$_) for @transitions;
  $self->_dbh->commit();

  return scalar @new;
}

sub transitions_as_3tuples {
  my ($self) = @_;

  return $self->_dbh->selectall_array(q{
    SELECT src, input, dst FROM transition
  });
}

sub transitions_as_5tuples {
  my ($self) = @_;

  return $self->_dbh->selectall_array(q{
    SELECT * FROM view_transitions_as_5tuples
  });
}

sub backup_to_file {
  my ($self, $schema_version, $file) = @_;
  die unless $schema_version eq 'v0';
  $self->_dbh->sqlite_backup_to_file($file);
}

# sub backup_to_dbh {
#   my ($self, $schema_version) = @_;
# 
#   die unless $schema_version eq 'v0';
# 
#   require File::Temp;
# 
#   my ($fh, $filename) = File::Temp::tempfile();
# 
#   $self->_dbh->sqlite_backup_to_file($filename);
# 
#   my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:');
# 
#   $dbh->sqlite_backup_from_file($filename);
# 
#   File::Temp::unlink0($fh, $filename);
# 
#   undef $fh;
# 
#   return $dbh;
# }

1;

__END__

=head1 NAME

Algorithm::ConstructDFA2 - Deterministic finite automaton construction

=head1 SYNOPSIS

  use Algorithm::ConstructDFA2;

  my $dfa = Algorithm::ConstructDFA2->new(
    input_alphabet     => [ @symbols ],
    input_vertices     => [ qw/ 2 3 4 / ],
    input_edges        => [ [ 2, 3 ], [ 3, 4 ] ],

    vertex_nullable    => sub($vertex)         { ... },
    vertex_matches     => sub($vertex, $input) { ... },

    storage_dsn        => 'dbi:SQLite:dbname=...',
  );

  my $start_id = $dfa->find_or_create_state_id(qw/ 2 /);

  while (my $count = $dfa->compute_some_transitions(1_000)) {
    ...
  }

  my @accepting = $dfa->cleanup_dead_states(sub(@vertices) {
    ...
  });

=head1 DESCRIPTION

This module computes deterministic finite automata from equivalent
non-deterministic finite automata. The input NFA must be expressed
as directed graph with labeled vertices. Vertex labels indicate if
vertices match a particular terminal symbol from an input alphabet,
or match the empty string, meaning they can be crossed without any
input when matching a string.

This is slightly different from how NFA graphs are usually encoded
in literature (as graph with labeled edges), but the conversion is
straightforward (turn edges into additional vertices). Finding a
suitable alphabet is more difficult, L<Set::IntSpan::Partition> can
help with that (the module splits sets of sets of terminals like
"letters" and "digits" and "hexdigits" into non-overlapping sets,
each of which can then be used as a terminal for this module).

DFAs can be exponentially larger than equivalent NFAs; to accomodate
large or complicated NFAs, computed data is held in a SQLite database

lib/Algorithm/ConstructDFA2.pm  view on Meta::CPAN


=item $dfa->find_or_create_state_id(@vertices)

Given a list of vertices, computes a new state, adds it to the
automaton if it does not already exist, and returns an identifier
for the state. This is used to create a start state in the DFA.

=item $dfa->compute_some_transitions($limit)

Computes up to C<$limit> additional transitions and returns the
number of transitions actually computed. A return value of zero
indicates that all transitions have been computed.

=item $dfa->dead_state_id()

Returns the state identifier for a fixed dead state (from which
no accepting configuration can be reached).

=item $dfa->cleanup_dead_states(\&vertices_accept)

Given a code reference that takes a list of vertices and returns
true if and only if the vertices are an accepting configuration,
this method changes the automaton so that dead states have no
transitions to different dead states.

If, for example, the input NFA has a special "final" vertex that
indicates acceptance when reached, the code reference would check
if the vertex list contains this vertex.

=item $dfa->transitions_as_3tuples()

Returns a list of all transitions computed so far as. Transitions
are arrays with three identifiers for the source state, the input
symbol, and the destination state.

  for my $transition ( $dfa->transitions_as_3tuples() ) {
    my ($src_state, $input, $dst_state) = @$transition;
    ...
  }

=item $dfa->vertices_in_state($state_id)

Returns a list of vertices in the state C<$state_id>.

=item $dfa->transitions_as_5tuples()

Returns a list of all transitions computed so far as. Transitions
are arrays with five identifiers: the source state, an input vertex
included in the source state, the input symbol, the destination state
and an input vertex included in the destination state.

  for my $transition ( $dfa->transitions_as_5tuples() ) {
    my ($src_state, $src_vertex, $input, $dst_state, $dst_vertex) =
      @$transition;
    ...
  }

Note that unlike C<transitions_as_3tuples> this omits transitions
involving the main dead state.

=item $dfa->backup_to_file('v0', $file)

Create a backup of the database used to store input and computed data
into C<$file>. The first parameter must be C<v0> and indicates the
version of the database schema.

=back

=head1 TODO

=over

=item * It does not make sense for C<transitions_as_5tuples> and its
        companions to return a list for large automata. But short of
        returning the DBI statement handle there does not seem to be
        a good way to return something more lazy.

=item * ...

=back

=head1 BUG REPORTS

Please report bugs in this module via
L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Algorithm-ConstructDFA2>

=head1 SEE ALSO

=over

=item * L<Set::IntSpan::Partition> - Useful to create alphabets from sets

=item * L<Acme::Partitioner> - Useful to minimise automata

=item * L<Algorithm::ConstructDFA> - obsolete predecessor

=item * L<Algorithm::ConstructDFA::XS> - obsolete predecessor

=back

=head1 ACKNOWLEDGEMENTS

Thanks to Slaven Rezic for bug reports.

=head1 AUTHOR / COPYRIGHT / LICENSE

  Copyright (c) 2017-2018 Bjoern Hoehrmann <bjoern@hoehrmann.de>.
  This module is licensed under the same terms as Perl itself.

=cut



( run in 1.567 second using v1.01-cache-2.11-cpan-39bf76dae61 )