Acme-Sort-Sleep

local/lib/perl5/IO/Async/FileStream.pm

}

=head2 seek

   $filestream->seek( $offset, $whence )

Callable only during the C<on_initial> event. Moves the read position in the
filehandle to the given offset. C<$whence> is interpreted as for C<sysseek>
and should be one of C<SEEK_SET>, C<SEEK_CUR> or C<SEEK_END>. It defaults to
C<SEEK_SET> if not provided.

Normally this would be used to seek to the end of the file, for example:

 on_initial => sub {
    my ( $self, $filesize ) = @_;
    $self->seek( $filesize );
 }
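
As a sketch of how the three C<$whence> values behave, the following uses plain
C<sysseek> on a temporary file rather than a FileStream; the file contents and
offsets are illustrative assumptions.

 use strict;
 use warnings;
 use Fcntl qw( SEEK_SET SEEK_CUR SEEK_END );
 use File::Temp qw( tempfile );

 # Hypothetical ten-byte file, for illustration only
 my ( $fh, $path ) = tempfile( UNLINK => 1 );
 binmode $fh;
 syswrite( $fh, "0123456789" );

 my $pos_set = sysseek( $fh,  2, SEEK_SET );   # absolute offset 2
 my $pos_cur = sysseek( $fh,  3, SEEK_CUR );   # relative to current: 2 + 3
 my $pos_end = sysseek( $fh, -4, SEEK_END );   # relative to end: 10 - 4

 # Reads the final four bytes of the file
 sysread( $fh, my $tail, 4 );

Inside C<on_initial> the same arguments can be given to C<seek>; for instance
C<< $self->seek( -4, SEEK_END ) >> would be expected to begin reading four
bytes before the end of the file.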

=cut

sub seek
{
   my $self = shift;
   my ( $offset, $whence ) = @_;

   $self->{running_initial} or croak "Cannot ->seek except during on_initial";

   defined $whence or $whence = SEEK_SET;

   sysseek( $self->read_handle, $offset, $whence );
}

=head2 seek_to_last

   $success = $filestream->seek_to_last( $str_pattern, %opts )

Callable only during the C<on_initial> event. Attempts to move the read
position in the filehandle to just after the last occurrence of a given match.
C<$str_pattern> may be a literal string or a regexp (C<qr//>) pattern.

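Returns a true value if the seek was successful, or false if no match was
found, in which case the read position is moved to the horizon offset (or the
start of the file if the whole file was searched). Takes the following named
arguments: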

=over 8

=item blocksize => INT

Optional. Read the file in blocks of this size. Defaults to 8KiB if not
defined.

=item horizon => INT

Optional. Give up looking for a match after searching back this number of
bytes from the end of the file. Defaults to 4 times the blocksize if not
defined.

To force it to always search through the entire file contents, set this
explicitly to C<0>.

=back
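
The interaction between the two options can be sketched as simple arithmetic.
The helper C<lowest_offset> below is hypothetical and not part of this module;
it only estimates the file offset at which the backward search gives up,
assuming the defaults described above.

 use strict;
 use warnings;

 # Hypothetical helper: the offset below which no further blocks are started
 sub lowest_offset
 {
    my ( $filesize, %opts ) = @_;

    my $blocksize = $opts{blocksize} || 8192;
    my $horizon   = defined $opts{horizon} ? $opts{horizon} : $blocksize * 4;

    return 0 if !$horizon;              # 0 means search the entire file

    my $lowest = $filesize - $horizon;
    return $lowest < 0 ? 0 : $lowest;   # never before the start of the file
 }

 my $default = lowest_offset( 100_000 );                     # 100000 - 32768
 my $small   = lowest_offset( 100_000, blocksize => 1024 );  # 100000 - 4096
 my $whole   = lowest_offset( 100_000, horizon => 0 );       # entire file

Note that reducing C<blocksize> without setting C<horizon> also shrinks the
region searched, since the default horizon is derived from the blocksize.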

Because regular file reading happens synchronously, this entire method
operates synchronously. If the file is very large, it may take a while to
read back through the entire contents. While this is happening, no other
events can be invoked in the process.

When looking for a string or regexp match, this method appends the
previously-read buffer to each block read from the file, in case a match
becomes split across two reads. If C<blocksize> is reduced to a very small
value, take care that it is not smaller than the longest expected match; a
match that spans more than two blocks will not be noticed.
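
The effect can be sketched with a small standalone function that scans an
in-memory string rather than a filehandle. The name C<last_match_end> and the
sample data are assumptions made for illustration; the loop is only loosely
modelled on the approach described above.

 use strict;
 use warnings;

 # Hypothetical helper: scan $data backwards in $blocksize chunks and return
 # the offset just past the last match of $re, or nothing if none is found
 sub last_match_end
 {
    my ( $data, $re, $blocksize ) = @_;

    my $offset = length $data;
    my $prev   = "";
    while( $offset > 0 ) {
       my $len = $blocksize < $offset ? $blocksize : $offset;
       $offset -= $len;

       my $buffer = substr( $data, $offset, $len );

       # Search this block joined to the block that follows it in the data
       if( () = ( $buffer . $prev ) =~ m/$re/sg ) {
          # $+[0] is the end of the last successful match
          return $offset + $+[0];
       }

       $prev = $buffer;
    }

    return;
 }

 # The final "\r\n" straddles two 4-byte blocks but is still found
 my $split = last_match_end( "one\r\ntwo\r\nthr", qr/\r\n/, 4 );

 # A 3-byte match cannot be seen through a window of two 1-byte blocks
 my $missed = last_match_end( "xxabcxx", qr/abc/, 1 );

 # With 2-byte blocks the same match fits within two adjacent blocks
 my $found = last_match_end( "xxabcxx", qr/abc/, 2 );

Here C<$split> is the offset of the trailing C<"thr">, C<$missed> is undefined
and C<$found> is the offset just after C<"abc">.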

This is most likely useful for seeking after the last complete line in a
line-based log file, to commence reading from the end, while still managing to
capture any partial content that isn't yet a complete line.

 on_initial => sub {
    my $self = shift;
    $self->seek_to_last( "\n" );
 }

=cut

sub seek_to_last
{
   my $self = shift;
   my ( $str_pattern, %opts ) = @_;

   $self->{running_initial} or croak "Cannot ->seek_to_last except during on_initial";

   my $offset = $self->{last_size};

   my $blocksize = $opts{blocksize} || 8192;

   defined $opts{horizon} or $opts{horizon} = $blocksize * 4;
   my $horizon = $opts{horizon} ? $offset - $opts{horizon} : 0;
   $horizon = 0 if $horizon < 0;

   my $re = ref $str_pattern ? $str_pattern : qr/\Q$str_pattern\E/;

   my $prev = "";
   while( $offset > $horizon ) {
      my $len = $blocksize;
      $len = $offset if $len > $offset;
      $offset -= $len;

      sysseek( $self->read_handle, $offset, SEEK_SET );
      # Read only $len bytes so the final short block does not overlap $prev
      sysread( $self->read_handle, my $buffer, $len );

      # TODO: If $str_pattern is a plain string this could be more efficient
      # using rindex
      if( () = ( $buffer . $prev ) =~ m/$re/sg ) {
         # $+[0] will be end of last match
         my $pos = $offset + $+[0];
         $self->seek( $pos );
         return 1;
      }

      $prev = $buffer;
   }

   $self->seek( $horizon );
   return 0;
}


