Excel-ValueReader-XLSX


lib/Excel/ValueReader/XLSX.pm

  push @range, ($3 ? ($A1_to_num_memoized{$3} //= $self->A1_to_num($3), $4)  # col, row of bottomright cell, or ..
                   : @range);                                                # .. copy of topleft cell
  return @range;
}






1;


__END__

=head1 NAME


Excel::ValueReader::XLSX - extracting values from Excel workbooks in XLSX format, fast

=head1 SYNOPSIS

  my $reader = Excel::ValueReader::XLSX->new(xlsx => $filename_or_handle);
  # .. or with syntactic sugar :
  my $reader = Excel::ValueReader::XLSX->new($filename_or_handle);
  # .. or with LibXML backend :
  my $reader = Excel::ValueReader::XLSX->new(xlsx => $filename_or_handle,
                                             using => 'LibXML');
  
  foreach my $sheet_name ($reader->sheet_names) {
     my $grid = $reader->values($sheet_name);
     my $n_rows = @$grid;
     print "sheet $sheet_name has $n_rows rows; ",
           "first cell contains : ", $grid->[0][0];
  }
  
  foreach my $table_name ($reader->table_names) {
     my ($columns, $records) = $reader->table($table_name);
     my $n_records           = @$records;
     my $n_columns           = @$columns;
     print "table $table_name has $n_records records and $n_columns columns; ",
           "column 'foo' in first row contains : ", $records->[0]{foo};
  }
  
  my $first_grid = $reader->values(1); # the arg can be a sheet index instead of a sheet name
  
  # iterator version of ->values()
  my $iterator = $reader->ivalues($sheet_name);
  while (my $row = $iterator->()) { process_row($row) }

  # iterator version of ->table()
  my ($columns, $iterator) = $reader->itable($table_name);
  while (my $record = $iterator->()) { process_record($record) }

=head1 DESCRIPTION

=head2 Purpose

This module reads the contents of an Excel file in XLSX format.
Unlike modules such as L<Spreadsheet::ParseXLSX> or L<Data::XLSX::Parser>,
it has no support for reading formulas, formats or other Excel internal
information; all you get are plain values, but you get them much faster.

In addition, this API has some features not found in competing parsers:

=over

=item *

support for parsing Excel tables

=item *

iterator methods for getting one row at a time from a worksheet or from a table,
which saves memory when dealing with large Excel files.

=back


=head2 Backends

Two different backends may be used for extracting values:

=over

=item Regex

using regular expressions to parse the XML content.

=item LibXML

using L<XML::LibXML::Reader> to parse the XML content.
It is probably safer, but two to three times slower than the Regex backend
(though still much faster than L<Spreadsheet::ParseXLSX>).

=back

The default is the C<Regex> backend.

=head2 Sheet numbering

Although worksheets are usually accessed by name, they may also
be accessed by numerical indices, I<starting at value 1>.
Some other Perl parsing modules use a different convention, where the first sheet has index 0.
Here index 1 was chosen to be consistent with the common API for "collections" in
the Microsoft Office object model.
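
As a minimal sketch of the convention, the two calls below should address the
same worksheet. The file name C<example.xlsx> is a placeholder, and the sketch
assumes that C<sheet_names> returns the names in workbook order.

  use strict;
  use warnings;
  use Excel::ValueReader::XLSX;

  # 'example.xlsx' is a placeholder; substitute any existing workbook
  my $reader = Excel::ValueReader::XLSX->new(xlsx => 'example.xlsx');
  my @names  = $reader->sheet_names;

  # sheet indices start at 1 ...
  my $by_index = $reader->values(1);

  # ... whereas Perl array indices still start at 0
  my $by_name  = $reader->values($names[0]);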


=head1 NOTE ON ITERATORS

Methods L</ivalues> and L</itable> return I<iterators>.
Each call to the iterator returns the next data row from the Excel content;
once the data is exhausted, the iterator returns C<undef>. Following the L<Iterator::Simple> protocol,
iterators support three different but semantically equivalent syntaxes:

  while (my $row = $iterator->())   { process($row) }
  
  while (my $row = $iterator->next) { process($row) }
  
  while (<$iterator>)               { process($_) }
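
To illustrate why the three forms are equivalent, here is a minimal sketch of
an object that supports all of them, using only core Perl (5.14 or later).
The class C<My::Iterator> and the helper C<make_iterator> are hypothetical
stand-ins written for this example; they are not part of this module, and the
actual iterators may be implemented differently.

  use strict;
  use warnings;

  # Hypothetical iterator class: a blessed code reference, so that
  # $it->() works natively; 'next' and '<>' simply delegate to it.
  package My::Iterator {
    use overload '<>' => sub { $_[0]->() }, fallback => 1;
    sub new  { my ($class, $code) = @_; return bless $code, $class }
    sub next { my $self = shift; return $self->() }
  }

  # Build a fresh iterator over a private copy of the given rows;
  # it returns undef once the rows are exhausted.
  sub make_iterator {
    my @rows = @_;
    return My::Iterator->new(sub { shift @rows });
  }

  my @data = ([1, 'a'], [2, 'b']);
  my @seen;

  my $it = make_iterator(@data);
  while (my $row = $it->())   { push @seen, $row->[0] }

  $it = make_iterator(@data);
  while (my $row = $it->next) { push @seen, $row->[0] }

  $it = make_iterator(@data);
  while (<$it>)               { push @seen, $_->[0] }

  print "@seen\n";   # prints "1 2 1 2 1 2"

Note that the third form assigns each row to the global C<$_>, so the first
two forms are preferable inside code that already relies on C<$_>.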


