=head2 new

The C<new> constructor creates and initialises a new CSV parser.  It
returns a new L<Parse::CSV> object, or throws an exception (dies) on
error.  It accepts a number of params:

=over 4

=item C<file>

=item C<handle>

To specify the CSV data source, provide either the C<file>
param, which should be the name of the file to read, or the C<handle>
param, which should be a file handle to read instead.

=item C<csv_attr>

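Parameters for L<Text::CSV_XS>'s constructor can be passed as a single
C<HASH> reference in the C<csv_attr> param. Alternatively, a fixed set
of them (C<quote_char>, C<eol>, C<escape_char>, C<sep_char>, C<binary>
and C<always_quote>) can be provided directly to this C<new> method,
and they will be passed on to it. When C<csv_attr> is given it is used
as-is, and directly supplied parameters are not merged into it;
otherwise C<binary> is turned on by default. For example: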

  $parser = Parse::CSV->new(
      file     => 'file.csv',
      csv_attr => {
          sep_char   => ';',
          quote_char => "'",
      },
  );

=item C<names>

An optional C<names> param can be provided, which should either be an
array reference containing the names of the columns:

  $parser = Parse::CSV->new(
      file  => 'file.csv',
      names => [ 'col1', 'col2', 'col3' ],
  );

or a true value that's not a reference, indicating that the column
names will be read from the first line of the input:

  $parser = Parse::CSV->new(
      file  => 'file.csv',
      names => 1,
  );

If the C<names> param is provided, the parser will map each line to a
hash where the keys are the field names provided, and the values are the
values found in the CSV file.

If the C<names> param is B<not> provided, the parser will return each
row as a simple array reference of its columns, and the first line is
treated just like all the other rows in the file.

If your CSV file has (or might have) a
L<Byte-Order Mark|https://en.wikipedia.org/wiki/Byte_order_mark>,
you must use the C<names> functionality, because this lets us call the
C<header> method of L<Text::CSV_XS>, which is the only place the BOM is
handled in that module.

=item C<filter>

The optional C<filter> param will be used to filter the records if
provided. It should be a C<CODE> reference or any otherwise callable
scalar, and each value parsed (either array reference or hash reference)
will be available to the filter as C<$_> to be changed or converted into an object,
or whatever you wish.  See the L</Writing Filters> section for more details.

=back
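
As a minimal sketch of how the C<handle>, C<names> and C<filter> params
can be combined, the following reads from an in-memory file handle and
keeps only some of the rows. The sample data, the column names and the
age threshold are illustrative assumptions, and the example relies on
the behaviour described in L</Writing Filters> that a filter returning
C<undef> drops the record:

  use strict;
  use warnings;
  use Parse::CSV;

  # Hypothetical in-memory data standing in for a real CSV file
  my $data = "alice,30\nbob,17\ncarol,45\n";
  open( my $fh, '<', \$data )
      or die "Cannot open in-memory handle: $!";

  my $parser = Parse::CSV->new(
      handle => $fh,
      names  => [ 'name', 'age' ],
      # Each row arrives in $_ as a hash reference;
      # returning undef skips the row
      filter => sub { $_->{age} >= 18 ? $_ : undef },
  );

  # Collect the names from the rows the filter kept
  my @kept;
  while ( my $row = $parser->fetch ) {
      push @kept, $row->{name};
  }
  die $parser->errstr if $parser->errstr;

  print "$_\n" foreach @kept;

The calling loop only sees the rows that the filter returned, so the
selection logic stays in one place instead of being repeated at every
call site.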

=cut

sub new {
	my $class = shift;
	my $self  = bless {
		@_,
		row    => 0,
		errstr => '',
	}, $class;

	# Do we have a file name
	if ( exists $self->{file} ) {
		unless ( Params::Util::_STRING($self->{file}) ) {
			Carp::croak("Parse::CSV file param is not a string");
		}
		unless ( -f $self->{file} and -r _ ) {
			Carp::croak("Parse::CSV file '$self->{file}' does not exist");
		}

		$self->{handle} = IO::File->new();
		unless ( $self->{handle}->open($self->{file}) ) {
			Carp::croak("Parse::CSV file '$self->{file}' failed to load: $!");
		}
	}

	# Do we have a file handle
	if ( exists $self->{handle} ) {
		unless ( Params::Util::_HANDLE($self->{handle}) ) {
			Carp::croak("Parse::CSV handle param is not an IO handle");
		}
	} else {
		Carp::croak("Parse::CSV not provided a file or handle param");
	}

	# Separate the Text::CSV attributes
	unless ( Params::Util::_HASH0($self->{csv_attr}) ) {
		$self->{csv_attr} = {binary => 1};  # Suggested by Text::CSV_XS docs to always be on
		# XXX it would be nice to not have this list hard-coded.
		foreach ( qw{quote_char eol escape_char sep_char binary always_quote} ) {
			next unless exists $self->{$_};
			$self->{csv_attr}->{$_} = delete $self->{$_};
		}
	}

	# Create the parser
	$self->{csv_xs} = Text::CSV_XS->new( $self->{csv_attr} );
	unless ( $self->{csv_xs} ) {
		Carp::croak("Failed to create Text::CSV_XS parser");