App-CSVUtils

lib/App/CSVUtils.pm  view on Meta::CPAN


    field1,field2,"",field4

_
        tags => ['category:output'],
    },
);

our %argspecopt_input_filename = (
    input_filename => {
        summary => 'Input CSV file',
        description => <<'_',

Use `-` to read from stdin.

Encoding of input file is assumed to be UTF-8.

_
        schema => 'filename*',
        default => '-',
        'x.completion' => $xcomp_csvfiles,
        tags => ['category:input'],
    },
);

our %argspecopt_input_filenames = (
    input_filenames => {
        'x.name.is_plural' => 1,
        'x.name.singular' => 'input_filename',
        summary => 'Input CSV files',
        description => <<'_',

Use `-` to read from stdin.

Encoding of input file is assumed to be UTF-8.

_
        schema => ['array*', of=>'filename*'],
        default => ['-'],
        'x.completion' => $xcomp_csvfiles,
        tags => ['category:input'],
    },
);

our %argspecopt_overwrite = (
    overwrite => {
        summary => 'Whether to overwrite existing output file',
        schema => 'bool*',
        cmdline_aliases=>{O=>{}},
        tags => ['category:output'],
    },
);

our %argspecsopt_inplace = (
    inplace => {
        summary => 'Output to the same file as input',
        schema => 'true*',
        description => <<'_',

Normally, you output to a different file than input. If you try to output to the
same file (`-o INPUT.csv -O`) you will clobber the input file; thus the utility
prevents you from doing it. However, with this `--inplace` option, you can
output to the same file. Like perl's `-i` option, this will first output to a
temporary file in the same directory as the input file then rename to the final
file at the end. You cannot specify output file (`-o`) when using this option,
but you can specify backup extension with `-b` option.

Some caveats:

- if the input file is a symbolic link, it will be replaced with a regular file;
- renaming (implemented using `rename()`) can fail if the input filename is too long;
- the value specified in `-b` is currently not checked for acceptable characters;
- things can also fail if permissions are restrictive.

_
        tags => ['category:output'],
    },
    inplace_backup_ext => {
        summary => 'Extension to add for backup of input file',
        schema => 'str*',
        default => '',
        description => <<'_',

In inplace mode (`--inplace`), if this option is set to a non-empty string, the
input file will be renamed using this extension as a backup. An existing backup
file, if any, will be overwritten.

_
        cmdline_aliases => {b=>{}},
        tags => ['category:output'],
    },
);

our %argspecopt_output_filename = (
    output_filename => {
        summary => 'Output filename',
        description => <<'_',

Use `-` to output to stdout (the default if you don't specify this option).

Encoding of output file is assumed to be UTF-8.

_
        schema => 'filename*',
        cmdline_aliases=>{o=>{}},
        tags => ['category:output'],
    },
);

our %argspecopt_output_filenames = (
    output_filenames => {
        summary => 'Output filenames',
        description => <<'_',

Use `-` to output to stdout (the default if you don't specify this option).

Encoding of output file is assumed to be UTF-8.

_
        schema => ['array*', of=>'filename*'],
        cmdline_aliases=>{o=>{}},

lib/App/CSVUtils.pm  view on Meta::CPAN

  zero. Will be reset for each CSV file.

- `input_data_row_count`, int. Contains the number of actual data rows that have
  been read. Will be reset for each CSV file.

If you are outputting CSV (`writes_csv` gen argument set to true), the following
keys will be available:

- `output_emitter`, a <pm:Text::CSV_XS> instance for output.

- `output_fields`, array of str. Should be set to list of output field names. If
  unset, will be set to be the same as `input_fields`.

- `output_fields_idx`, hash with field names as keys and field indexes (0-based
  integer) as values. Normally you do not need to set this manually; you just
  need to set `output_fields` and this hash will be computed automatically for
  you just before the first output row is outputted.

- `output_filenames`, array of str.

- `output_filename`, str, name of current output file.

- `output_filenum`, uint, the number of the current output file, 1 being the
  first file, 2 for the second, and so on.

- `output_fh`, handle to the current output file.

- `output_rownum`, uint. The number of rows that have been outputted (reset
  after each output file).

- `output_data_rownum`, uint. The number of data rows that have been outputted
  (reset after each output file). This will be equal to `output_rownum` less 1 if
  a header row has been outputted.

For other hook-specific keys, see the documentation for associated hook point.
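
To illustrate how `output_fields` and `output_fields_idx` relate, here is a
minimal standalone sketch that builds the index hash the same way the generated
code does just before printing the header. The stash below is a hand-made
hashref with made-up field names, not one produced by `gen_csv_util`.

```perl
use strict;
use warnings;

# Mock stash; in a real utility $r is supplied to your hook handlers.
my $r = { output_fields => ['name', 'age', 'city'] };

# Map each output field name to its 0-based position.
$r->{output_fields_idx} = {};
for my $j (0 .. $#{ $r->{output_fields} }) {
    $r->{output_fields_idx}{ $r->{output_fields}[$j] } = $j;
}

printf "%s => %d\n", $_, $r->{output_fields_idx}{$_}
    for @{ $r->{output_fields} };
```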


*ACCEPTING ADDITIONAL COMMAND-LINE OPTIONS/ARGUMENTS*

As mentioned above, you will get additional command-line options/arguments in
`$r->{util_args}` hashref. Some options/arguments are already added by
`gen_csv_util`, e.g. `input_filename` or `input_filenames` along with
`input_sep_char`, etc (when your utility declares `reads_csv`),
`output_filename` or `output_filenames` along with `overwrite`,
`output_sep_char`, etc (when your utility declares `writes_csv`).

If you want to accept additional arguments/options, you specify them in
`add_args` (a hashref, with the key being the argument name and the value the
argument specification as defined in <pm:Rinci::function>). Some argument
specifications have been defined in <pm:App::CSVUtils> and can be used. See
existing utilities for examples.
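
As a rough sketch of the shape involved, the hash below is what one might pass
as `add_args`, and the handler shows the value being read back from
`$r->{util_args}`. The `min_length` argument and the `kept` stash key are
hypothetical, and the stash is mocked by hand so the snippet runs without
`gen_csv_util`.

```perl
use strict;
use warnings;

# Hypothetical argument spec in the shape expected by add_args:
# key = argument name, value = Rinci argument specification.
my %add_args = (
    min_length => {
        summary => 'Only keep rows whose first field is at least this long',
        schema  => ['int*', min => 0],
        default => 0,
    },
);

# A data-row handler reads the parsed option from $r->{util_args}.
my $on_input_data_row = sub {
    my $r = shift;
    my $min = $r->{util_args}{min_length} // $add_args{min_length}{default};
    push @{ $r->{kept} }, $r->{input_row}
        if length($r->{input_row}[0]) >= $min;
};

# Mock stash standing in for what the generated utility would pass.
my $r = { util_args => { min_length => 3 } };
for my $row (['abcd', 'x'], ['ab', 'y']) {
    $r->{input_row} = $row;
    $on_input_data_row->($r);
}
print scalar(@{ $r->{kept} }), " row(s) kept\n";
```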


*READING CSV DATA*

To read CSV data, your utility normally provides a handler for the
`on_input_data_row` hook and sometimes also for `on_input_header_row`.
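
The pair of handlers below is a small sketch of that pattern: the header
handler checks for a field and the data-row handler accumulates a sum. The
`amount` field and the `total` stash key are made up for the example, and the
handlers are driven by hand with a mock stash rather than by a generated
utility.

```perl
use strict;
use warnings;

# Handlers you would pass as on_input_header_row and on_input_data_row.
my $on_input_header_row = sub {
    my $r = shift;
    # A handler may die with an enveloped response.
    die [404, "No 'amount' field"]
        unless exists $r->{input_fields_idx}{amount};
    $r->{total} = 0;
};
my $on_input_data_row = sub {
    my $r = shift;
    $r->{total} += $r->{input_row}[ $r->{input_fields_idx}{amount} ];
};

# Drive the handlers by hand with a mock stash and two data rows.
my $r = {
    input_fields     => ['item', 'amount'],
    input_fields_idx => { item => 0, amount => 1 },
};
$on_input_header_row->($r);
for my $row (['apple', 3], ['pear', 4]) {
    $r->{input_row} = $row;
    $on_input_data_row->($r);
}
print "total=$r->{total}\n";
```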


*OUTPUTTING STRING OR RETURNING RESULT*

To output a string, you usually call the provided routine `$r->{code_print}`.
This routine will open the output files for you.

You can also return an enveloped result directly by setting `$r->{result}`.


*OUTPUTTING CSV DATA*

To output CSV data, usually you call the provided routine `$r->{code_print_row}`.
This routine accepts a row (arrayref or hashref). This routine will open the
output files for you when needed, as well as print header row automatically.

You can also buffer rows from input to e.g. `$r->{output_rows}`, then call
`$r->{code_print_row}` repeatedly in the `after_read_input` hook to print all the
rows.
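
The buffering approach can be sketched as follows, here printing the rows in
reverse order. The `code_print_row` in this snippet is only a stand-in that
records rows in an array; in a real utility the routine is provided in the
stash.

```perl
use strict;
use warnings;

# Buffer every data row while reading.
my $on_input_data_row = sub {
    my $r = shift;
    push @{ $r->{output_rows} }, $r->{input_row};
};

# Once all input is read, emit the buffered rows (reversed in this sketch).
my $after_read_input = sub {
    my $r = shift;
    $r->{code_print_row}->($_) for reverse @{ $r->{output_rows} // [] };
};

# Mock stash: this stand-in for code_print_row just records what it is given.
my @printed;
my $r = { code_print_row => sub { push @printed, shift } };
for my $row ([1, 'a'], [2, 'b'], [3, 'c']) {
    $r->{input_row} = $row;
    $on_input_data_row->($r);
}
$after_read_input->($r);
print join(",", @$_), "\n" for @printed;
```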


*READING MULTIPLE CSV FILES*

To read multiple CSV files, you first specify `reads_multiple_csv`. Then, you
can supply handlers for `on_input_header_row` and `on_input_data_row` as usual.
If you want to do something before/after each input file, you can also supply
a handler for `before_open_input_file` or `after_close_input_file`.


*WRITING TO MULTIPLE CSV FILES*

Similarly, to write to many CSV files, you first specify `writes_multiple_csv`.
Then, you can supply handlers for `on_input_header_row` and `on_input_data_row`
as usual. To switch to the next file, set
`$r->{wants_switch_to_next_output_file}` to true, in which case the next call to
`$r->{code_print_row}` will close the current file and open the next file.
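
As an illustration, the handler below requests a switch once the current output
file has received a fixed number of data rows. The `$rows_per_file` setting is
hypothetical, and the `code_print_row` here is a simplified stand-in that only
tracks which file number each row would go to; it does not open any files.

```perl
use strict;
use warnings;

my $rows_per_file = 2;   # hypothetical setting; could come from add_args

my $on_input_data_row = sub {
    my $r = shift;
    # Ask for the next output file once the current one is full.
    $r->{wants_switch_to_next_output_file} = 1
        if ($r->{output_data_rownum} // 0) >= $rows_per_file;
    $r->{code_print_row}->($r->{input_row});
};

# Mock stash: the stand-in honours the switch flag and resets the counter.
my @log;
my $r = { output_filenum => 1, output_data_rownum => 0 };
$r->{code_print_row} = sub {
    my $row = shift;
    if (delete $r->{wants_switch_to_next_output_file}) {
        $r->{output_filenum}++;
        $r->{output_data_rownum} = 0;
    }
    $r->{output_data_rownum}++;
    push @log, "file$r->{output_filenum}:$row->[0]";
};
for my $row (['a'], ['b'], ['c'], ['d'], ['e']) {
    $r->{input_row} = $row;
    $on_input_data_row->($r);
}
print "$_\n" for @log;
```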


*CHANGING THE OUTPUT FIELDS*

When calling `$r->{code_print_row}`, you can output whatever fields you want. By
convention, you can set `$r->{output_fields}` and `$r->{output_fields_idx}` to
let other handlers know about the output fields. For example, see the
implementation of <prog:csv-concat>.
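
A minimal sketch of this convention: the header handler sets `output_fields` to
a reordered subset of the input fields, and the data-row handler prints a
hashref row. The field names are made up, and the `code_print_row` below is a
stand-in that orders hashref rows by `output_fields` and records the result
instead of writing CSV.

```perl
use strict;
use warnings;

my $on_input_header_row = sub {
    my $r = shift;
    # Output only two of the input fields, in a different order.
    $r->{output_fields} = ['city', 'name'];
};
my $on_input_data_row = sub {
    my $r = shift;
    my %row;
    @row{ @{ $r->{input_fields} } } = @{ $r->{input_row} };
    $r->{code_print_row}->(\%row);
};

# Mock stash; the stand-in converts hashref rows using output_fields.
my @printed;
my $r = { input_fields => ['name', 'age', 'city'] };
$r->{code_print_row} = sub {
    my $row = shift;
    $row = [map { $row->{$_} // '' } @{ $r->{output_fields} }]
        if ref $row eq 'HASH';
    push @printed, join(",", @$row);
};
$on_input_header_row->($r);
$r->{input_row} = ['Ann', 30, 'Oslo'];
$on_input_data_row->($r);
print "$_\n" for @printed;
```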

_
    args => {
        name => {
            schema => 'perl::identifier::unqualified_ascii*',
            req => 1,
            tags => ['category:metadata'],
        },
        summary => {
            schema => 'str*',
            tags => ['category:metadata'],
        },
        description => {
            schema => 'str*',
            tags => ['category:metadata'],
        },
        links => {
            schema => ['array*', of=>'hash*'], # XXX defhashes
            tags => ['category:metadata'],
        },
        examples => {
            schema => ['array*'], # defhashes
            tags => ['category:metadata'],
        },
        add_meta_props => {
            summary => 'Add additional Rinci function metadata properties',
            schema => ['hash*'],
            tags => ['category:metadata'],
        },
        add_args => {
            schema => ['hash*'],
            tags => ['category:metadata'],
        },
        add_args_rels => {
            schema => ['hash*'],
            tags => ['category:metadata'],
        },

        reads_csv => {
            summary => 'Whether utility reads CSV data',
            'summary.alt.bool.not' => 'Specify that utility does not read CSV data',
            schema => 'bool*',
            default => 1,
        },
        reads_multiple_csv => {
            summary => 'Whether utility accepts multiple CSV files',
            schema => 'bool*',
            description => <<'_',

Setting this option to true will implicitly set the `reads_csv` option to true,
obviously.

_
        },
        writes_csv => {

lib/App/CSVUtils.pm  view on Meta::CPAN

    my $before_open_input_file   = delete $gen_args{before_open_input_file};
    my $on_input_header_row      = delete $gen_args{on_input_header_row};
    my $on_input_data_row        = delete $gen_args{on_input_data_row};
    my $after_close_input_file   = delete $gen_args{after_close_input_file};
    my $after_close_input_files  = delete $gen_args{after_close_input_files};
    my $after_read_input         = delete $gen_args{after_read_input};
    my $on_end                   = delete $gen_args{on_end};

    scalar(keys %gen_args) and die "Unknown argument(s): ".join(", ", keys %gen_args);

    my $code;
  CREATE_CODE: {
        $code = sub {
            my %util_args = @_;

            my $has_header = $util_args{input_header} // 1;
            my $outputs_header = $util_args{output_header} // $has_header;

            my $r = {
                gen_args => \%gen_args,
                util_args => \%util_args,
                name => $name,
            };

            # inside the main eval block, we call hook handlers. A handler can
            # throw an exception (which can be a string or an enveloped response
            # like [500, "some error message"], see Rinci::function). we trap
            # the exception so we can return the appropriate enveloped response.
          MAIN_EVAL:
            eval {

                # do some checking
                if ($util_args{inplace} && (!$reads_csv || !$writes_csv)) {
                    die [412, "--inplace cannot be specified when we do not read & write CSV"];
                }

                if ($on_begin) {
                    log_trace "[csvutil] Calling on_begin hook handler ...";
                    $on_begin->($r);
                }

                my $code_open_file = sub {
                    # set output filenames, if not yet
                    unless ($r->{output_filenames}) {
                        my @output_filenames;
                        if ($util_args{inplace}) {
                            for my $input_filename (@{ $r->{input_filenames} }) {
                                my $output_filename;
                                while (1) {
                                    $output_filename = $input_filename . "." . _randext(5);
                                    last unless -e $output_filename;
                                }
                                push @output_filenames, $output_filename;
                            }
                        } elsif ($writes_multiple_csv) {
                            @output_filenames = @{ $util_args{output_filenames} // ['-'] };
                        } else {
                            @output_filenames = ($util_args{output_filename} // '-');
                        }

                      CHECK_OUTPUT_FILENAME_SAME_AS_INPUT_FILENAME: {
                            my %seen_output_abs_path; # key = output filename
                            last unless $reads_csv && $writes_csv;
                            for my $input_filename (@{ $r->{input_filenames} }) {
                                next if $input_filename eq '-';
                                my $input_abs_path = Cwd::abs_path($input_filename);
                                die [500, "Can't get absolute path of input filename '$input_filename'"] unless $input_abs_path;
                                for my $output_filename (@output_filenames) {
                                    next if $output_filename eq '-';
                                    next if $seen_output_abs_path{$output_filename};
                                    my $output_abs_path = Cwd::abs_path($output_filename);
                                    die [500, "Can't get absolute path of output filename '$output_filename'"] unless $output_abs_path;
                                    die [412, "Cannot set output filename to '$output_filename' ".
                                         ($output_filename ne $output_abs_path ? "($output_abs_path) ":"").
                                         "because it is the same as input filename and input will be clobbered; use --inplace to avoid clobbering"]
                                        if $output_abs_path eq $input_abs_path;
                                }
                            }
                        } # CHECK_OUTPUT_FILENAME_SAME_AS_INPUT_FILENAME

                        $r->{output_filenames} = \@output_filenames;
                        $r->{output_num_of_files} //= scalar(@output_filenames);
                    } # set output filenames

                    # open the next file, if not yet
                    if (!$r->{output_fh} || $r->{wants_switch_to_next_output_file}) {
                        $r->{output_filenum} //= 0;
                        $r->{output_filenum}++;

                        $r->{output_rownum} = 0;
                        $r->{output_data_rownum} = 0;

                        # close the previous file, if any
                        if ($r->{output_fh} && $r->{output_filename} ne '-') {
                            log_info "[csvutil] Closing output file '$r->{output_filename}' ...";
                            close $r->{output_fh} or die [500, "Can't close output file '$r->{output_filename}': $!"];
                            delete $r->{has_printed_header};
                            delete $r->{wants_switch_to_next_output_file};
                        }

                        # we have exhausted all the files, do nothing & return
                        return if $r->{output_filenum} > @{ $r->{output_filenames} };

                        $r->{output_filename} = $r->{output_filenames}[ $r->{output_filenum}-1 ];
                        log_info "[csvutil] [%d/%s] Opening output file %s ...",
                            $r->{output_filenum}, $r->{output_num_of_files}, $r->{output_filename};
                        if ($r->{output_filename} eq '-') {
                            $r->{output_fh} = \*STDOUT;
                        } else {
                            if (-f $r->{output_filename}) {
                                if ($r->{util_args}{overwrite}) {
                                    log_info "[csvutil] Will be overwriting output file %s", $r->{output_filename};
                                } else {
                                    die [412, "Refusing to overwrite existing output file '$r->{output_filename}', choose another name or use --overwrite (-O)"];
                                }
                            }
                            my ($fh, $err) = _open_file_write($r->{output_filename});
                            die $err if $err;
                            $r->{output_fh} = $fh;
                        }
                    } # open the next file
                }; # code_open_file

                my $code_print = sub {
                    my $str = shift;
                    $code_open_file->();
                    print { $r->{output_fh} } $str;
                }; # code_print
                $r->{code_print} = $code_print;

                if ($writes_csv) {
                    my $output_emitter = _instantiate_emitter(\%util_args);
                    $r->{output_emitter} = $output_emitter;
                    $r->{has_printed_header} = 0;

                    my $code_print_header_row = sub {
                        # set output fields, if not yet
                        unless ($r->{output_fields}) {
                            # by default, use the

lib/App/CSVUtils.pm  view on Meta::CPAN

                            $r->{output_fields_idx} = {};
                            for my $j (0 .. $#{ $r->{output_fields} }) {
                                $r->{output_fields_idx}{ $r->{output_fields}[$j] } = $j;
                            }
                        }

                        $code_open_file->();

                        # print header line, if not yet
                        if ($outputs_header && !$r->{has_printed_header}) {
                            $r->{has_printed_header}++;
                            $r->{output_emitter}->print($r->{output_fh}, $r->{output_fields});
                            print { $r->{output_fh} } "\n";
                            $r->{output_rownum}++;
                        }
                    };
                    $r->{code_print_header_row} = $code_print_header_row;

                    my $code_print_row = sub {
                        my $row = shift;

                        $code_print_header_row->();

                        # print data line
                        if ($row) {
                            if (ref $row eq 'HASH') {
                                my $row0 = $row;
                                $row = [];
                                for my $j (0 .. $#{ $r->{output_fields} }) {
                                    $row->[$j] = $row0->{ $r->{output_fields}[$j] } // '';
                                }
                            }
                            $r->{output_emitter}->print( $r->{output_fh}, $row );
                            print { $r->{output_fh} } "\n";
                            $r->{output_rownum}++;
                            $r->{output_data_rownum}++;
                        }
                    }; # code_print_row
                    $r->{code_print_row} = $code_print_row;
                } # if outputs csv

                if ($before_read_input) {
                    log_trace "[csvutil] Calling before_read_input handler ...";
                    $before_read_input->($r);
                }

              READ_CSV: {
                    last unless $reads_csv;

                    my $input_parser = _instantiate_parser(\%util_args, 'input_');
                    $r->{input_parser} = $input_parser;

                    my @input_filenames;
                    if ($reads_multiple_csv) {
                        @input_filenames = @{ $util_args{input_filenames} // ['-'] };
                    } else {
                        @input_filenames = ($util_args{input_filename} // '-');
                    }
                    $r->{input_filenames} //= \@input_filenames;

                  BEFORE_INPUT_FILENAME:
                    $r->{input_filenum} = 0;

                  INPUT_FILENAME:
                    for my $input_filename (@input_filenames) {
                        $r->{input_filenum}++;
                        $r->{input_filename} = $input_filename;
                        $r->{input_file_input_has_been_skipped} = 0;

                        if ($r->{input_filenum} == 1 && $before_open_input_files) {
                            log_trace "[csvutil] Calling before_open_input_files handler ...";
                            $before_open_input_files->($r);
                            if (delete $r->{wants_skip_files}) {
                                log_trace "[csvutil] Handler wants to skip files, skipping all input files";
                                last READ_CSV;
                            }
                        }

                        if ($before_open_input_file) {
                            log_trace "[csvutil] Calling before_open_input_file handler ...";
                            $before_open_input_file->($r);
                            if (delete $r->{wants_skip_file}) {
                                log_trace "[csvutil] Handler wants to skip this file, moving on to the next file";
                                next INPUT_FILENAME;
                            } elsif (delete $r->{wants_skip_files}) {
                                log_trace "[csvutil] Handler wants to skip all files, skipping all input files";
                                last READ_CSV;
                            }
                        }

                        log_info "[csvutil] [file %d/%d] Reading input file %s ...",
                            $r->{input_filenum}, scalar(@input_filenames), $input_filename;
                        my ($fh, $err) = _open_file_read($input_filename);
                        die $err if $err;
                        $r->{input_fh} = $r->{input_fhs}[ $r->{input_filenum}-1 ] = $fh;

                        my $i;
                        $r->{input_header_row_count} = 0;
                        $r->{input_data_row_count} = 0;
                        $r->{input_fields} = []; # array, field names in order
                        $r->{input_fields_idx} = {}; # key=field name, value=index (0-based)
                        my $row0;
                        my $code_getline = sub {
                            if ($r->{stdin_input_fields} && $r->{input_filename} eq '-') {
                                if ($i == 0) {
                                    # we have read the header for stdin. since
                                    # we can't seek to the beginning, we return
                                    # the saved fields
                                    $r->{input_header_row_count}++;
                                    return $r->{stdin_input_fields};
                                } else {
                                    my $row = $input_parser->getline($r->{input_fh});
                                    $r->{input_data_row_count}++ if $row;
                                    return $row;
                                }
                            }

                            # handle skipping lines before the first row
                            unless ($r->{input_file_input_has_been_skipped}++) {
                                if ($r->{util_args}{input_skip_num_lines}) {
                                    for my $j (1 .. $r->{util_args}{input_skip_num_lines}) {
                                        my $line = readline($r->{input_fh});
                                        return unless $line;
                                    }
                                } elsif ($r->{util_args}{input_skip_until_pattern}) {
                                    while (1) {
                                        my $row0 = $input_parser->getline($r->{input_fh});
                                        return unless $row0;
                                        if ($row0->[0] =~ $r->{util_args}{input_skip_until_pattern}) {
                                            # this is the header row
                                            $r->{input_header_row_count}++;
                                            return $row0;
                                        }
                                    }
                                }
                            }

                            if ($i == 0 && !$has_header) {
                                # this is the first line of a file and user
                                # specifies there is no input header. we save
                                # the line and return the generated field names
                                # instead.
                                $row0 = $input_parser->getline($r->{input_fh});
                                return unless $row0;
                                return [map { "field$_" } 1..@$row0];
                            } elsif ($i == 1 && !$has_header) {
                                # we return the saved first line
                                $r->{input_data_row_count}++ if $row0;
                                return $row0;
                            }
                            my $res = $input_parser->getline($r->{input_fh});
                            if ($res) {
                                $r->{input_header_row_count}++ if $i==0;
                                $r->{input_data_row_count}++ if $i;
                            }
                            $res;
                        };
                        $r->{code_getline} = $code_getline;

                        $i = 0;
                        while ($r->{input_row} = $code_getline->()) {
                            $i++;
                            $r->{input_rownum} = $i;
                            $r->{input_data_rownum} = $has_header ? $i-1 : $i;
                            if ($i == 1) {
                                # gather the list of fields
                                $r->{input_fields} = $r->{input_row};
                                $r->{stdin_input_fields} //= $r->{input_row} if $input_filename eq '-';
                                $r->{orig_input_fields} = $r->{input_fields};
                                $r->{input_fields_idx} = {};
                                for my $j (0 .. $#{ $r->{input_fields} }) {
                                    $r->{input_fields_idx}{ $r->{input_fields}[$j] } = $j;
                                }

                                if ($on_input_header_row) {
                                    log_trace "[csvutil] Calling on_input_header_row hook handler ...";
                                    $on_input_header_row->($r);

                                    if (delete $r->{wants_skip_file}) {
                                        log_trace "[csvutil] Handler wants to skip this file, moving on to the next file";
                                        next INPUT_FILENAME;
                                    } elsif (delete $r->{wants_skip_files}) {
                                        log_trace "[csvutil] Handler wants to skip all files, skipping all input files";
                                        last READ_CSV;
                                    }
                                }

                                # reindex the fields, in case the above hook
                                # handler added/removed fields. let's save the
                                # old fields_idx to orig_input_fields_idx.
                                $r->{orig_input_fields_idx} = $r->{input_fields_idx};
                                $r->{input_fields_idx} = {};
                                for my $j (0 .. $#{ $r->{input_fields} }) {
                                    $r->{input_fields_idx}{ $r->{input_fields}[$j] } = $j;
                                }

                            } else {
                                # fill up the elements of row to the number of
                                # fields, in case the row contains sparse values
                                unless (defined $r->{wants_fill_rows} && !$r->{wants_fill_rows}) {
                                    if (@{ $r->{input_row} } < @{ $r->{input_fields} }) {
                                        splice @{ $r->{input_row} }, scalar(@{ $r->{input_row} }), 0, (("") x (@{ $r->{input_fields} } - @{ $r->{input_row} }));
                                    }
                                }

                                # generate the hashref version of row if utility
                                # requires it
                                if ($r->{wants_input_row_as_hashref}) {
                                    $r->{input_row_as_hashref} = {};
                                    for my $j (0 .. $#{ $r->{input_row} }) {
                                        # ignore extraneous data fields
                                        last if $j >= @{ $r->{input_fields} };
                                        $r->{input_row_as_hashref}{ $r->{input_fields}[$j] } = $r->{input_row}[$j];
                                    }
                                }

                                if ($on_input_data_row) {
                                    log_trace "[csvutil] Calling on_input_data_row hook handler (for first data row) ..." if $r->{input_rownum} <= 2;
                                    $on_input_data_row->($r);

                                    if (delete $r->{wants_skip_file}) {
                                        log_trace "[csvutil] Handler wants to skip this file, moving on to the next file";
                                        next INPUT_FILENAME;
                                    } elsif (delete $r->{wants_skip_files}) {
                                        log_trace "[csvutil] Handler wants to skip all files, skipping all input files";
                                        last READ_CSV;
                                    }
                                }
                            }

                        } # while getline

                        # XXX actually close filehandle except stdin

                        if ($after_close_input_file) {
                            log_trace "[csvutil] Calling after_close_input_file handler ...";
                            $after_close_input_file->($r);
                            if (delete $r->{wants_skip_files}) {
                                log_trace "[csvutil] Handler wants to skip reading all files, skipping";
                                last READ_CSV;
                            }
                        }
                    } # for input_filename

                    if ($after_close_input_files) {
                        log_trace "[csvutil] Calling after_close_input_files handler ...";
                        $after_close_input_files->($r);
                    }

                } # READ_CSV

                # cleanup stash from csv-reading-related keys
                delete $r->{input_filenames};
                delete $r->{input_filenum};
                delete $r->{input_filename};
                delete $r->{input_fh};
                delete $r->{input_rownum};
                delete $r->{input_data_rownum};
                delete $r->{input_row};
                delete $r->{input_row_as_hashref};
                delete $r->{input_fields};
                delete $r->{input_fields_idx};
                delete $r->{orig_input_fields_idx};
                delete $r->{code_getline};
                delete $r->{wants_input_row_as_hashref};

                if ($after_read_input) {
                    log_trace "[csvutil] Calling after_read_input handler ...";
                    $after_read_input->($r);
                }

                # cleanup stash from csv-outputting-related keys
                delete $r->{output_num_of_files};
                delete $r->{output_filenum};
                if ($r->{output_fh}) {
                    if ($r->{output_filename} ne '-') {
                        log_info "[csvutil] Closing output file '$r->{output_filename}' ...";
                        close $r->{output_fh} or die [500, "Can't close output file '$r->{output_filename}': $!"];
                    }
                    delete $r->{output_fh};
                }
                if ($r->{util_args}{inplace}) {
                    my $output_filenum = $r->{output_filenum} // 0;

lib/App/CSVUtils.pm  view on Meta::CPAN

read. Will be reset for each CSV file.

=back

If you are outputting CSV (C<writes_csv> gen argument set to true), the following
keys will be available:

=over

=item * C<output_emitter>, a L<Text::CSV_XS> instance for output.

=item * C<output_fields>, array of str. Should be set to list of output field names. If
unset, will be set to be the same as C<input_fields>.

=item * C<output_fields_idx>, hash with field names as keys and field indexes (0-based
integer) as values. Normally you do not need to set this manually; you just
need to set C<output_fields> and this hash will be computed automatically for
you just before the first output row is outputted.

=item * C<output_filenames>, array of str.

=item * C<output_filename>, str, name of current output file.

=item * C<output_filenum>, uint, the number of the current output file, 1 being the
first file, 2 for the second, and so on.

=item * C<output_fh>, handle to the current output file.

=item * C<output_rownum>, uint. The number of rows that have been outputted (reset
after each output file).

=item * C<output_data_rownum>, uint. The number of data rows that have been outputted
(reset after each output file). This will be equal to C<input_rownum> less 1 if
input file has header.

=back

For other hook-specific keys, see the documentation for associated hook point.

I<ACCEPTING ADDITIONAL COMMAND-LINE OPTIONS/ARGUMENTS>

As mentioned above, you will get additional command-line options/arguments in
C<< $r-E<gt>{util_args} >> hashref. Some options/arguments are already added by
C<gen_csv_util>, e.g. C<input_filename> or C<input_filenames> along with
C<input_sep_char>, etc. (when your utility declares C<reads_csv>),
C<output_filename> or C<output_filenames> along with C<overwrite>,
C<output_sep_char>, etc. (when your utility declares C<writes_csv>).

If you want to accept additional arguments/options, you specify them in
C<add_args> (hashref, with key being the argument name and value the argument
specification as defined in L<Rinci::function>). Some argument specifications
have been defined in L<App::CSVUtils> and can be used. See existing utilities
for examples.
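
As a minimal sketch of the shape of such a specification: the option name
C<min_length>, its summary, and the C<kept> stash key are hypothetical, and the
stash is built by hand here instead of being supplied by C<gen_csv_util>.

```perl
use strict;
use warnings;

# Hypothetical extra option, written as a Rinci argument specification.
# A hash like this is what would be passed as the value of add_args.
my %add_args = (
    min_length => {
        summary => 'Only keep rows whose first field is at least this long',
        schema  => 'uint*',
        default => 0,
    },
);

# A handler reads the parsed value back from $r->{util_args}.
my $on_input_data_row = sub {
    my $r   = shift;
    my $min = $r->{util_args}{min_length} // 0;
    return unless length($r->{input_row}[0] // '') >= $min;
    push @{ $r->{kept} }, $r->{input_row};    # 'kept' is a made-up stash key
};

# Hand-built stash standing in for the one gen_csv_util provides.
my $r = { util_args => { min_length => 3 } };
for my $row (['ab', 1], ['abcd', 2]) {
    $r->{input_row} = $row;
    $on_input_data_row->($r);
}
```

With C<min_length> set to 3, only the second row ends up in C<< $r-E<gt>{kept} >>.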

I<READING CSV DATA>

To read CSV data, your utility would normally provide a handler for the
C<on_input_data_row> hook and sometimes additionally for C<on_input_header_row>.

I<OUTPUTTING STRING OR RETURNING RESULT>

To output a string, you usually call the provided routine C<< $r-E<gt>{code_print} >>. This
routine will open the output files for you.

You can also return an enveloped result directly by setting C<< $r-E<gt>{result} >>.

I<OUTPUTTING CSV DATA>

To output CSV data, usually you call the provided routine C<< $r-E<gt>{code_print_row} >>.
This routine accepts a row (arrayref or hashref). This routine will open the
output files for you when needed, as well as print header row automatically.

You can also buffer rows from input to e.g. C<< $r-E<gt>{output_rows} >>, then call
C<< $r-E<gt>{code_print_row} >> repeatedly in the C<after_read_input> hook to print all the
rows.
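
The buffering approach can be sketched as follows. This is only an
illustration: the stash is built by hand, and the C<code_print_row> used here
is a stand-in that merely collects rows, not the routine C<gen_csv_util>
provides.

```perl
use strict;
use warnings;

# Buffer each data row instead of printing it immediately.
my $on_input_data_row = sub {
    my $r = shift;
    push @{ $r->{output_rows} }, [ @{ $r->{input_row} } ];    # copy the row
};

# Once all input has been read, print the buffered rows sorted by first field.
my $after_read_input = sub {
    my $r = shift;
    for my $row (sort { $a->[0] cmp $b->[0] } @{ $r->{output_rows} // [] }) {
        $r->{code_print_row}->($row);
    }
};

# Hand-built stash; this code_print_row is a stand-in that only collects rows.
my @printed;
my $r = { code_print_row => sub { push @printed, $_[0] } };
for my $row (['pear', 3], ['apple', 1], ['fig', 2]) {
    $r->{input_row} = $row;
    $on_input_data_row->($r);
}
$after_read_input->($r);
```

Each row is copied before buffering so that later changes to
C<< $r-E<gt>{input_row} >> cannot alter rows already stored.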

I<READING MULTIPLE CSV FILES>

To read multiple CSV files, you first specify C<reads_multiple_csv>. Then, you
can supply handlers for C<on_input_header_row> and C<on_input_data_row> as usual.
If you want to do something before/after each input file, you can also supply
handlers for C<before_open_input_file> or C<after_close_input_file>.

I<WRITING TO MULTIPLE CSV FILES>

Similarly, to write to many CSV files, you first specify C<writes_multiple_csv>.
Then, you can supply handlers for C<on_input_header_row> and C<on_input_data_row>
as usual. To switch to the next file, set
C<< $r-E<gt>{wants_switch_to_next_output_file} >> to true, in which case the next call to
C<< $r-E<gt>{code_print_row} >> will close the current file and open the next file.
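
A rough sketch of a handler that splits its output every few rows is shown
below. The chunk size and the C<rows_seen> stash key are hypothetical. The
C<code_print_row> here is a stand-in that models "files" as in-memory arrays
and clears the flag itself, which is an assumption about how the real routine
behaves.

```perl
use strict;
use warnings;

my $rows_per_file = 2;    # hypothetical chunk size

my $on_input_data_row = sub {
    my $r = shift;
    my $n = $r->{rows_seen}++;    # 'rows_seen' is a made-up stash key
    # Request a switch before printing the first row of each new chunk.
    $r->{wants_switch_to_next_output_file} = 1
        if $n > 0 && $n % $rows_per_file == 0;
    $r->{code_print_row}->($r->{input_row});
};

# Stand-in for the provided routine: each "file" is an array of rows.
my @files = ([]);
my $r = {};
$r->{code_print_row} = sub {
    my $row = shift;
    # Assumption: the flag is consumed when the switch happens.
    push @files, [] if delete $r->{wants_switch_to_next_output_file};
    push @{ $files[-1] }, $row;
};

for my $i (1 .. 5) {
    $r->{input_row} = ["row$i"];
    $on_input_data_row->($r);
}
```

Five input rows with a chunk size of 2 end up in three "files" holding 2, 2,
and 1 rows respectively.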

I<CHANGING THE OUTPUT FIELDS>

When calling C<< $r-E<gt>{code_print_row} >>, you can output whatever fields you want. By
convention, you can set C<< $r-E<gt>{output_fields} >> and C<< $r-E<gt>{output_fields_idx} >> to
let other handlers know about the output fields. For example, see the
implementation of L<csv-concat>.

This function is not exported by default, but exportable.

Arguments ('*' denotes required arguments):

=over 4

=item * B<add_args> => I<hash>

(No description)

=item * B<add_args_rels> => I<hash>

(No description)

=item * B<add_meta_props> => I<hash>

Add additional Rinci function metadata properties.

=item * B<after_close_input_file> => I<code>

(No description)

=item * B<after_close_input_files> => I<code>

(No description)

=item * B<after_read_input> => I<code>

(No description)

=item * B<before_open_input_file> => I<code>

(No description)

=item * B<before_open_input_files> => I<code>

(No description)

=item * B<before_read_input> => I<code>

(No description)

=item * B<description> => I<str>

(No description)

=item * B<examples> => I<array>

(No description)

=item * B<links> => I<array[hash]>

(No description)


