App-CSVUtils


Changes


        - Add options: --input-skip-num-lines, --input-skip-until-pattern.

        [cli csv-concat]

        - [doc][internal] Add more code comments on how csv_concat works.


1.035   2025-01-14  Released-By: PERLANCAR; Urgency: medium

        - [cli csv-concat] Add --overlay mode & --overwrite-fields option.


1.034   2024-02-02  Released-By: PERLANCAR; Urgency: medium

        - [ux] (Re)add csv2csv as alias for csv-csv.


1.033   2023-09-06  Released-By: PERLANCAR; Urgency: medium

        - Add utility: csv-sort-fields-by-spec.

lib/App/CSVUtils/csv_concat.pm


    col1,col2,col4,col3
    1,2,
    3,4,
    ,a,b
    ,c,d
    ,e,f
    ,,,X
    ,,,Y

When the `--overlay` option is enabled, the result will be:

    col1,col2,col4,col3
    1,2,b,X
    3,4,d,Y
    ,e,f,

When both the `--overlay` and `--overwrite-fields` options are enabled, the
result will be:

    col1,col2,col4,col3
    1,a,b,X
    3,c,d,Y
    ,e,f,

Keywords: join, merge, overlay

MARKDOWN
    add_args => {
        overlay => {
            summary => 'Whether to overlay rows from second and subsequent CSV files to the first',
            schema => 'bool*',
            description => <<'MARKDOWN',

By default, rows from the second CSV file will be added after all the rows from
the first CSV are added, and so on. However, when this option is enabled, the
rows from the second and subsequent CSV files will be combined with the rows of
the first, row by row (overlaid). See the utility's example for an illustration.

See also the `--overwrite-fields` option.

MARKDOWN
        },
        overwrite_fields => {
            summary => 'Whether fields from subsequent CSV files should overwrite existing fields from previous CSV files',
            schema => 'bool*',
            description => <<'MARKDOWN',

When in overlay mode (`--overlay`), by default the value for a field is
retrieved from the first CSV file that has the field. With `--overwrite-fields`
option enabled, the value will be retrieved from the last CSV that has the
field. See the utility's example for an illustration.

MARKDOWN
        },
    },
    tags => ['category:combining', 'join', 'merge'],

    reads_multiple_csv => 1,

lib/App/CSVUtils/csv_concat.pm

                my $field = $input_fields->[$j];
                unless (grep {$field eq $_} @{ $r->{output_fields} }) {
                    push @{ $r->{output_fields} }, $field;
                    $r->{output_fields_idx}{$field} = $#{ $r->{output_fields} };
                }
            }
        }

        my $csv = $r->{input_parser};

        if ($r->{util_args}{overlay}) {

            my $overwrite_fields = $r->{util_args}{overwrite_fields};
            my $output_fields_idx = $r->{output_fields_idx};
            while (1) {
                my $has_not_eof;
                my $combined_row = [("") x @{ $r->{output_fields} }];
                my %seen_fields;
                for my $i (0 .. $#{ $r->{all_input_fh} }) {
                    my $fh = $r->{all_input_fh}[$i];

lib/App/CSVUtils/csv_concat.pm


 col1,col2,col4,col3
 1,2,
 3,4,
 ,a,b
 ,c,d
 ,e,f
 ,,,X
 ,,,Y

When the C<--overlay> option is enabled, the result will be:

 col1,col2,col4,col3
 1,2,b,X
 3,4,d,Y
 ,e,f,

When both the C<--overlay> and C<--overwrite-fields> options are enabled, the
result will be:

 col1,col2,col4,col3
 1,a,b,X
 3,c,d,Y
 ,e,f,

Keywords: join, merge, overlay

This function is not exported.

Arguments ('*' denotes required arguments):

=over 4

=item * B<inplace> => I<true>

Output to the same file as input.

lib/App/CSVUtils/csv_concat.pm

=item * B<output_tsv> => I<bool>

Inform that output file is TSV (tab-separated) format instead of CSV.

This is like C<--input-tsv> option but for output instead of input.

Overridden by the C<--output-sep-char>, C<--output-quote-char>, and
C<--output-escape-char> options. If one of those options is specified, then
C<--output-tsv> will be ignored.

=item * B<overlay> => I<bool>

Whether to overlay rows from second and subsequent CSV files to the first.

By default, rows from the second CSV file will be added after all the rows from
the first CSV are added, and so on. However, when this option is enabled, the
rows from the second and subsequent CSV files will be combined with the rows of
the first, row by row (overlaid). See the utility's example for an illustration.

See also the C<--overwrite-fields> option.

=item * B<overwrite> => I<bool>

Whether to overwrite an existing output file.

=item * B<overwrite_fields> => I<bool>

Whether fields from subsequent CSV files should overwrite existing fields from previous CSV files.

When in overlay mode (C<--overlay>), by default the value for a field is
retrieved from the first CSV file that has the field. With C<--overwrite-fields>
option enabled, the value will be retrieved from the last CSV that has the
field. See the utility's example for an illustration.


=back

Returns an enveloped result (an array).

First element ($status_code) is an integer containing HTTP-like status code

script/csv-concat

=head1 VERSION

This document describes version 1.036 of csv-concat (from Perl distribution App-CSVUtils), released on 2025-02-04.

=head1 SYNOPSIS

B<csv-concat> B<L<--help|/"--help, -h, -?">> (or B<L<-h|/"--help, -h, -?">>, B<L<-?|/"--help, -h, -?">>)

B<csv-concat> B<L<--version|/"--version, -v">> (or B<L<-v|/"--version, -v">>)

B<csv-concat> [B<L<--debug|/"--debug">>|B<L<--log-level|/"--log-level=s">>=I<level>|B<L<--quiet|/"--quiet">>|B<L<--trace|/"--trace">>|B<L<--verbose|/"--verbose">>] [B<L<--format|/"--format=s">>=I<name>|B<L<--json|/"--json">>] [B<L<--inplace|/"--inpla...

=head1 DESCRIPTION

Example, concatenating this CSV:

 col1,col2
 1,2
 3,4

and:

script/csv-concat


 col1,col2,col4,col3
 1,2,
 3,4,
 ,a,b
 ,c,d
 ,e,f
 ,,,X
 ,,,Y

When the C<--overlay> option is enabled, the result will be:

 col1,col2,col4,col3
 1,2,b,X
 3,4,d,Y
 ,e,f,

When both the C<--overlay> and C<--overwrite-fields> options are enabled, the
result will be:

 col1,col2,col4,col3
 1,a,b,X
 3,c,d,Y
 ,e,f,

Keywords: join, merge, overlay
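
The overlay behavior illustrated above can be approximated with a small standalone sketch. This is not the code used by csv-concat: the helper C<overlay_tables> and its table structure are hypothetical, the inputs are already-parsed rows rather than CSV files, and the contents of the second and third inputs are inferred from the concatenated result shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of App::CSVUtils): overlay already-parsed
# tables. Each table is { fields => [...], rows => [[...], ...] }.
sub overlay_tables {
    my ($tables, %opts) = @_;

    # Output fields: union of all input fields, in order of first appearance.
    my (@out_fields, %idx);
    for my $t (@$tables) {
        for my $f (@{ $t->{fields} }) {
            next if exists $idx{$f};
            push @out_fields, $f;
            $idx{$f} = $#out_fields;
        }
    }

    my @out_rows;
    for (my $n = 0; ; $n++) {
        my $has_row  = 0;
        my @combined = ("") x @out_fields;
        my %seen;
        for my $t (@$tables) {
            my $row = $t->{rows}[$n] or next;    # this table is exhausted
            $has_row = 1;
            for my $j (0 .. $#{ $t->{fields} }) {
                my $f = $t->{fields}[$j];
                # The first table that has the field wins, unless
                # overwrite_fields is set, in which case the last one wins.
                next if $seen{$f}++ && !$opts{overwrite_fields};
                $combined[ $idx{$f} ] = $row->[$j] // "";
            }
        }
        last unless $has_row;                    # every table is exhausted
        push @out_rows, \@combined;
    }
    return { fields => \@out_fields, rows => \@out_rows };
}

# Assumed inputs, inferred from the concatenated result in the example.
my @tables = (
    { fields => [qw(col1 col2)], rows => [[1, 2], [3, 4]] },
    { fields => [qw(col2 col4)], rows => [[qw(a b)], [qw(c d)], [qw(e f)]] },
    { fields => [qw(col3)],      rows => [["X"], ["Y"]] },
);

for my $overwrite (0, 1) {
    my $res = overlay_tables(\@tables, overwrite_fields => $overwrite);
    print join(",", @{ $res->{fields} }), "\n";
    print join(",", @$_), "\n" for @{ $res->{rows} };
    print "\n";
}
```

Run as is, the sketch prints the two result tables shown above, first without and then with the overwrite behavior. Tracking seen fields per output row is what keeps C<e> in the third row: once the first input is exhausted, the second input becomes the first one that has C<col2>.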

=head1 OPTIONS

C<*> marks required options.

=head2 Main options

=over

=item B<--input-skip-num-lines>=I<s>

script/csv-concat

This is an alternative to the C<--input-skip-num-lines> option and can be useful
if you have CSV files (usually generated reports, sometimes converted from
spreadsheets) that have additional header lines or info before the CSV header
row.

With C<--input-skip-num-lines>, you skip a fixed number of lines. With this
option, rows will be skipped until the first field matches the specified regex
pattern.


=item B<--overlay>

Whether to overlay rows from second and subsequent CSV files to the first.

By default, rows from the second CSV file will be added after all the rows from
the first CSV are added, and so on. However, when this option is enabled, the
rows from the second and subsequent CSV files will be combined with the rows of
the first, row by row (overlaid). See the utility's example for an illustration.

See also the C<--overwrite-fields> option.


=item B<--overwrite-fields>

Whether fields from subsequent CSV files should overwrite existing fields from previous CSV files.

When in overlay mode (C<--overlay>), by default the value for a field is
retrieved from the first CSV file that has the field. With C<--overwrite-fields>
option enabled, the value will be retrieved from the last CSV that has the
field. See the utility's example for an illustration.


=back
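
The pattern-based skipping described in the list above (skip rows until the first field matches a regex, as opposed to skipping a fixed number of lines) can be sketched as follows. The helper C<skip_until_pattern> is hypothetical and works on already-parsed rows. It also assumes that the matching row itself is kept as the CSV header row, which the description implies but does not state outright.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of App::CSVUtils): drop leading rows until
# the first field of a row matches $re. The matching row is kept, on the
# assumption that it is the real CSV header row.
sub skip_until_pattern {
    my ($rows, $re) = @_;
    my @rows = @$rows;    # shallow copy so the caller's array is untouched
    shift @rows while @rows && ($rows[0][0] // "") !~ $re;
    return \@rows;
}

# A made-up report with two preamble lines before the header row.
my @report = (
    ["Monthly report"],
    ["Generated by some spreadsheet export"],
    ["id", "name"],
    [1, "foo"],
    [2, "bar"],
);

my $kept = skip_until_pattern(\@report, qr/^id$/);
print join(",", @$_), "\n" for @$kept;
```

A fixed line count only works when the preamble length is known in advance, whereas the pattern approach tolerates reports whose preamble varies in length.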

=head2 Input options

=over

t/01-basic.t

    $res = App::CSVUtils::csv_convert_to_hash::csv_convert_to_hash(input_filename=>"$dir/1.csv", rownum=>10);
    is_deeply($res, [200,"OK",{f1=>undef, f2=>undef, f3=>undef}], "result 3");
};

subtest csv_concat => sub {
    my ($res, $stdout);

    require App::CSVUtils::csv_concat;
    $stdout = capture_stdout { $res = App::CSVUtils::csv_concat::csv_concat(input_filenames=>["$dir/1.csv","$dir/2.csv","$dir/4.csv"]) };
    is($stdout, qq(f1,f2,f3,F3\n1,2,3,\n4,5,6,\n7,8,9,\n1,,,\n2,,,\n3,,,\n1,3,,2\n4,6,,5\n), "output");
    $stdout = capture_stdout { $res = App::CSVUtils::csv_concat::csv_concat(input_filenames=>["$dir/1.csv","$dir/2.csv","$dir/4.csv"], overlay=>1) };
    is($stdout, qq(f1,f2,f3,F3\n1,2,3,2\n4,5,6,5\n7,8,9,\n), "output");
    $stdout = capture_stdout { $res = App::CSVUtils::csv_concat::csv_concat(input_filenames=>["$dir/1.csv","$dir/2.csv","$dir/4.csv"], overlay=>1, overwrite_fields=>1) };
    is($stdout, qq(f1,f2,f3,F3\n1,3,3,2\n4,6,6,5\n3,8,9,\n), "output");
};

subtest csv_select_fields => sub {
    my ($res, $stdout);

    require App::CSVUtils::csv_select_fields;

    $res = App::CSVUtils::csv_select_fields::csv_select_fields(input_filename=>"$dir/1.csv", include_fields=>["f1", "f4"]);
    is($res->[0], 400, "specifying unknown field -> error");


