DBIx-BatchChunker

 view release on metacpan or  search on metacpan

README  view on Meta::CPAN

    If either of the min/max statements don't return any ID data, this
    method will return 0.

 execute

        my $batch_chunker = DBIx::BatchChunker->new(
            # ...other attributes for calculate_ranges...
    
            dbi_connector => $conn,          # DBIx::Connector::Retry object
            stmt          => $do_stmt,       # INSERT/UPDATE/DELETE $stmt with BETWEEN placeholders
            ### OR ###
            dbi_connector => $conn,          # DBIx::Connector::Retry object
            stmt          => $select_stmt,   # SELECT $stmt with BETWEEN placeholders
            count_stmt    => $count_stmt,    # SELECT COUNT $stmt to be used for min_chunk_percent; optional
            coderef       => $coderef,       # called code that does the actual work
            ### OR ###
            rs      => $account_rs,          # base ResultSet, which gets filtered with -between later on
            id_name => 'account_id',         # can be looked up if not provided
            coderef => $coderef,             # called code that does the actual work
            ### OR ###
            coderef => $coderef,             # DIY database work; just pass the $start/$end IDs
    
            ### Optional but recommended ###
            sleep             => 0.25, # number of seconds to sleep each chunk; defaults to 0
            process_past_max  => 1,    # use this if processing the whole table
            single_rows       => 1,    # does $coderef get a single $row or the whole $chunk_rs / $stmt
            min_chunk_percent => 0.25, # minimum row count of chunk size percentage; defaults to 0.5 (or 50%)
            target_time       => 5,    # target runtime for dynamic chunk size scaling; default is 5 seconds
            max_runtime       => 12 * 60 * 60, # stop processing after 12 hours
    
            progress_name => 'Updating Accounts',  # easier than creating your own progress_bar
    
            ### Optional ###
            progress_bar     => $progress,  # defaults to "Processing $source_name" bar
            verbose          => 1,          # displays timing stats on each chunk
        );
    
        $batch_chunker->execute if $batch_chunker->calculate_ranges;

    Applies the configured DB changes in chunks. Runs through the loop,
    processing a statement handle, ResultSet, and/or coderef as it goes.
    Each loop iteration processes a chunk of work, determined by
    "chunk_size".

    The "calculate_ranges" method should be run first to fill in "min_id"
    and "max_id". If either of these are missing, the function will assume
    "calculate_ranges" couldn't find them and warn about it.

    More details can be found in the "Processing Modes" and "ATTRIBUTES"
    sections.

PRIVATE METHODS

 _process_block

    Runs the DB work and passes it to the coderef. Its return value
    determines whether the block should be processed or not.

 _process_past_max_checker

    Checks to make sure the current endpoint is actually the end, by
    checking the database. Its return value determines whether the block
    should be processed or not.

    See "process_past_max".

 _chunk_count_checker

    Checks the chunk count to make sure it's properly sized. If not, it
    will try to shrink or expand the current chunk (in chunk_size
    increments) as necessary. Its return value determines whether the block
    should be processed or not.

    See "min_chunk_percent".

    This is not to be confused with the "_runtime_checker", which adjusts
    chunk_size after processing, based on previous run times.

 _runtime_checker

    Stores the previously processed chunk's runtime, and then adjusts
    chunk_size as necessary.

    See "target_time".

 _increment_progress

    Increments the progress bar.

 _print_chunk_status

    Prints out a standard chunk status line, if "verbose" is enabled. What
    it prints is generally uniform, but it depends on the processing
    action. Most of the data is pulled from "loop_state".

CAVEATS

 Big Number Support

    If the module detects that the ID numbers are no longer safe for
    standard Perl NV storage, it will automatically switch to using
    Math::BigInt and Math::BigFloat for big number support. If any blessed
    numbers are already being used to define the attributes, this will also
    switch on the support.

 String-based IDs

    If you're working with VARCHAR types or other string-based IDs to
    represent integers, these may be subject to whatever string-based
    comparison rules your RDBMS uses when calculating with MIN/MAX or using
    BETWEEN. Row counting and chunk size scaling will try to compensate,
    but will be mixing string-based comparisons from the RDBMS and
    Perl-based integer math.

    Using the CAST function may help, but it may also cause critical
    indexes to be ignored, especially if the function is used on the
    left-hand side against the column. Strings with the exact same length
    may be safe from comparison weirdness, but YMMV.

    Non-integer inputs from ID columns, such as GUIDs or other alphanumeric
    strings, are not currently supported. They would have to be converted



( run in 0.656 second using v1.01-cache-2.11-cpan-13bb782fe5a )