streaming results from the CPAN

Pg-Reindex

view release on metacpan or search on metacpan


C<Pg::Reindex> builds new indexes using C<CREATE INDEX CONCURRENTLY>. Then it
starts a transaction for each index in which it drops the old index and
renames the new one.

It handles normal indexes and C<PRIMARY KEY>, C<FOREIGN KEY> and C<UNIQUE>
constraints.

=head2 Streaming replication and throttling

Before creating the next index, the streaming replication lag is checked to
be below a certain limit. If so, nothing special happens and the index is
built.

Otherwise, C<rebuild> waits for the replicas to catch up. When the lag
drops under a second limit, the C<rebuild> does not immediately continue.
Instead it waits for another 30 seconds and checks the lag every second
within that period. Only if the lag stays below the limit for the whole
time, execution is continued. This grace period is to deal with the fact
that a wal sender process may suddenly disappear and reappear after a
few seconds. Without the grace period the program may encounter a false

README.pod view on Meta::CPAN


validate C<FOREIGN KEY> constraints or leave them C<NOT VALID>. Default
it to validate.

=item --[no]dryrun

don't modify the database but print the essential SQL statements.

=item --high-txn-lag

the upper limit streaming replicas may lag behind in bytes.

Default is 10,000,000.

=item --low-txn-lag

the lower limit in bytes when execution may be continued after it has been
interrupted due to exceeding C<high_txn_lag>.

Default is 100,000

lib/Pg/Reindex.pm view on Meta::CPAN

sub throttle {
    return if $opt_dryrun;
    state $q = $dbh->prepare( <<'SQL', { pg_async => PG_ASYNC } );
SELECT coalesce(max(pg_xlog_location_diff(pg_current_xlog_location(), r.flush_location)), 0)
  FROM pg_stat_replication r
SQL

    my ($xlog_diff) = @{ query( '', $q )->[0] };

    if ( $xlog_diff > $opt_throttle_on ) {
        lg "streaming lag = $xlog_diff ==> pausing\n";
    LOOP: {
            do {
                select undef, undef, undef, 1;    ## no critic
                ($xlog_diff) = @{ query( '', $q )->[0] };
            } while ( $xlog_diff > $opt_throttle_off );

            # sleep for another 30 sec and check every second the lag.
            # sometimes the wal sender process disconnects and reconnects
            # a moment later. In that case we may have fallen below the
            # throttle limit simply because we checked at the wrong time.
            for ( my $i = 0; $i < 30; $i++ ) {
                select undef, undef, undef, 1;    ## no critic
                ($xlog_diff) = @{ query( '', $q )->[0] };
                redo LOOP if $xlog_diff > $opt_throttle_off;
            }
        }
        lg "streaming lag = $xlog_diff -- continuing\n";
    }
    return;
}

sub next_index {
    my @list = query '', <<'SQL';
WITH wl AS (
    UPDATE reindex.worklist
       SET status='in progress'
     WHERE idx=(SELECT idx

lib/Pg/Reindex.pm view on Meta::CPAN

        die $err;
    };
    return;
}

sub reindex {
    my ( $oid, $nspname, $quoted_nspname, $idxname, $quoted_idxname, $idxdef,
        $size, $opt_validate )
        = @_;

    throttle;    # wait for streaming replicas to catch up

    lg "Rebuilding Index $quoted_nspname.$quoted_idxname\n";

    my @log_id;
    @log_id = query '', $oid, <<'SQL' unless $opt_dryrun;
INSERT INTO reindex.log(tstmp, nspname, tblname, idxname, sz_before, status)
SELECT now(), n.nspname, tc.relname, ic.relname, pg_catalog.pg_relation_size(i.indexrelid::regclass), 'started'
  FROM pg_catalog.pg_index i
  JOIN pg_catalog.pg_class ic ON i.indexrelid=ic.oid
  JOIN pg_catalog.pg_class tc ON i.indrelid=tc.oid

lib/Pg/Reindex.pm view on Meta::CPAN


C<Pg::Reindex> builds new indexes using C<CREATE INDEX CONCURRENTLY>. Then it
starts a transaction for each index in which it drops the old index and
renames the new one.

It handles normal indexes and C<PRIMARY KEY>, C<FOREIGN KEY> and C<UNIQUE>
constraints.

=head2 Streaming replication and throttling

Before creating the next index, the streaming replication lag is checked to
be below a certain limit. If so, nothing special happens and the index is
built.

Otherwise, C<rebuild> waits for the replicas to catch up. When the lag
drops under a second limit, the C<rebuild> does not immediately continue.
Instead it waits for another 30 seconds and checks the lag every second
within that period. Only if the lag stays below the limit for the whole
time, execution is continued. This grace period is to deal with the fact
that a wal sender process may suddenly disappear and reappear after a
few seconds. Without the grace period the program may encounter a false

lib/Pg/Reindex.pm view on Meta::CPAN


validate C<FOREIGN KEY> constraints or leave them C<NOT VALID>. Default
it to validate.

=item --[no]dryrun

don't modify the database but print the essential SQL statements.

=item --high-txn-lag

the upper limit streaming replicas may lag behind in bytes.

Default is 10,000,000.

=item --low-txn-lag

the lower limit in bytes when execution may be continued after it has been
interrupted due to exceeding C<high_txn_lag>.

Default is 100,000

( run in 0.265 second using v1.01-cache-2.11-cpan-4d50c553e7e )