
Directory-Transactional


lib/Directory/Transactional.pm view on Meta::CPAN

	my %args = @args;

	my ( $coderef, $commit, $rollback, $code_args ) = @args{qw(body commit rollback args)};

	ref $coderef eq 'CODE' or croak '$coderef must be a CODE reference';

	$code_args ||= [];

	$self->txn_begin;

	my @result;

	my $wantarray = wantarray; # gotta capture, eval { } has its own

	my ( $success, $err ) = do {
		local $@;

		my $success = eval {
			if ( $wantarray ) {
				@result = $coderef->(@$code_args);
			} elsif( defined $wantarray ) {
				$result[0] = $coderef->(@$code_args);
			} else {
				$coderef->(@$code_args);
			}

			$commit && $commit->();
			$self->txn_commit;

			1;
		};

		( $success, $@ );
	};

	if ( $success ) {
		return wantarray ? @result : $result[0];
	} else {
		my $rollback_exception = do {
			local $@;
			eval { $self->txn_rollback; $rollback && $rollback->() };
			$@;
		};

		if ($rollback_exception) {
			croak "Transaction aborted: $err, rollback failed: $rollback_exception";
		}

		die $err;
	}
}

sub txn_begin {
	my ( $self, @args ) = @_;

	my $txn;

	if ( my $p = $self->_txn ) {
		# this is a child transaction

		croak "Can't txn_begin if an auto transaction is still alive" if $p->auto_handle;

		$txn = Directory::Transactional::TXN::Nested->new(
			parent  => $p,
			manager => $self,
		);
	} else {
		# this is a top level transaction
		$txn = Directory::Transactional::TXN::Root->new(
			@args,
			manager => $self,
			( $self->global_lock ? (
				# when global_lock is set, take an exclusive lock on the root dir
				# non global lockers take a shared lock on it
				global_lock => $self->_get_flock( File::Spec->catfile( $self->_locks, ".lock" ), LOCK_EX)
			) : () ),
		);
	}

	$self->_txn($txn);

	return;
}

sub _pop_txn {
	my $self = shift;

	my $txn = $self->_txn or croak "No active transaction";

	if ( $txn->isa("Directory::Transactional::TXN::Nested") ) {
		$self->_txn( $txn->parent );
	} else {
		$self->_clear_txn;
	}

	return $txn;
}

sub txn_commit {
	my $self = shift;

	my $txn = $self->_txn;

	my $changed = $txn->changed;

	if ( $changed->size ) {
		if ( $txn->isa("Directory::Transactional::TXN::Root") ) {
			# commit the work, backing up in the backup dir

			# first take a lock on the backup dir
			# this is used to detect crashed transactions
			# if the dir exists but isn't locked then the transaction crashed
			my $txn_lockfile = $txn->backup . ".lock";
			my $txn_lock = $self->_get_lock( $txn_lockfile, LOCK_EX );

			{
				# during a commit the work dir is considered dirty
				# this flag is set until check_dirty clears it
				my $dirty_lock = $self->set_dirty;

				$txn->create_backup_dir;

lib/Directory/Transactional.pm view on Meta::CPAN


Conversely, under C<flock> mode C<global_lock> B<is> compatible with fine
grained locking.

=back

=head1 ACID GUARANTEES

ACID stands for atomicity, consistency, isolation and durability.

Transactions are atomic (using locks), consistent (a recovery mode is able to
restore the state of the directory if a process crashed while committing a
transaction), isolated (each transaction works in its own temporary directory),
and durable (once C<txn_commit> returns, a software crash will not cause the
transaction to roll back).

=head1 TRANSACTIONAL PROTOCOL

This section describes the way the ACID guarantees are met:

When the object is being constructed a nonblocking attempt to get an exclusive
lock on the global shared lock file using L<File::NFSLock> or C<flock> is made.

If this lock is acquired, this object is the only active instance, and no
other instance can access the directory for now.

The work directory's state is inspected, any partially committed transactions
are rolled back, and all work files are cleaned up, producing a consistent
state.

At this point the exclusive lock is dropped, and a shared lock on the same file
is taken, which will be retained for the lifetime of the object.
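
The startup sequence described above can be sketched with plain C<flock>. This is only an illustration, not the module's implementation: C<open_with_recovery> and the C<$recover> callback are hypothetical names, and the sketch assumes a platform where C<flock> locks belong to the open filehandle (as on Linux), so that two handles in one process conflict with each other.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: runs $recover->() only if the exclusive lock can be
# taken without blocking, then returns a handle holding a shared lock.
sub open_with_recovery {
    my ( $lock_path, $recover ) = @_;

    open my $fh, '>>', $lock_path or die "open $lock_path: $!";

    # Nonblocking attempt: succeeds only if nobody else holds any lock.
    if ( flock $fh, LOCK_EX | LOCK_NB ) {
        $recover->();    # sole instance, so cleanup is safe
    }

    # Convert to (or wait for) a shared lock kept for the handle's lifetime.
    flock $fh, LOCK_SH or die "flock $lock_path: $!";

    return $fh;
}

my $dir       = tempdir( CLEANUP => 1 );
my $lock_path = File::Spec->catfile( $dir, '.lock' );

my $recovered = 0;
my $first  = open_with_recovery( $lock_path, sub { $recovered++ } );
my $second = open_with_recovery( $lock_path, sub { $recovered++ } );

print "recovery ran $recovered time(s)\n";
```

The second caller cannot get the exclusive lock while the first holds its shared lock, so it skips recovery and goes straight to the shared lock. Note that converting a C<flock> lock is not guaranteed to be atomic, so another process may slip in between the two calls.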

Each transaction (root or nested) gets its own work directory, which is an
overlay of its parent.

All write operations are performed in the work directory, while read operations
walk up the tree.
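
A minimal sketch of such an overlay lookup, assuming each layer is just a directory and ignoring deletions (which a real overlay would have to track). The helper names C<resolve_for_read> and C<write_file> are made up for this example and are not part of the module.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: @layers is ordered innermost work dir first, root last.
# Returns the first existing candidate, or nothing if no layer has the file.
sub resolve_for_read {
    my ( $rel_path, @layers ) = @_;
    for my $layer (@layers) {
        my $candidate = File::Spec->catfile( $layer, $rel_path );
        return $candidate if -e $candidate;
    }
    return;
}

sub write_file {
    my ( $path, $content ) = @_;
    open my $fh, '>', $path or die "open $path: $!";
    print {$fh} $content;
    close $fh or die "close $path: $!";
}

my $root  = tempdir( CLEANUP => 1 );    # shared state
my $outer = tempdir( CLEANUP => 1 );    # root transaction's work dir
my $inner = tempdir( CLEANUP => 1 );    # nested transaction's work dir

write_file( File::Spec->catfile( $root,  'a.txt' ), "root\n" );
write_file( File::Spec->catfile( $root,  'b.txt' ), "root\n" );
write_file( File::Spec->catfile( $outer, 'a.txt' ), "outer\n" );  # shadows root

my $found_a = resolve_for_read( 'a.txt', $inner, $outer, $root );
my $found_b = resolve_for_read( 'b.txt', $inner, $outer, $root );
my $found_c = resolve_for_read( 'c.txt', $inner, $outer, $root );

print "a.txt read from $found_a\n";
print "b.txt read from $found_b\n";
```

Writes would always target the innermost layer, which is why aborting only needs to delete that one directory.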

Aborting a transaction consists of simply removing its work directory.

Committing a nested transaction involves overwriting its parent's work directory
with all the changes in the child transaction's work directory.

Committing a root transaction to the root directory involves moving aside every
file from the root to a backup directory, then applying the changes in the work
directory to the root, renaming the backup directory to a work directory, and
then cleaning up the work directory and the renamed backup directory.

If at any point in the root transaction commit work is interrupted, the backup
directory acts like a journal entry. Recovery will roll back this transaction by
restoring all the renamed backup files. Moving the backup directory into the
work directory signifies that the transaction has committed successfully, and
recovery will clean these files up normally.
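
The journal-like role of the backup directory can be sketched as follows. This is a simplified illustration under stated assumptions, not the module's code: it handles only flat files that already exist in the root, and C<commit_files>, C<recover_from_backup> and the C<$crash_after> parameter are hypothetical names used to simulate an interrupted commit.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Spec;
use File::Temp qw(tempdir);

sub spew {
    my ( $path, $content ) = @_;
    open my $fh, '>', $path or die "open $path: $!";
    print {$fh} $content;
    close $fh or die "close $path: $!";
}

sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "open $path: $!";
    local $/;
    return scalar <$fh>;
}

sub commit_files {
    my ( $root, $work, $backup, $files, $crash_after ) = @_;

    # Phase 1: move the live version of every changed file aside.
    for my $name (@$files) {
        my $live = File::Spec->catfile( $root, $name );
        next unless -e $live;
        move( $live, File::Spec->catfile( $backup, $name ) )
            or die "backup $name: $!";
    }

    # Phase 2: install the new versions from the work directory.
    my $applied = 0;
    for my $name (@$files) {
        die "simulated crash\n"
            if defined $crash_after && $applied++ >= $crash_after;
        move( File::Spec->catfile( $work, $name ),
            File::Spec->catfile( $root, $name ) )
            or die "apply $name: $!";
    }
    return;
}

# Recovery: restore everything that was moved aside.
sub recover_from_backup {
    my ( $root, $backup ) = @_;
    opendir my $dh, $backup or die "opendir $backup: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    for my $name (@names) {
        move( File::Spec->catfile( $backup, $name ),
            File::Spec->catfile( $root, $name ) )
            or die "restore $name: $!";
    }
    return;
}

my ( $root, $work, $backup ) = map { tempdir( CLEANUP => 1 ) } 1 .. 3;
for my $name (qw(a.txt b.txt)) {
    spew( File::Spec->catfile( $root, $name ), "old $name" );
    spew( File::Spec->catfile( $work, $name ), "new $name" );
}

# Interrupt the commit after one file was installed, then recover.
eval { commit_files( $root, $work, $backup, [qw(a.txt b.txt)], 1 ); 1 }
    or print "commit interrupted: $@";
recover_from_backup( $root, $backup );
```

After recovery both files hold their old contents again, even though one new version had already been installed when the commit was interrupted.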

If C<crash_detection> is enabled (the default), then when reading any file from
the root directory (shared global state) the system will first check for
crashed commits.

Crashed commits are detected by means of lock files. If the backup directory is
locked that means its committing process is still alive, but if a directory
exists without a lock then that process has crashed. A global dirty flag is
maintained to avoid needing to check all the backup directories each time.

If the commit is still running then it can be assumed that the process
committing it still has all of its exclusive locks so reading from the root
directory is safe.
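
The lock-file test for crashed commits might look roughly like this. It is a sketch only: C<backup_state> and the directory name C<txn_backup> are hypothetical, and it assumes C<flock> semantics where a second handle on the same file conflicts with the first (as on Linux). The C<.lock> suffix follows the naming used by C<txn_commit>.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper. Returns 'none' if there is no backup directory,
# 'running' if its lock is held, and 'crashed' if the lock is free.
sub backup_state {
    my ($backup_dir) = @_;
    return 'none' unless -d $backup_dir;

    open my $fh, '>>', "$backup_dir.lock" or die "open $backup_dir.lock: $!";

    # If the lock can be taken, no live process is committing this backup.
    # The probe lock is released when $fh goes out of scope.
    return flock( $fh, LOCK_EX | LOCK_NB ) ? 'crashed' : 'running';
}

my $dir    = tempdir( CLEANUP => 1 );
my $backup = File::Spec->catdir( $dir, 'txn_backup' );

my $before = backup_state($backup);

# Simulate a committing process: the directory exists and its lock is held.
mkdir $backup or die "mkdir $backup: $!";
open my $owner, '>>', "$backup.lock" or die "open $backup.lock: $!";
flock $owner, LOCK_EX or die "flock: $!";
my $while_running = backup_state($backup);

# Simulate the crash: the lock disappears but the directory stays behind.
close $owner;
my $after_crash = backup_state($backup);

print "$before, $while_running, $after_crash\n";
```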

=head1 DEADLOCKS

This module does not implement deadlock detection. Unfortunately maintaining a
lock table is a delicate and difficult task, so I doubt I will ever implement
it.

The good news is that certain operating systems (like HP-UX) may implement
deadlock detection in the kernel, and return C<EDEADLK> instead of just
blocking forever.

If you are not so lucky, specify a C<timeout> or make sure you always take
locks in the same order.
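
One way to always take locks in the same order is to sort the paths before locking them. The sketch below assumes plain C<flock> on ordinary files; C<lock_in_order> is a hypothetical helper, not a method of this module.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: takes exclusive locks in sorted path order and returns
# [ $path, $handle ] pairs. The locks are held until the handles are released.
sub lock_in_order {
    my (@paths) = @_;
    my @held;
    for my $path ( sort @paths ) {
        open my $fh, '>>', $path or die "open $path: $!";
        flock $fh, LOCK_EX or die "flock $path: $!";
        push @held, [ $path, $fh ];
    }
    return @held;
}

my $dir = tempdir( CLEANUP => 1 );
my $x   = File::Spec->catfile( $dir, 'x.lock' );
my $y   = File::Spec->catfile( $dir, 'y.lock' );

my ( @order_a, @order_b );
{
    my @held = lock_in_order( $y, $x );    # caller asks for y first
    @order_a = map { $_->[0] } @held;
}    # handles go out of scope here, releasing the locks
{
    my @held = lock_in_order( $x, $y );    # caller asks for x first
    @order_b = map { $_->[0] } @held;
}

print "first caller:  @order_a\n";
print "second caller: @order_b\n";
```

Because both callers end up acquiring C<x.lock> before C<y.lock>, neither can hold one lock while waiting for the other in the opposite order.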

The C<global_lock> flag can also be used to prevent deadlocks entirely, at the
cost of concurrency. This provides fully serializable transaction isolation
with no possibility of serialization failures due to deadlocks.

There is no pessimistic locking mode (read-modify-write optimized) since all
paths leading to a file are locked for reading. This mode, if implemented,
would be semantically identical to C<global_lock> but far less efficient.

In the future C<fcntl> based locking may be implemented in addition to
C<flock>. C<EDEADLK> seems to be more widely supported when using C<fcntl>.

=head1 LIMITATIONS

=head2 Auto-Commit

If you perform any operation outside of a transaction and C<auto_commit> is
enabled a transaction will be created for you.

For operations like C<rename> or C<readdir> which do not return a resource the
transaction is committed immediately.

Operations like C<open> or C<file_stream> on the other hand create a
transaction that will be alive as long as the return value is alive.

This means that you should not leak filehandles when relying on autocommit.

Opening a new transaction when an automatic one is already opened is an error.

Note that this resource tracking comes with an overhead, especially on Perl
5.8, so even if you are only performing read operations it is recommended that
you operate within the scope of a real transaction.
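
The idea of a transaction that lives as long as a returned value can be illustrated with a small scope guard. This is a generic Perl idiom shown under the assumption that it resembles the behaviour described above; C<Local::Guard> is a made-up class and says nothing about how the module tracks resources internally.

```perl
use strict;
use warnings;

{
    # Hypothetical guard: runs a callback when the last reference goes away.
    package Local::Guard;

    sub new {
        my ( $class, $on_release ) = @_;
        return bless { on_release => $on_release }, $class;
    }

    sub DESTROY {
        my $self = shift;
        $self->{on_release}->();
    }
}

my @events;

{
    my $guard = Local::Guard->new( sub { push @events, 'commit' } );
    push @events, 'working';
}    # $guard is released here, so the "commit" happens now
push @events, 'after';

# A leaked resource keeps its "transaction" open until it is finally released.
my $leaked = Local::Guard->new( sub { push @events, 'late commit' } );
push @events, 'still open';
undef $leaked;

print join( ', ', @events ), "\n";
```

The second half shows why leaking a filehandle matters: the commit is postponed until the last reference disappears.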

=head2 Open Filehandles

One filehandle is required for every lock when using fine grained locking.

For large transactions it is recommended that you set C<global_lock>, which is
like taking an exclusive lock on the root directory.

C<global_lock> also performs better, but causes long wait times if multiple
processes are accessing the same database but not the same data. For web
applications C<global_lock> should probably be off for better concurrency.

=head1 ATTRIBUTES

=over 4

=item root

This is the managed directory in which transactional semantics will be maintained.

This can be either a string path or a L<Path::Class::Dir>.

=item _work

This attribute is named with a leading underscore to prevent thoughtless
modification (if you have two workers accessing the same directory
simultaneously but the work dir is different they will conflict and not even
know it).

The default work directory is placed under root, and is named C<.txn_work_dir>.

The work dir's parent must be writable, because a lock file needs to be created
next to it (the workdir name with C<.lock> appended).

=item nfs

If true (defaults to false), L<File::NFSLock> will be used for all locks
instead of C<flock>.

Note that on my machine the stress test reliably B<FAILS> with
L<File::NFSLock>, due to a race condition (exclusive write lock granted to two
writers simultaneously), even on a local filesystem. If you specify the C<nfs>
flag make sure your C<link> system call is truly atomic.

=item global_lock

If true, then instead of using fine grained locking a global write lock is
obtained on the first call to C<txn_begin> and will be kept for as long as
there is a running transaction.

This is useful for avoiding deadlocks (there is no deadlock detection code in
the fine grained locking).

This flag is automatically set if C<nfs> is set.

=item timeout

If set, this is used as the time limit for blocking calls to lock.

If you are experiencing deadlocks it is recommended to set this or
C<global_lock>.

=item auto_commit

If true (the default) any operation not performed within a transaction will
cause a transaction to be automatically created and committed.

Transactions automatically created for operations which return things like
filehandles will stay alive for as long as the returned resource does.

=item crash_detection

If true (the default), all read operations accessing global state (the root
directory) will first ensure that the global directory is not dirty.

If the perl process crashes while committing the transaction but other
concurrent processes are still alive, the directory is left in an inconsistent
state, but all the locks are dropped. When C<crash_detection> is enabled ACID
semantics are still guaranteed, at the cost of locking and stating a file for
each read operation on the global directory.

If you disable this then you are only protected from system crashes (recovery
will be run on the next instantiation of L<Directory::Transactional>) or soft
crashes where the crashing process has a chance to run all its destructors
properly.

=back

=head1 METHODS

=head2 Transaction Management

=over 4

=item txn_do $code, %callbacks

Executes C<$code> within a transaction in an C<eval> block.

If any error is thrown the transaction will be rolled back. Otherwise the
transaction is committed.

C<%callbacks> can contain entries for C<commit> and C<rollback>, which are
called when the appropriate action is taken, and an C<args> array reference
whose elements are passed to C<$code>.

=item txn_begin

Begin a new transaction. Can be called even if there is already a running
transaction (nested transactions are supported).

=item txn_commit

Commit the current transaction. If it is a nested transaction, it will commit
to the parent transaction's work directory.

=item txn_rollback

Discard the current transaction, throwing away all changes since the last call
to C<txn_begin>.

=back
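
The calling convention of C<txn_do> can be sketched without the module itself. The stand-in below is hypothetical (C<with_callbacks> is not a real method and no directory is touched); it only shows the pattern of capturing C<wantarray> before entering C<eval>, running the C<commit> or C<rollback> callback, and rethrowing the original error.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a txn_do-style wrapper.
sub with_callbacks {
    my ( $body, %cb ) = @_;

    my $wantarray = wantarray;    # capture first: eval { } has its own context
    my @args      = @{ $cb{args} || [] };
    my @result;

    my $ok = eval {
        if ($wantarray) {
            @result = $body->(@args);
        }
        elsif ( defined $wantarray ) {
            $result[0] = $body->(@args);
        }
        else {
            $body->(@args);
        }
        $cb{commit}->() if $cb{commit};
        1;
    };

    unless ($ok) {
        my $err = $@;             # save before the callback can clobber $@
        $cb{rollback}->() if $cb{rollback};
        die $err;
    }

    return $wantarray ? @result : $result[0];
}

my @log;

# List context is passed through to the body.
my @list = with_callbacks(
    sub { return map { $_ * 2 } @_ },
    args   => [ 1, 2, 3 ],
    commit => sub { push @log, 'commit' },
);

# Scalar context is passed through as well.
my $context = with_callbacks( sub { wantarray ? 'list' : 'scalar' } );

# A failing body triggers the rollback callback and the error is rethrown.
my $failed = !eval {
    with_callbacks(
        sub { die "boom\n" },
        rollback => sub { push @log, 'rollback' },
    );
    1;
};
my $error = $@;

print "list: @list, context: $context, log: @log\n";
```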

=head2 Lock Management

=over 4

=item lock_path_read $path, $no_parent

=item lock_path_write $path, $no_parent

Lock the resource at C<$path> for reading or writing.

By default the ancestors of C<$path> will be locked for reading too (from
outermost to innermost).

The only way to unlock a resource is by committing the root transaction, or
aborting the transaction in which the resource was locked.
