MARC-Errorchecks


lib/MARC/BBMARC.pm  view on Meta::CPAN


Prints an entire record in human-readable form, using as_formatted2().
This puts each field on a single line and uses @ (at) as subfield 
delimiter instead of _ (underscore).
Based on MARC::Record::as_formatted().
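
The layout described above can be sketched with plain Perl data instead of MARC::Record objects. This is only an illustration: the %record structure and the helpers format_field_line() and format_record() are hypothetical, and the single space between subfields is an assumption, since the exact spacing used by as_formatted2() is not shown here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a record: plain hashes, not MARC::Record objects.
my %record = (
    leader => '00000nam a2200000 a 4500',
    fields => [
        { tag => '001', data => 'ocm12345678' },
        { tag => '245', ind1 => '1', ind2 => '0',
          subfields => [ [ a => 'Title :' ], [ b => 'subtitle.' ] ] },
    ],
);

# Format one field on a single line, marking each subfield with '@'.
# Assumption: subfields are separated by one space; as_formatted2() may differ.
sub format_field_line {
    my ($field) = @_;
    # Control fields carry data only, with no indicators or subfields.
    return "$field->{tag}     $field->{data}" if exists $field->{data};
    my $subfields = join ' ',
        map { '@' . $_->[0] . $_->[1] } @{ $field->{subfields} };
    return sprintf '%s %s%s %s',
        $field->{tag}, $field->{ind1}, $field->{ind2}, $subfields;
}

# Leader line first, then one line per field, as recas_formatted() does.
sub format_record {
    my ($record) = @_;
    my @lines = ( 'LDR ' . ( $record->{leader} || '' ) );
    push @lines, map { format_field_line($_) } @{ $record->{fields} };
    return join "\n", @lines;
}

print format_record( \%record ), "\n";
```

Keeping each field on one line makes the output easy to search with line-oriented tools.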

=cut

sub recas_formatted {

	use MARC::Record;

	my $self = shift;
	    
	my @lines = ("LDR " . ($self->{_leader} || ""));
	for my $field (@{$self->{_fields}}) {
		push(@lines, $field->as_formatted2());
	}

	return join("\n", @lines);

} # recas_formatted



##########################
##########################
##########################

=head2 skipget()

Returns a raw MARC record string or undef.

=cut

sub skipget {

	use MARC::File;
	my $self = shift;
	$self->{recnum}++;

	my $rec = $self->_next();

	return $rec ? $rec : undef;

}

##########################
##########################
##########################

=head2 updated_record_array()

Creates an array of control numbers (001) from the input file, for use with the merge MARC script.
Call it to initialize the updated-record array before entering the merge loop.
Prompts for the updated record file if no file name (or path) is passed in.
Prints a running count of records via counting_print(). Works only with USMARC input files.
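
How the returned array might be used inside a merge loop can be sketched as follows. What the merge script actually does is not shown here, so split_by_update() and the sample control numbers are hypothetical; the literal list stands in for the return value of updated_record_array().

```perl
use strict;
use warnings;

# Hypothetical helper: decide, per original control number, whether an
# updated record exists. Not part of MARC::BBMARC.
sub split_by_update {
    my ( $updated, $originals ) = @_;
    my %is_updated = map { $_ => 1 } @$updated;    # hash for fast lookups
    my ( @replace, @keep );
    for my $controlno (@$originals) {
        if   ( $is_updated{$controlno} ) { push @replace, $controlno }
        else                             { push @keep,    $controlno }
    }
    return ( \@replace, \@keep );
}

# Stand-in for: my @updatedrecarray = MARC::BBMARC::updated_record_array($file);
my @updatedrecarray = ( 'ocm00000002', 'ocm00000005' );
my @originals       = ( 'ocm00000001', 'ocm00000002', 'ocm00000003' );

my ( $replace, $keep ) = split_by_update( \@updatedrecarray, \@originals );
print "replace: @$replace\n";
print "keep: @$keep\n";
```

Copying the array into a hash first avoids rescanning the whole array for every record in the loop.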

=cut

sub updated_record_array {

	use MARC::File::USMARC;
	my @updatedrecarray;
####################################
# To do: test abstracted input file call ##
####################################
	my $inputfile = shift;
	unless ($inputfile) {
		print ("What is the updated record file?:");
		$inputfile = <STDIN>; #read the answer from the terminal, not from files named in @ARGV
		chomp $inputfile;
		#remove double quotes inserted when drag-dropping from Windows
		$inputfile =~ s/^\"(.*)\"$/$1/;
	}
	#initialize $decodedfile as new usmarc file object
	my $decodedfile = MARC::File::USMARC->in( "$inputfile" );
	my $recordno = 0;

	while ( my $record = $decodedfile->next()) { 

		my $controlnumb = $record->field('001')->as_string();

		$updatedrecarray[$recordno] = $controlnumb;
		$recordno++;
		MARC::BBMARC::counting_print ($recordno);

	} #while

	return @updatedrecarray;

} #updated_record_array

##########################
##########################
##########################

=head2 read_controlnos()

Accepts a file name as its argument.
If nothing is passed, prompts for the file path/name.
Reads each line of the file and pushes it onto an array, @controlnumberarray, which is returned.
Each line in the file should contain only a control number.

Since it does not do anything to the line it reads, this subroutine can be used to read lines from a file and store them in an array.
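
The general read-lines-into-an-array pattern described above might look like this minimal sketch. read_lines() is a hypothetical name, not the module's code, and dropping the trailing newline with chomp is an assumption about what callers want.

```perl
use strict;
use warnings;

# Hypothetical illustration: read every line of a file into an array.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines = <$fh>;    # list context reads all remaining lines
    close $fh;
    chomp @lines;         # assumption: trailing newlines are not wanted
    return @lines;
}
```

A caller would write something like my @controlnumbers = read_lines($path); where $path names a file with one control number per line.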

To do: Modify existing scripts to clean control number, replacing spaces with underscores.
Regex-ify control number to be (3 char) - (8 digit) - (space). 

=cut

sub read_controlnos {

	#get passed-in filename
	my $controlinputfile = shift;
	unless ($controlinputfile) {
		print ("Where is the file of control numbers?\n");
		print ("\(enter blank line if none\): ");
		$controlinputfile = <STDIN>; #read the answer from the terminal, not from files named in @ARGV
		chomp $controlinputfile;
		#remove double quotes inserted when drag-dropping from Windows

lib/MARC/BBMARC.pm  view on Meta::CPAN

####

sub counting_print {
	my $modcount = shift;
	if ($modcount % MOD_INTERVAL == 0) {
		print "passing $modcount\n";
	} #if
} #counting_print

##########################
##########################
##########################

=head2 startstop_time()

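Call startstop_time() when a program starts
and again when it finishes, to see how long it takes to complete.
Returns the time on a 12-hour clock as hour:min:sec, followed by a.m. or p.m. and a newline.
Minutes and seconds below 10 are zero-padded; the hour is not (midnight is shown as 0).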
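
The 12-hour formatting can also be sketched with sprintf doing the zero-padding of minutes and seconds. format_clock() is a hypothetical helper that takes the time components as arguments so the result can be checked; like the sub below, it leaves hour 0 as 0, and unlike the sub it does not append a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: format a 24-hour time on a 12-hour clock.
sub format_clock {
    my ( $hour, $min, $sec ) = @_;
    my $dayornight = $hour >= 12 ? 'p.m.' : 'a.m.';
    $hour -= 12 if $hour > 12;    # 13..23 become 1..11; 12 and 0 are unchanged
    return sprintf '%d:%02d:%02d %s', $hour, $min, $sec, $dayornight;
}

# localtime in list context returns seconds, minutes, hours first.
my ( $sec, $min, $hour ) = localtime(time);
print format_clock( $hour, $min, $sec ), "\n";
```

The %02d conversions pad single-digit minutes and seconds with a leading zero.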

=cut

###################################

sub startstop_time {

	my $dayornight;
	my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
	$year+=1900;
	if ($hour >12) {$hour-=12; $dayornight="p.m.";}
	elsif ($hour == 12) {$dayornight="p.m.";}
	else {$dayornight="a.m.";}
	if ($min < 10) {$min = "0".$min;}
	if ($sec < 10) {$sec = "0".$sec;}
	return "$hour:$min:$sec $dayornight\n";
}

##########################
##########################
##########################

=head2 updated_record_hash()

Creates a hash of control numbers (001) and the associated raw MARC data from the input file, and returns a reference to it.
Use with the compare records script. Call it to initialize the updated-record hash before entering the loop.
Prompts for the updated record file if the name (or path) of one is not passed in.
Prints a running count of records via counting_print(). Works only with USMARC input files.

=head2 WARNING (on updated_record_hash)

This may be very memory intensive, as it stores the raw MARC for each record in the updated (first) file, keyed by its control number.
Reading in and then dereferencing 40,000+ records (43,815K on disk) takes approximately 102,192K or more of memory.
YOU HAVE BEEN WARNED!!!

=head2 TO DO (on updated_record_hash)

Reduce memory usage, probably by learning how to tie hash to file instead of storing everything in memory.
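
The TO DO above suggests tying the hash to a file. A different technique, sketched here only as one possible direction, is to keep each record's byte offset in the hash and re-read the raw record from the file when it is needed. The helpers control_number_of(), build_offset_index() and fetch_raw() are hypothetical and not part of this module. The sketch relies on the standard MARC 21 layout: a 24-byte leader with the base address of data in bytes 12-16, 12-byte directory entries, and \x1D as record terminator.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the 001 value out of one raw MARC 21 record
# by walking its directory.
sub control_number_of {
    my ($raw) = @_;
    return if length($raw) < 25;                    # too short for a leader
    my $base = substr $raw, 12, 5;                  # base address of data
    return unless $base =~ /^\d{5}$/ && $base >= 25;
    my $directory = substr $raw, 24, $base - 25;    # minus field terminator
    for ( my $i = 0; $i + 12 <= length $directory; $i += 12 ) {
        my ( $tag, $len, $start ) =
            unpack 'A3 A4 A5', substr( $directory, $i, 12 );
        # Field length includes the trailing field terminator, so drop 1.
        return substr( $raw, $base + $start, $len - 1 ) if $tag eq '001';
    }
    return;
}

# Hypothetical helper: map each control number to its record's byte offset.
sub build_offset_index {
    my ($path) = @_;
    my %offset_for;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/ = "\x1D";                              # read one record at a time
    while (1) {
        my $pos = tell $fh;                         # offset before reading
        my $raw = <$fh>;
        last unless defined $raw;
        my $controlnumb = control_number_of($raw);
        $offset_for{$controlnumb} = $pos if defined $controlnumb;
    }
    close $fh;
    return \%offset_for;
}

# Hypothetical helper: re-read one raw record starting at a stored offset.
sub fetch_raw {
    my ( $path, $pos ) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    seek $fh, $pos, 0 or die "Cannot seek in $path: $!";
    local $/ = "\x1D";
    my $raw = <$fh>;
    close $fh;
    return $raw;
}
```

With this approach the hash holds one small number per record instead of the full raw MARC string, at the cost of a seek and a read for each lookup.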

=cut

sub updated_record_hash {

	use MARC::Batch;
	my %updatedrechash;

	#retrieve file name if one was passed
	my $inputfile = shift;
	#otherwise get file name
	unless ($inputfile) {
		print ("What is the updated record file?:");
		$inputfile = <STDIN>; #read the answer from the terminal, not from files named in @ARGV
		chomp $inputfile;
		#remove double quotes inserted when drag-dropping from Windows
		$inputfile =~ s/^\"(.*)\"$/$1/;
	}

	#initialize $batch as new MARC::Batch object
	my $batch = MARC::Batch->new('USMARC', "$inputfile");
	my $recordno = 0;

	while (my $record = $batch->next()) {

		#get control number for the record
		my $controlnumb = $record->field('001')->as_string();

		#use $controlnumb as hash key to full raw MARC string 
		$updatedrechash{$controlnumb} = $record->as_usmarc();
		$recordno++;
		MARC::BBMARC::counting_print ($recordno);

	} #while

###testing ###
	print "$recordno records read\n";
###/testing ###

	return \%updatedrechash;

} #updated_record_hash

##########################
##########################
##########################

=head2 as_array

Add-on method to MARC::Field. Breaks a MARC::Field into a flat array of subfield code and subfield data pairs, returned as an array reference.
Based on example 9 of the MARC::Doc::Tutorial.

=head2 Example (as_array)

	my $field043 = MARC::Field->new('043', '', '', 'a' => 'n-us---', 'a' => 'e-uk---', 'a' => 'a-th---' );

	my $field043_arrayref = $field043->as_array();
	my @field043_array = @$field043_arrayref;

	# @field043_array is: ('a', 'n-us---', 'a', 'e-uk---', 'a', 'a-th---')
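
The flattening step itself can be sketched without MARC::Field. flatten_subfields() is a hypothetical helper; its input mirrors the list of [code, data] pairs that MARC::Field's subfields() method returns.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten [code, data] pairs into one array reference.
sub flatten_subfields {
    my (@pairs) = @_;
    return [ map { @$_ } @pairs ];    # each pair contributes two elements
}

my $flat = flatten_subfields(
    [ a => 'n-us---' ], [ a => 'e-uk---' ], [ a => 'a-th---' ] );
print join( ', ', @$flat ), "\n";
```

Codes end up at even indices and their data at the following odd indices, so repeated codes such as 'a' are preserved in order.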

=head2 TO DO (as_array)

Add ability to optionally pass in regex to find in subfields, returning positions of the matches (in a second array ref).
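
One way the planned regex option could work is sketched below. matching_positions() is a hypothetical helper, and reporting positions as indices into the flat array is an assumption; the TO DO does not say how positions should be counted.

```perl
use strict;
use warnings;

# Hypothetical helper: given a flat (code, data, code, data, ...) array ref
# and a compiled regex, return an array ref of the indices whose data matches.
sub matching_positions {
    my ( $flat, $regex ) = @_;
    my @positions;
    for ( my $i = 1; $i < @$flat; $i += 2 ) {    # odd indices hold the data
        push @positions, $i if $flat->[$i] =~ $regex;
    }
    return \@positions;
}

my $subfields = [ 'a', 'n-us---', 'a', 'e-uk---', 'a', 'a-th---' ];
my $hits = matching_positions( $subfields, qr/^e-/ );
print "@$hits\n";
```

Returning the positions in a separate array reference leaves the flat array itself unchanged for callers that do not pass a regex.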


