ACME-QuoteDB
lib/ACME/QuoteDB/LoadDB.pm view on Meta::CPAN
# nope, ok, add them
if (not $attr_id) { # attribution record does not already exist,
# create new entry
if ($self->{write_db}) {
$attr_id = Attr->insert({
name => $self->get_record('name'),
});
}
}
my $catg_ids = []; # array ref; assigning () to a scalar left it undef
if ($self->{write_db}) {
my ($catg) = $self->get_record('catg');
if (! ref $catg){ # 'single' value
my $catg_id = $self->_get_id_if_catg_exist($catg);
if (!$catg_id) {
# category does not already exist,
# create new entry
$catg_id = Catg->insert({catg => $catg});
}
push @{$catg_ids}, $catg_id;
} # support multi catg
elsif (ref $catg eq 'ARRAY'){
foreach my $c (@{$catg}){
my $catg_id = $self->_get_id_if_catg_exist($c);
if (!$catg_id) { # category does not already exist,
# create new entry
$catg_id = Catg->insert({catg => $c});
}
push @{$catg_ids}, $catg_id;
}
}
}
$self->_display_vals_if_verbose;
if ($self->{write_db}) {
my $qid = Quote->insert({
attr_id => $attr_id,
quote => $self->get_record('quote'),
source => $self->get_record('source'),
rating => $self->get_record('rating')
}) or croak $!;
if ($qid) {
my $id;
foreach my $cid (@{$catg_ids}){
$id = QuoteCatg->insert({
quot_id => $qid,
catg_id => $cid,
}) or croak $!;
}
}
}
# confirmation?
# TODO add a test for failure
if ($self->{write_db} and not $attr_id) {croak 'db write not successful'}
#$self->set_record(undef);
$self->{record} = {};
$self->_reset_orig_args;
if ($self->{write_db}) {
$self->success(1);
}
return $self->success;
}
sub _reset_orig_args {
my ($self) = @_;
$self->{record}->{rating} = $self->{orig_args}->{rating};
$self->{record}->{name} = $self->{orig_args}->{attr_source};
$self->{record}->{source} = $self->{orig_args}->{attr_source};
if (ref $self->{orig_args}->{category} eq 'ARRAY') {
foreach my $c (@{$self->{orig_args}->{category}}){
push @{$self->{record}->{catg}}, $c;
}
}
else {
$self->{record}->{catg} = $self->{orig_args}->{category};
}
}
sub success {
my ($self, $flag) = @_;
$self->{success} ||= $flag;
return $self->{success};
}
sub _display_vals_if_verbose {
my ($self) = @_;
if ($self->{verbose}){
#print 'Quote: ', $self->get_record('quote'),"\n";
#print 'Source: ', $self->get_record('source'),"\n";
#print 'Category: ',$self->get_record('catg'),"\n";
#print 'Rating: ', $self->get_record('rating'),"\n";
print Dumper $self->{record};
}
return $self;
}
#sub create_db {
# my ($self) = @_;
#
# if ($self->{db} and $self->{host}) {
# $self->create_db_mysql();
# }
#}
sub create_db_tables {
my ($self) = @_;
if ($self->{db} and $self->{host}) {
#$self->create_db_mysql();
$self->create_db_tables_mysql();
}
else {
create_db_tables_sqlite();
}
return $self;
}
=head1 DESCRIPTION
This module is part of L<ACME::QuoteDB>. It is a database loader: it
takes quote data and loads it into a database
(currently L<sqlite3 or mysql|/'CONFIGURATION AND ENVIRONMENT'>),
which is then accessed by L<ACME::QuoteDB>.
There are several ways to get quote data into the db via this loader.
(They are mostly aimed at 'batch' operations, i.e. loading a bunch of
records quickly.)
=over 4
=item 1
* csv file (pre determined format)
pros: quick and easy to load.
cons: getting the quote data into the correct format needed by this module
=item 2
* any source.
One can take quote data from any source, override
L<ACME::QuoteDB::LoadDB/dbload> loader methods to populate a record
and write it to the db.
pros: can get any quote data into the db.
cons: you supply the method. Depending on the complexity of the data
source and the munging required, this will take longer than the other
methods.
=back
=head3 load from csv file
The predefined csv file format is as follows (headers):
"Quote", "Attribution Name", "Attribution Source", "Category", "Rating"
for example:
"Quote", "Attribution Name", "Attribution Source", "Category", "Rating"
"I hope this has taught you kids a lesson: kids never learn.","Chief Wiggum","The Simpsons","Humor",9
"Sideshow Bob has no decency. He called me Chief Piggum. (laughs) Oh wait, I get it, he's all right.","Chief Wiggum","The Simpsons","Humor",8
my $load_db = ACME::QuoteDB::LoadDB->new({
file => dirname(__FILE__).'/data/simpsons_quotes.csv',
file_format => 'csv',
});
$load_db->data_to_db;
if (!$load_db->success){print 'failed'}
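Getting data into this layout is the main cost of the csv route. The following is a minimal sketch of producing lines in the column order shown above, using core Perl only. The helpers C<csv_field> and C<csv_line> are hypothetical and are not part of ACME::QuoteDB::LoadDB; the sketch assumes that embedded double quotes are escaped by doubling them and that the rating is written unquoted, as in the example rows.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ACME::QuoteDB::LoadDB):
# wrap one field in double quotes, doubling any embedded double quotes.
sub csv_field {
    my ($value) = @_;
    $value = q{} if !defined $value;
    (my $escaped = $value) =~ s/"/""/g;
    return qq{"$escaped"};
}

# Hypothetical helper: build one data line in the documented column order:
# Quote, Attribution Name, Attribution Source, Category, Rating.
# The rating is left unquoted, matching the example rows above.
sub csv_line {
    my (%rec) = @_;
    my @quoted = map { csv_field($rec{$_}) } qw(quote name source catg);
    return join q{,}, @quoted, $rec{rating};
}

my @header = ('Quote', 'Attribution Name', 'Attribution Source', 'Category', 'Rating');
print join(q{,}, map { csv_field($_) } @header), "\n";
print csv_line(
    quote  => 'I hope this has taught you kids a lesson: kids never learn.',
    name   => 'Chief Wiggum',
    source => 'The Simpsons',
    catg   => 'Humor',
    rating => 9,
), "\n";
```

The record keys (quote, name, source, catg, rating) mirror the field names used with set_record; that mapping is a convenience of this sketch only.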
=head3 load from any source
If those don't catch your interest, ACME::QuoteDB::LoadDB is sub-classable,
so one can extract data any way they like and populate the db themselves.
(There is a test that illustrates overriding the stub method, 'dbload'.)
You need to populate a record data structure:
$self->set_record(quote => q{}); # mandatory
$self->set_record(name => q{}); # mandatory
$self->set_record(source => q{}); # optional but useful
$self->set_record(catg => q{}); # optional but useful
$self->set_record(rating => q{}); # optional but useful
# then to write the record you call
$self->write_record;
NOTE: this is a one-record-at-a-time operation, so one would perform
this within a loop. There is no bulk write operation currently.
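The populate-then-write loop described above can be sketched as follows. To keep the example runnable on its own, it uses a small stand-in class (C<My::StandIn::Loader>, hypothetical) that only imitates set_record, get_record and write_record by collecting records in memory; in real use the subclass would inherit from ACME::QuoteDB::LoadDB instead. The input layout (entries separated by a line containing only C<%>, with a final C<-- Name> line) is an assumption about a fortune-style file, and passing the text directly to dbload is a simplification of this sketch.

```perl
use strict;
use warnings;

# Stand-in for ACME::QuoteDB::LoadDB so the sketch runs on its own.
# The real write_record writes to the database; this one stores copies in memory.
package My::StandIn::Loader;
sub new { my ($class) = @_; return bless { record => {}, written => [] }, $class }
sub set_record { my ($self, $field, $value) = @_; $self->{record}{$field} = $value; return $self }
sub get_record { my ($self, $field) = @_; return $self->{record}{$field} }
sub write_record {
    my ($self) = @_;
    push @{ $self->{written} }, +{ %{ $self->{record} } };
    $self->{record} = {};
    return 1;
}

# Hypothetical subclass that overrides dbload: one record per entry.
package My::FortuneLoader;
our @ISA = ('My::StandIn::Loader');

sub dbload {
    my ($self, $text) = @_;
    foreach my $entry (split /^%\s*$/m, $text) {
        $entry =~ s/\A\s+|\s+\z//g;
        next if $entry eq q{};
        my @lines = split /\n/, $entry;
        my $last  = pop @lines;
        # assumed layout: the final line of an entry is "-- Name"
        my ($name) = $last =~ /\A\s*--\s*(.+?)\s*\z/ or next;
        $self->set_record(quote  => join "\n", @lines);   # mandatory
        $self->set_record(name   => $name);               # mandatory
        $self->set_record(source => 'fortune');           # optional
        $self->write_record;                              # one record per call
    }
    return $self;
}

package main;

my $loader = My::FortuneLoader->new;
$loader->dbload("Kids never learn.\n-- Chief Wiggum\n%\nD'oh!\n-- Homer Simpson\n%\n");
printf "%s (%s)\n", $_->{quote}, $_->{name} for @{ $loader->{written} };
```

Entries without a recognisable attribution line are skipped here rather than written with an empty name, since name is a mandatory field.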
=head1 OVERVIEW
You have a collection of quotes (adages/sayings/quips/epigrams, etc) for
whatever reason, you use these quotes for whatever reason, you want to
access these quotes in a variety of ways,...
This module is part of L<ACME::QuoteDB>.
This is a database loader: it takes data (quotes) and loads it into a database,
which is then accessed by L<ACME::QuoteDB>.
See L<ACME::QuoteDB>.
=head1 USAGE
General usage: a csv/tsv file in the expected format is loaded into the database
my $load_db = ACME::QuoteDB::LoadDB->new({
file => '/home/me/data/sorta_funny_quotes.tsv',
file_format => 'tsv',
delimiter => "\t",
# provide an attr_source for all (if not in data)
# data is used first, if not defined use below
attr_source => 'Things Randomly Overheard',
# provide a category for all (if not in data)
category => 'Humor',
# provide a rating for all
rating => 5, # scale 1-10
});
$load_db->data_to_db;
if (!$load_db->success){print 'failed'}
Also see t/01-load_quotes.t included with the distribution.
(available from the CPAN if not included on your system)
=head1 SUBROUTINES/METHODS
This is an Object Oriented module. There is no procedural interface.
if directory, full path is needed, can supply a basic glob type filter.
example:
{ file => '/home/me/data/simpsons_quotes.csv' }
{ dir => '/home/me/data/*.csv' }
=item file_format - required
can be one of: 'csv', 'tsv', 'custom', or 'html'
if 'html' or 'custom' you must supply the method for parsing.
(see tests for examples)
example:
{ file_format => 'csv' }
=item delimiter - optional, default is a comma for csv
csv/tsv options tested: comma(,) and tab(\t)
'html' - not applicable
example:
{ delimiter => "\t" }
=item category - optional, extracted from data if exists, otherwise will use what you
specify
TODO one quote to multiple categories
=item attr_source - extracted from data if exists, otherwise will use what you
specify
example:
{attr_source => 'The Simpsons'}
=item file_encoding - optional
Files being loaded are assumed to be utf8 encoded. if utf8 flag is not detected,
falls back to latin1 (iso-8859-1). If neither of these is correct, set this
option to the encoding your file is in.
=back
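The fallback order described for file_encoding (explicit encoding first, then utf8, then latin1) can be illustrated with the core L<Encode> module. This is only a sketch of that order under stated assumptions: the helper name C<decode_quote_bytes> is hypothetical, and using a strict UTF-8 decode as the "is this utf8?" check is an assumption, not necessarily how the module detects it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper illustrating the documented fallback order.
sub decode_quote_bytes {
    my ($bytes, $file_encoding) = @_;

    # An explicitly supplied file_encoding wins.
    return decode($file_encoding, $bytes) if defined $file_encoding;

    # Try strict UTF-8 first; with FB_CROAK, decode dies on malformed input.
    # Work on a copy because decode may modify its argument when a CHECK is set.
    my $copy = $bytes;
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return $text if defined $text;

    # Not valid UTF-8: fall back to latin1 (iso-8859-1).
    return decode('iso-8859-1', $bytes);
}

my $from_utf8   = decode_quote_bytes("caf\xC3\xA9");       # valid UTF-8 bytes
my $from_latin1 = decode_quote_bytes("caf\xE9");           # not valid UTF-8
my $explicit    = decode_quote_bytes("caf\xE9", 'cp1252'); # caller names the encoding
```

All three calls yield the same four-character string; only the route taken to decode the bytes differs.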
=head4 Operation Related Parameters
=over 4
=item dry_run - optional
do not write to the database. Use with the verbose flag to see what would have been
written.
This can be helpful for testing the outcome of loading,
e.g. to confirm that the parsing of your data is correct
example:
{
dry_run => 1,
verbose => 1
}
=item verbose - optional
display to STDOUT what is being done
This can be helpful for testing quotes extraction from file parsing
example:
{verbose => 1}
=item create_db - optional (boolean)
L<ACME::QuoteDB::LoadDB> default behaviour is to always assume there is a
database and append new data to that. (This parameter is usually only needed the first
time one loads data.)
Setting this parameter to a true value will create a new database.
(So while this is an optional param, it is required at least once ;)
B<NOTE: it is not intelligent, if you hand it a populated database,
it will happily overwrite all data>
B<AGAIN: setting this param will destroy the current database, creating a new
empty one>
example:
{create_db => 1}
=back
=head2 data_to_db
takes the data input provided to new, processes it and writes it to the database.
should blow up appropriately if not successful
=head2 dbload_from_csv
takes a csv file (in our defined format) as an argument, parses it and writes
the data to the database. (uses L<Text::CSV> with pure perl parser)
utf-8 safe. (opens file as utf8)
will croak with message if not successful
=head2 dbload
if your file format is set to 'html' or 'custom' you must
define this method to do your parsing in a sub class.
Load from html is not supported because there are too many
ways to represent the data. (same with 'custom')
(see tests for examples - there is a test for loading a 'fortune' file format)
One can subclass ACME::QuoteDB::LoadDB and override dbload
to do the html parsing
=head2 debug_record
dump record (show what is set on the internal data structure)
e.g. Data::Dumper
=head2 set_record
only needed if one plans to sub-class this module.
otherwise, is transparent in usage.
if you are sub-classing this module, you would have to populate
this record. (L</write_record> knows about/uses this data structure)
possible fields consist of:
$self->set_record(quote => q{});
$self->set_record(rating => q{});
$self->set_record(name => q{});
$self->set_record(source => q{});
$self->set_record(catg => q{});
currently you can only set one attribute at a time.
i.e. you can't do this:
$self->set_record(
name => $name,
source => $source
);
# or this even
$self->set_record({
name => $name,
source => $source
});
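Since only one field can be set per call, several fields are typically set in a loop. A minimal sketch follows, using a hypothetical stand-in object (C<My::StandIn::Record>) that imitates only set_record and get_record so the example runs without the module installed. In a real subclass, C<$self> would be the loader object itself.

```perl
use strict;
use warnings;

# Stand-in imitating only set_record/get_record (one field per call).
package My::StandIn::Record;
sub new { my ($class) = @_; return bless { record => {} }, $class }
sub set_record { my ($self, $field, $value) = @_; $self->{record}{$field} = $value; return $self }
sub get_record { my ($self, $field) = @_; return $self->{record}{$field} }

package main;

my %fields = (
    name   => 'Chief Wiggum',
    source => 'The Simpsons',
    rating => 9,
);

my $self = My::StandIn::Record->new;

# One set_record call per field, instead of passing them all at once.
foreach my $field (sort keys %fields) {
    $self->set_record($field => $fields{$field});
}

print $self->get_record('name'), "\n";
```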
=head2 get_record
only useful if one plans to sub-class this module.
otherwise, is transparent in usage.
if you are sub-classing this module, you would have to populate
this record. [see L</set_record>]
(L</write_record> knows about/uses this data structure)
possible fields consist of:
$self->get_record('quote');
$self->get_record('rating');
$self->get_record('name');
$self->get_record('source');
$self->get_record('catg');
=head2 success
indicates that the database load was successful
is undef on failure or if trying a L</dry_run>
=head2 write_record
takes the data structure 'record' (see '$self->get_record'),
which must exist. Checks if the attribution name ($self->get_record('name')) exists;
if so, uses the existing attribution name, otherwise creates a new one
Load from html is not supported because there are too many
ways to represent the data. (see tests for examples)
One can subclass ACME::QuoteDB::LoadDB and override dbload
to do the html parsing
=head2 create_db_tables
create an empty quotes database (with correct tables).
(usually only performed the first time you load data)
B<NOTE: will overwrite ALL existing data>
Set 'create_db' parameter (boolean) to a true value upon instantiation
to enable.
The default action is to assume the database (and tables) exist and just
append new L<ACME::QuoteDB::LoadDB> loads to that.
=begin comment
keep pod coverage happy.
# Coverage for ACME::QuoteDB::LoadDB is 71.4%, with 3 naked subroutines:
# Catg
# Quote
# Attr
# QuoteCatg
pod tests incorrectly state, Catg, Quote and Attr are subroutines, well they
are,... (as aliases) but are imported into here, not defined within
TODO: explore the above (is this a bug? if so, whose? version affected,
create use case, etc)
=head2 Attr
=head2 Catg
=head2 Quote
=head2 QuoteCatg
=head2 QDBI
=end comment
=begin comment
These methods are more or less private.
I may use them in other modules, but you don't need to use or
know about them, so I will obfuscate them here
=head2 create_db_tables_sqlite
=head2 create_db_tables_mysql
=end comment
=head1 DIAGNOSTICS
It is trivial to add support for others
see: L<LOADING QUOTES|ACME::QuoteDB/LOADING QUOTES>
=back
=head1 DEPENDENCIES
L<Carp>
L<Data::Dumper>
L<criticism> (pragma - enforce Perl::Critic if installed)
L<version> (pragma - version numbers)
L<aliased>
L<Test::More>
L<DBD::SQLite>
L<DBI>
L<Class::DBI>
L<File::Basename>
L<Readonly>
L<Module::Build>
=head1 INCOMPATIBILITIES
None known.
=head1 SEE ALSO
man fortune (unix/linux)
L<Fortune>
L<fortune>
L<ACME::QuoteDB>
=head1 AUTHOR
David Wright, C<< <david_v_wright at yahoo.com> >>
=head1 BUGS AND LIMITATIONS
Please report any bugs or feature requests to C<bug-acme-quotedb-loaddb at rt.cpan.org>, or through
the web interface at L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=ACME-QuoteDB::LoadDB>. I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.
=head1 SUPPORT
You can find documentation for this module with the perldoc command.
perldoc ACME::QuoteDB::LoadDB
You can also look for information at:
=over 4
=item * RT: CPAN's request tracker
L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=ACME-QuoteDB::LoadDB>
=item * AnnoCPAN: Annotated CPAN documentation
L<http://annocpan.org/dist/ACME-QuoteDB::LoadDB>
=item * CPAN Ratings
L<http://cpanratings.perl.org/d/ACME-QuoteDB::LoadDB>
=item * Search CPAN
L<http://search.cpan.org/dist/ACME-QuoteDB::LoadDB/>
=back
=head1 ACKNOWLEDGEMENTS
The construction of this module was guided by:
Perl Best Practices - Conway
Test Driven Development
Object Oriented Programming
Gnu is Not Unix
vim
Debian Linux
Mac OSX
The collective wisdom and code of The CPAN
=head1 LICENSE AND COPYRIGHT
Copyright 2009 David Wright, all rights reserved.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
=cut
1; # End of ACME::QuoteDB::LoadDB