Graphics-Skullplot
lib/Graphics/Skullplot/ClassifyColumns.pm
=cut
# TODO revise these before shipping
our $VERSION = '0.02';
my $DEBUG = 1;
=head1 SYNOPSIS

   use Graphics::Skullplot::ClassifyColumns;
   my $cc = Graphics::Skullplot::ClassifyColumns->new( data => $data );
   my $plot_cols =
     $cc->classify_columns_simple( { indie_count => $indie_count, } );

=head1 DESCRIPTION
Graphics::Skullplot::ClassifyColumns is a stripped-down version
of an old experimental module I was developing, which I called
Data::Classify. I expect to go back to that project and develop a
more elaborate system of plug-ins to target different kinds of
databases and so on, most likely named Table::TypeInference.
This particular module just needs a "classify_columns_simple" routine
that works well enough to figure out how to plot some data via
ggplot2 in R (i.e. the "Graphics::Skullplot" project).
=cut
use 5.10.0;
use strict;
use warnings;
use Carp;
use Data::Dumper;
use Scalar::Classify qw();
=over
=item new
Creates a new Graphics::Skullplot::ClassifyColumns object.
Takes a hashref as an argument, with named fields identical
to the names of the object attributes. These attributes are:
=over
=item data
A required field: the data as an array of array references
(one per row), with a header in the first row.
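
As a minimal sketch of that layout (the column names and values
here are made up for illustration, they are not from this module's
tests):

   my $data = [
     [ 'month',   'region', 'sales' ],   # header row
     [ '2017-01', 'east',   120 ],
     [ '2017-01', 'west',    95 ],
     [ '2017-02', 'east',   130 ],
   ];
   my $cc = Graphics::Skullplot::ClassifyColumns->new( data => $data );
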
=back
=cut
# Example attribute:
# has is_loop => ( is => 'rw', isa => Int, default => 0 );
# Tempted to use Mouse over Moo so I can do my usual "isa => 'Int'"
has data => ( is => 'ro', isa => ArrayRef );
has patterns => ( is => 'ro', isa => HashRef, builder => "define_regxeps" );
# $DB::single = 1;
=item classify_columns_simple
Note: here "simple" might be thought of as "stub":
This does the simplest possible categorization using only
a single numeric hint for the number of independent fields.
The presumption here is that the incoming data is organized like
the output of a typical SQL "group by" select: the x-axis in the
first column, a number of columns of dependent data at the
end, and (possibly) a certain number of categorical variables
(ones with a small number of allowed values) in between.
This returns a hash indicating how different columns should be
handled in the plotting stage. The keys are:

   indie_x       (listed as "x" in earlier notes)
   y             but just for when there's only one dependent
   gb_cats
   dependents_y  (listed as "dep_fields" in earlier notes)

Example usage:

   my $cc = Graphics::Skullplot::ClassifyColumns->new( data => $data );
   my $opt = { indie_count => 1, };
   my $plot_cols_href =
     $cc->classify_columns_simple( $opt );

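
A sketch of the other calling style, assuming the
"dependent_requested" and "independent_requested" options that the
routine reads from its options hash (the column names here are
hypothetical). When both are supplied, the routine skips the
guessing and uses the requested names directly:

   my $opt = { independent_requested => 'month',
               dependent_requested   => 'sales', };
   my $plot_cols = $cc->classify_columns_simple( $opt );
   # Judging from the visible part of the code, the fields filled
   # in for this case should be:
   #   indie_x      => 'month'
   #   y            => 'sales'
   #   gb_cats      => []
   #   dependents_y => [ 'sales' ]
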
=cut
sub classify_columns_simple {
  my $self = shift;
  my $opt  = shift;
  my $indie_count = $opt->{ indie_count } // 1;
  my %field_data;   # return values
  my $dependent_requested   = $opt->{ dependent_requested };
  my $independent_requested = $opt->{ independent_requested };
  my $data   = $self->data;
  my @header = @{ $data->[0] };
  # when we're told what to do there's no need to guess
  if ( $dependent_requested && $independent_requested ) {
    # TODO might be better to just use the empty set
    # my @gb_cats = grep{ !/^$dependent_requested$/ } grep{ !/^$independent_requested$/ } @header;
    my @gb_cats = ();
    %field_data =
      ( indie_x      => $independent_requested,
        y            => $dependent_requested,   # redundant with dependents_y
        gb_cats      => [ @gb_cats ],
        dependents_y => [ $dependent_requested ],
      );