examples/calculate_precision_and_recall_from_file_based_relevancies_for_VSM.pl view on Meta::CPAN
$vsm->upload_document_relevancies_from_file(); # The format of the relevancy
# file must be as shown in
# relevance.txt
# Uncomment the following statement if you wish to see the list of all
# the documents relevant to each of the queries:
#$vsm->display_doc_relevancies();
# Use only one of the following two statements. If you wish to carry out
# precision vs. recall analysis for LSA, comment out the first and
# uncomment the second.
$vsm->precision_and_recall_calculator('vsm');
#$vsm->precision_and_recall_calculator('lsa');
$vsm->display_precision_vs_recall_for_queries();
$vsm->display_average_precision_for_queries_and_map();
examples/corpus/CardLayoutTest.java view on Meta::CPAN
JLabel firstLabel = new JLabel( "Frigid in the North",
firstIcon,
JLabel.CENTER );
firstLabel.setVerticalTextPosition( JLabel.BOTTOM );
firstLabel.setHorizontalTextPosition( JLabel.CENTER );
firstLabel.setBorder(
BorderFactory.createLineBorder( Color.blue ) );
cards.add( firstLabel, "frigid" );
//Card 2:
ImageIcon secondIcon = new ImageIcon( "zwthr14.gif" );
JLabel secondLabel = new JLabel( "Balmy in the South",
secondIcon,
JLabel.CENTER );
secondLabel.setVerticalTextPosition( JLabel.BOTTOM );
secondLabel.setHorizontalTextPosition( JLabel.CENTER );
secondLabel.setBorder(
BorderFactory.createLineBorder( Color.green ) );
cards.add( secondLabel, "balmy" );
//Card 3:
ImageIcon thirdIcon = new ImageIcon( "thunderstormanim.gif" );
JLabel thirdLabel = new JLabel( "Stormy In the East",
thirdIcon,
JLabel.CENTER );
thirdLabel.setVerticalTextPosition( JLabel.BOTTOM );
thirdLabel.setHorizontalTextPosition( JLabel.CENTER );
thirdLabel.setBorder(
BorderFactory.createLineBorder( Color.red ) );
examples/corpus/FlowLayoutTest.java view on Meta::CPAN
//ITEM 1:
ImageIcon firstIcon = new ImageIcon( "snowflake.gif" );
JLabel firstLabel = new JLabel( "Frigid in the North",
firstIcon,
JLabel.CENTER );
firstLabel.setVerticalTextPosition( JLabel.BOTTOM );
firstLabel.setHorizontalTextPosition( JLabel.CENTER );
contentPane.add( firstLabel );
//ITEM 2:
ImageIcon secondIcon = new ImageIcon( "zwthr14.gif" );
JLabel secondLabel = new JLabel( "Balmy in the South",
secondIcon,
JLabel.CENTER );
secondLabel.setVerticalTextPosition( JLabel.BOTTOM );
secondLabel.setHorizontalTextPosition( JLabel.CENTER );
// secondLabel.setPreferredSize(new Dimension(50, 50)); //(C)
contentPane.add( secondLabel );
//ITEM 3:
ImageIcon thirdIcon = new ImageIcon( "thunderstormanim.gif" );
JLabel thirdLabel = new JLabel( "Stormy In the East",
thirdIcon,
JLabel.CENTER );
thirdLabel.setVerticalTextPosition( JLabel.BOTTOM );
thirdLabel.setHorizontalTextPosition( JLabel.CENTER );
Border borderThirdLabel =
BorderFactory.createLineBorder( Color.blue );
examples/corpus/GridLayoutTest.java view on Meta::CPAN
//ITEM 1:
ImageIcon firstIcon = new ImageIcon( "snowflake.gif" );
JLabel firstLabel = new JLabel( "Frigid in the North",
firstIcon,
JLabel.CENTER );
firstLabel.setVerticalTextPosition( JLabel.BOTTOM );
firstLabel.setHorizontalTextPosition( JLabel.CENTER );
contentPane.add( firstLabel );
//ITEM 2:
ImageIcon secondIcon = new ImageIcon( "zwthr14.gif" );
JLabel secondLabel = new JLabel( "Balmy in the South",
secondIcon,
JLabel.CENTER );
secondLabel.setVerticalTextPosition( JLabel.BOTTOM );
secondLabel.setHorizontalTextPosition( JLabel.CENTER );
contentPane.add( secondLabel );
//ITEM 3:
ImageIcon thirdIcon = new ImageIcon( "thunderstormanim.gif" );
JLabel thirdLabel = new JLabel( "Stormy In the East",
thirdIcon,
JLabel.CENTER );
thirdLabel.setVerticalTextPosition( JLabel.BOTTOM );
thirdLabel.setHorizontalTextPosition( JLabel.CENTER );
Border borderThirdLabel =
BorderFactory.createLineBorder( Color.blue );
examples/corpus/Overload.java view on Meta::CPAN
//Overload.java
class Employee { String name; }
class Manager extends Employee { int level; }
class Test {
    static void foo( Employee e1, Employee e2 ) {   //first foo    //(A)
        System.out.println( "first foo" );
    }
    static void foo( Employee e, Manager m ) {      //second foo   //(B)
        System.out.println( "second foo" );
    }
    static void foo( Manager m, Employee e ) {      //third foo    //(C)
        System.out.println( "third foo" );
    }
    public static void main( String[] args )
    {
        Employee emp = new Employee();
        Manager man = new Manager();
        foo( emp, man );      // will invoke the second foo        //(D)
        //foo( man, man );    // Error because it produces an      //(E)
                              // ambiguity in overload resolution
    }
}
examples/corpus/Overload2.java view on Meta::CPAN
class Employee { String name; }
class Manager extends Employee { int level; }
class Test {
// first foo:
static void foo( Employee e1,
Employee e2, double salary ) {
System.out.println( "first foo" );
}
// second foo:
static void foo( Employee e,
Manager m, int salary ) {
System.out.println( "second foo" );
}
// third foo:
static void foo( Manager m, Employee e, int salary ) {
System.out.println( "third foo" );
}
public static void main( String[] args )
{
Employee emp = new Employee();
examples/corpus/SortTiming.java view on Meta::CPAN
import java.util.*;                                               //(A)
class Test {
    public static void main( String[] args ) {
        int[] arr = new int[1000000];                             //(B)
        for ( int i=0; i<1000000; i++ )
            arr[i] = (int) ( 1000000 * Math.random() );           //(C)
        long startTime = System.currentTimeMillis();              //(D)
        Arrays.sort( arr );                                       //(E)
        long diffTime = System.currentTimeMillis() - startTime;   //(F)
        System.out.println("Sort time in millisecs: " + diffTime);//(G)
    }
}
examples/corpus_with_java_and_cpp/CardLayoutTest.java view on Meta::CPAN
JLabel firstLabel = new JLabel( "Frigid in the North",
firstIcon,
JLabel.CENTER );
firstLabel.setVerticalTextPosition( JLabel.BOTTOM );
firstLabel.setHorizontalTextPosition( JLabel.CENTER );
firstLabel.setBorder(
BorderFactory.createLineBorder( Color.blue ) );
cards.add( firstLabel, "frigid" );
//Card 2:
ImageIcon secondIcon = new ImageIcon( "zwthr14.gif" );
JLabel secondLabel = new JLabel( "Balmy in the South",
secondIcon,
JLabel.CENTER );
secondLabel.setVerticalTextPosition( JLabel.BOTTOM );
secondLabel.setHorizontalTextPosition( JLabel.CENTER );
secondLabel.setBorder(
BorderFactory.createLineBorder( Color.green ) );
cards.add( secondLabel, "balmy" );
//Card 3:
ImageIcon thirdIcon = new ImageIcon( "thunderstormanim.gif" );
JLabel thirdLabel = new JLabel( "Stormy In the East",
thirdIcon,
JLabel.CENTER );
thirdLabel.setVerticalTextPosition( JLabel.BOTTOM );
thirdLabel.setHorizontalTextPosition( JLabel.CENTER );
thirdLabel.setBorder(
BorderFactory.createLineBorder( Color.red ) );
examples/corpus_with_java_and_cpp/FlowLayoutTest.java view on Meta::CPAN
//ITEM 1:
ImageIcon firstIcon = new ImageIcon( "snowflake.gif" );
JLabel firstLabel = new JLabel( "Frigid in the North",
firstIcon,
JLabel.CENTER );
firstLabel.setVerticalTextPosition( JLabel.BOTTOM );
firstLabel.setHorizontalTextPosition( JLabel.CENTER );
contentPane.add( firstLabel );
//ITEM 2:
ImageIcon secondIcon = new ImageIcon( "zwthr14.gif" );
JLabel secondLabel = new JLabel( "Balmy in the South",
secondIcon,
JLabel.CENTER );
secondLabel.setVerticalTextPosition( JLabel.BOTTOM );
secondLabel.setHorizontalTextPosition( JLabel.CENTER );
// secondLabel.setPreferredSize(new Dimension(50, 50)); //(C)
contentPane.add( secondLabel );
//ITEM 3:
ImageIcon thirdIcon = new ImageIcon( "thunderstormanim.gif" );
JLabel thirdLabel = new JLabel( "Stormy In the East",
thirdIcon,
JLabel.CENTER );
thirdLabel.setVerticalTextPosition( JLabel.BOTTOM );
thirdLabel.setHorizontalTextPosition( JLabel.CENTER );
Border borderThirdLabel =
BorderFactory.createLineBorder( Color.blue );
examples/corpus_with_java_and_cpp/GridLayoutTest.java view on Meta::CPAN
//ITEM 1:
ImageIcon firstIcon = new ImageIcon( "snowflake.gif" );
JLabel firstLabel = new JLabel( "Frigid in the North",
firstIcon,
JLabel.CENTER );
firstLabel.setVerticalTextPosition( JLabel.BOTTOM );
firstLabel.setHorizontalTextPosition( JLabel.CENTER );
contentPane.add( firstLabel );
//ITEM 2:
ImageIcon secondIcon = new ImageIcon( "zwthr14.gif" );
JLabel secondLabel = new JLabel( "Balmy in the South",
secondIcon,
JLabel.CENTER );
secondLabel.setVerticalTextPosition( JLabel.BOTTOM );
secondLabel.setHorizontalTextPosition( JLabel.CENTER );
contentPane.add( secondLabel );
//ITEM 3:
ImageIcon thirdIcon = new ImageIcon( "thunderstormanim.gif" );
JLabel thirdLabel = new JLabel( "Stormy In the East",
thirdIcon,
JLabel.CENTER );
thirdLabel.setVerticalTextPosition( JLabel.BOTTOM );
thirdLabel.setHorizontalTextPosition( JLabel.CENTER );
Border borderThirdLabel =
BorderFactory.createLineBorder( Color.blue );
examples/corpus_with_java_and_cpp/Mixin.cc view on Meta::CPAN
void print() { Employee::print(); dept.print(); }
~Manager() {}
};
/////////////////////// class ExecutiveManager //////////////////////
// An ExecutiveManager supervises more than one department
class ExecutiveManager : public Manager {
short level;
vector<Department> departments;
public:
// Needed in the second example for type conversion
// from Manager to ExecutiveManager
ExecutiveManager()
: Manager( "", "", eUnknown ), level( 0 ) {}
ExecutiveManager( string name,
string address,
EducationLevel education,
short level )
: Manager( name,
address,
education ), level( level ) {
examples/corpus_with_java_and_cpp/MultiCustomerAccount.cc view on Meta::CPAN
//MultiCustomerAccount.cc
#include <qthread.h>
#include <cstdlib>
#include <iostream>
#include <ctime>
using namespace std;
void keepBusy( double howLongInMillisec );
QMutex mutex;
QWaitCondition cond;
class Account : public QThread {
public:
int balance;
Account() { balance = 0; }
void deposit( int dep ) {
examples/corpus_with_java_and_cpp/MultiCustomerAccount.cc view on Meta::CPAN
        withdrawers[ i ] = new Withdrawer();
        depositors[ i ]->start();
        withdrawers[ i ]->start();
    }
    for ( int i=0; i < 5; i++ ) {
        depositors[ i ]->wait();
        withdrawers[ i ]->wait();
    }
}
void keepBusy( double howLongInMillisec ) {
    int ticksPerSec = CLOCKS_PER_SEC;
    int ticksPerMillisec = ticksPerSec / 1000;
    clock_t ct = clock();
    while ( clock() < ct + howLongInMillisec * ticksPerMillisec )
        ;
}
examples/corpus_with_java_and_cpp/Overload.java view on Meta::CPAN
//Overload.java
class Employee { String name; }
class Manager extends Employee { int level; }
class Test {
    static void foo( Employee e1, Employee e2 ) {   //first foo    //(A)
        System.out.println( "first foo" );
    }
    static void foo( Employee e, Manager m ) {      //second foo   //(B)
        System.out.println( "second foo" );
    }
    static void foo( Manager m, Employee e ) {      //third foo    //(C)
        System.out.println( "third foo" );
    }
    public static void main( String[] args )
    {
        Employee emp = new Employee();
        Manager man = new Manager();
        foo( emp, man );      // will invoke the second foo        //(D)
        //foo( man, man );    // Error because it produces an      //(E)
                              // ambiguity in overload resolution
    }
}
examples/corpus_with_java_and_cpp/Overload2.java view on Meta::CPAN
class Employee { String name; }
class Manager extends Employee { int level; }
class Test {
// first foo:
static void foo( Employee e1,
Employee e2, double salary ) {
System.out.println( "first foo" );
}
// second foo:
static void foo( Employee e,
Manager m, int salary ) {
System.out.println( "second foo" );
}
// third foo:
static void foo( Manager m, Employee e, int salary ) {
System.out.println( "third foo" );
}
public static void main( String[] args )
{
Employee emp = new Employee();
examples/corpus_with_java_and_cpp/RepeatInherit.cc view on Meta::CPAN
void print() { Employee::print(); dept.print(); }
~Manager(){}
};
////////////////////// class ExecutiveManager ///////////////////////
// An ExecutiveManager supervises more than one department
class ExecutiveManager : public Manager {
short level;
vector<Department> departments; // depts in charge of
public:
// no-arg const. needed in the second example for type conversion
// from Manager to ExecutiveManager:
ExecutiveManager()
: Manager( "", "", eUnknown ),
Employee( "", "", eUnknown ), //(C)
level( 0 ) {}
ExecutiveManager( string name,
string address,
EducationLevel education,
short level )
: Manager( name, address, education ),
examples/corpus_with_java_and_cpp/SortTiming.java view on Meta::CPAN
import java.util.*;                                               //(A)
class Test {
    public static void main( String[] args ) {
        int[] arr = new int[1000000];                             //(B)
        for ( int i=0; i<1000000; i++ )
            arr[i] = (int) ( 1000000 * Math.random() );           //(C)
        long startTime = System.currentTimeMillis();              //(D)
        Arrays.sort( arr );                                       //(E)
        long diffTime = System.currentTimeMillis() - startTime;   //(F)
        System.out.println("Sort time in millisecs: " + diffTime);//(G)
    }
}
examples/corpus_with_java_and_cpp/SynchedSwaps.cc view on Meta::CPAN
//SynchedSwaps.cc
#include <qthread.h>
#include <cstdlib>
#include <iostream>
#include <ctime>
using namespace std;
void keepBusy( double howLongInMillisec );
class DataObject : public QThread {
QMutex mutex;
int dataItem1;
int dataItem2;
public:
DataObject() {
dataItem1 = 50;
dataItem2 = 50;
}
examples/corpus_with_java_and_cpp/SynchedSwaps.cc view on Meta::CPAN
    RepeatedSwaps t1;
    RepeatedSwaps t2;
    RepeatedSwaps t3;
    t0.wait();
    t1.wait();
    t2.wait();
    t3.wait();
}
void keepBusy( double howLongInMillisec ) {
    int ticksPerSec = CLOCKS_PER_SEC;
    int ticksPerMillisec = ticksPerSec / 1000;
    clock_t ct = clock();
    while ( clock() < ct + howLongInMillisec * ticksPerMillisec )
        ;
}
examples/retrieve_with_disk_based_LSA.pl view on Meta::CPAN
"The call to the above script generates the disk-based hashtables\n" .
"needed by the current script\n";
my @query = qw/ string getAllChars throw IOException distinct TreeMap histogram map /;
# The three databases mentioned in the next three statements are
# created by calling the script
# retrieve_with_VSM_and_also_create_disk_based_model.pl. The first of
# the databases stores the corpus vocabulary and term frequencies for
# the vocabulary words. The second database stores the term frequency
# vectors for the individual documents in the corpus. The third
# database stores the normalized document vectors. As to what is meant
# by document normalization, see the script retrieve_with_VSM.pl
my $corpus_vocab_db = "corpus_vocab_db";
my $doc_vectors_db = "doc_vectors_db";
my $normalized_doc_vecs_db = "normalized_doc_vecs_db";
my $lsa = Algorithm::VSM->new(
corpus_vocab_db => $corpus_vocab_db,
examples/retrieve_with_disk_based_VSM.pl view on Meta::CPAN
" retrieve_with_VSM_and_also_create_disk_based_model.pl\n\n" .
"on the same corpus with the following constructor options:\n\n" .
" use_idf_filter => 1, \n" .
" save_model_on_disk => 1, \n\n";
my @query = qw/ string getAllChars throw IOException distinct TreeMap histogram map /;
# The three databases mentioned in the next three statements are created
# by calling the script
# retrieve_with_VSM_and_also_create_disk_based_model.pl. The first of
# the databases stores the corpus vocabulary and the term frequencies
# for the vocabulary words, the second the term frequency vectors for
# the individual documents, and the third the normalized document
# vectors. As to what is meant by normalization, see the comments in
# the script retrieve_with_VSM.pl.
my $corpus_vocab_db = "corpus_vocab_db";
my $doc_vectors_db = "doc_vectors_db";
my $normalized_doc_vecs_db = "normalized_doc_vecs_db";
my $vsm = Algorithm::VSM->new(
corpus_vocab_db => $corpus_vocab_db,
doc_vectors_db => $doc_vectors_db,
lib/Algorithm/VSM.pm view on Meta::CPAN
    my $dir = rel2abs( shift );
    my $current_dir = cwd;
    chdir $dir or die "Unable to change directory to $dir: $!";
    foreach ( glob "*" ) {
        if ( -d and !(-l) ) {
            $self->_scan_directory( $_ );
            chdir $dir
                or die "Unable to change directory to $dir: $!";
        } elsif (-r _ and
                 -T _ and
                 -M _ > 0.00001 and  # -M is in days: modification age is at least ~1 sec
                 !( -l $_ ) and
                 $self->ok_to_filetype($_) ) {
            $self->_scan_file_for_rels($_) if $self->{_scan_dir_for_rels};
            $self->_scan_file($_) unless $self->{_corpus_vocab_done};
            $self->_construct_doc_vector($_) if $self->{_corpus_vocab_done};
        }
    }
    chdir $current_dir;
}
lib/Algorithm/VSM.pm view on Meta::CPAN
    my @Precision_values = ();
    my @Recall_values = ();
    my $rank = 1;
    while ($rank < @retrieved + 1) {
        my $index = 1;
        my @retrieved_at_rank = ();
        while ($index <= $rank) {
            push @retrieved_at_rank, $ranked_retrievals{$index};
            $index++;
        }
        my $intersection = set_intersection(\@retrieved_at_rank,
                                            \@relevant_set);
        my $precision_at_rank = @retrieved_at_rank ?
                                (@$intersection / @retrieved_at_rank) : 0;
        push @Precision_values, $precision_at_rank;
        my $recall_at_rank = @$intersection / @relevant_set;
        push @Recall_values, $recall_at_rank;
        $rank++;
    }
    print "\n\nFor query $query, precision values: @Precision_values\n"
        if $self->{_debug};
    print "\nFor query $query, recall values: @Recall_values\n"
        if $self->{_debug};
    $self->{_precision_for_queries}->{$query} = \@Precision_values;
    my $avg_precision;
    $avg_precision += $_ for @Precision_values;
lib/Algorithm/VSM.pm view on Meta::CPAN
sub display_precision_vs_recall_for_queries {
    my $self = shift;
    die "You must first invoke precision_and_recall_calculator function"
        unless scalar(keys %{$self->{_precision_for_queries}});
    print "\n\nDisplaying precision and recall values for different queries:\n\n";
    foreach my $query (sort
                       {get_integer_suffix($a) <=> get_integer_suffix($b)}
                       keys %{$self->{_avg_precision_for_queries}}) {
        print "\n\nQuery $query:\n";
        print "\n (The first value is for rank 1, the second value for rank 2, and so on.)\n\n";
        my @precision_vals = @{$self->{_precision_for_queries}->{$query}};
        @precision_vals = map {sprintf "%.3f", $_} @precision_vals;
        print " Precision at rank => @precision_vals\n";
        my @recall_vals = @{$self->{_recall_for_queries}->{$query}};
        @recall_vals = map {sprintf "%.3f", $_} @recall_vals;
        print "\n Recall at rank => @recall_vals\n";
    }
    print "\n\n";
}
lib/Algorithm/VSM.pm view on Meta::CPAN
sub return_index_of_last_value_above_threshold {
    my $pdl_obj = shift;
    my $size = $pdl_obj->getdim(0);
    my $threshold = shift;
    my $lower_bound = $pdl_obj->slice(0)->sclr * $threshold;
    my $i = 0;
    while ($i < $size && $pdl_obj->slice($i)->sclr > $lower_bound) {$i++;}
    return $i-1;
}
sub set_intersection {
    my $set1 = shift;
    my $set2 = shift;
    my %hset1 = map {$_ => 1} @$set1;
    my @common_elements = grep {$hset1{$_}} @$set2;
    return @common_elements ? \@common_elements : [];
}
sub get_integer_suffix {
    my $label = shift;
    $label =~ /(\d*)$/;
lib/Algorithm/VSM.pm view on Meta::CPAN
C<retrieve_with_VSM_and_also_create_disk_based_model.pl> shows how you can do that.
Other changes in 1.60 include a slight reorganization of the scripts in the
C<examples> directory. Most scripts now do not by default store their models in
disk-based hash tables. This reorganization is reflected in the description of the
C<examples> directory in this documentation. The basic logic of constructing VSM and
LSA models and how these are used for retrievals remains unchanged.
Version 1.50 incorporates a couple of new features: (1) you now have the option to
split camel-cased and underscored words when constructing your vocabulary set; and (2)
storing the VSM and LSA models in database files on the disk is now optional. The
second feature, in particular, should prove useful to those who are using this module
for large collections of documents.
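To make the first feature concrete, here is a minimal sketch of what splitting camel-cased and underscored words can look like. The helper C<split_identifier()> is hypothetical and is not part of this module's API; the module's own splitting rules may differ. In particular, this sketch leaves runs of capitals such as C<IOException> unsplit.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Algorithm::VSM): break an identifier
# into its component words.  Underscores act as separators, and a split
# is made wherever a lowercase letter or digit is followed by an
# uppercase letter.
sub split_identifier {
    my ($word) = @_;
    my @parts = map { split /(?<=[a-z0-9])(?=[A-Z])/, $_ }
                split /_+/, $word;
    return grep { length } @parts;    # drop empty pieces, e.g. from "_foo"
}

for my $word (qw/ getAllChars term_freq_vector IOException /) {
    print "$word => @{[ split_identifier($word) ]}\n";
}
```

Splitting identifiers this way lets a query word such as C<chars> match a document that only contains C<getAllChars>, at the cost of a larger vocabulary.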
Version 1.42 includes two new methods, C<display_corpus_vocab_size()> and
C<write_corpus_vocab_to_file()>, for those folks who deal with very large datasets.
You can get a better sense of the overall vocabulary being used by the module for
file retrieval by examining the contents of a dump file whose name is supplied as an
argument to C<write_corpus_vocab_to_file()>.
Version 1.41 downshifts the required version of the PDL module. Also cleaned up are
the dependencies between this module and the submodules of PDL.
lib/Algorithm/VSM.pm view on Meta::CPAN
If you would like to compare in your own script any two documents in the corpus, you
can call
    my $similarity = $vsm->pairwise_similarity_for_docs("filename_1", "filename_2");
or
    my $similarity = $vsm->pairwise_similarity_for_normalized_docs("filename_1", "filename_2");
Both calls return the dot product of the two document vectors divided by the product
of their magnitudes, that is, their cosine similarity. The first call uses the regular
document vectors and the second the normalized document vectors.
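As an illustration of the quantity described above, the following sketch computes the dot product of two vectors divided by the product of their magnitudes in plain Perl. The helper C<cosine_similarity()> and the sample term counts are assumptions made for this example; the module itself works on its own document vectors and this is not its implementation.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical helper (NOT part of Algorithm::VSM): similarity of two
# equal-length term-frequency vectors passed as array references.
sub cosine_similarity {
    my ($vec1, $vec2) = @_;
    die "vectors must have the same length" unless @$vec1 == @$vec2;
    my $dot  = sum0( map { $vec1->[$_] * $vec2->[$_] } 0 .. $#$vec1 );
    my $mag1 = sqrt( sum0( map { $_ ** 2 } @$vec1 ) );
    my $mag2 = sqrt( sum0( map { $_ ** 2 } @$vec2 ) );
    return 0 if $mag1 == 0 or $mag2 == 0;    # avoid dividing by zero
    return $dot / ( $mag1 * $mag2 );
}

my @doc_a = (2, 0, 1, 3);    # made-up term counts for two documents
my @doc_b = (1, 1, 0, 2);
printf "similarity: %.4f\n", cosine_similarity(\@doc_a, \@doc_b);
```

For non-negative term counts the value lies between 0 (no terms in common) and 1 (vectors pointing in the same direction).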
=item B<precision_and_recall_calculator():>
After you have created or obtained the relevancy judgments for your test queries, you
can make the following call to calculate C<Precision@rank> and C<Recall@rank>:
    $vsm->precision_and_recall_calculator('vsm');
or
    $vsm->precision_and_recall_calculator('lsa');
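To show what C<Precision@rank> and C<Recall@rank> mean, here is a small standalone sketch. The function name, the document labels, and the relevancy judgments are all made up for illustration; this is not the module's internal code, and it assumes that the ranked list contains no duplicates.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Algorithm::VSM).  Given a ranked list
# of retrieved documents and the set of relevant documents, return two
# array refs: precision at each rank and recall at each rank.
sub precision_and_recall_at_ranks {
    my ($ranked, $relevant) = @_;
    my %is_relevant = map { $_ => 1 } @$relevant;
    my (@precision, @recall);
    my $hits = 0;    # relevant documents seen so far
    for my $rank (1 .. @$ranked) {
        $hits++ if $is_relevant{ $ranked->[$rank - 1] };
        push @precision, $hits / $rank;
        push @recall, @$relevant ? $hits / @$relevant : 0;
    }
    return (\@precision, \@recall);
}

my @ranked   = qw/ doc3 doc1 doc7 doc2 /;    # made-up retrieval order
my @relevant = qw/ doc1 doc2 doc5 /;         # made-up relevancy judgments
my ($p, $r) = precision_and_recall_at_ranks(\@ranked, \@relevant);
print "Precision at rank => @{[ map { sprintf '%.3f', $_ } @$p ]}\n";
print "Recall at rank    => @{[ map { sprintf '%.3f', $_ } @$r ]}\n";
```

Precision at a rank is the fraction of the documents retrieved up to that rank that are relevant; recall is the fraction of all relevant documents that have been retrieved by that rank.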