Lingua-EN-NameParse
Applied correct capitalization to single possessive word, such as French's
Added Dalle, dela and dall' to list of Italian surname prefixes
Added San to list of Spanish surname prefixes
Detect reserved words, such as Pty Ltd in pre parse stage
1.16 24 Sep 2001:
Minor additions to README file
1.15 25 Jul 2001:
Added more complete list to surname_prefs.txt
Allowed for a surname prefix of Dell', as in Dell'Arte
Added case_all_reversed method to return name in the format of
surname, initials and/or given_names, useful for alphabetical sorting
Length of given name set to at least 2 characters for name type John_A_Smith
Names such as "Al B Jones" now parse correctly
Length of given name set to at least 2 characters for name type John_Adam_Smith
MANIFEST
README
LICENCE
Changes
Makefile.PL
examples/demo.pl
lib/Lingua/EN/NameParse.pm
lib/Lingua/EN/NameParse/Grammar.pm
surname_prefs.txt
t/main.t
t/rules.t
t/pod.t
t/pod-coverage.t
META.yml
META.json
lib/Lingua/EN/NameParse.pm view on Meta::CPAN
=head2 case_all
$correct_casing = $name->case_all;
The C<case_all> method converts the first letter of each component to
capitals and the remainder to lower case, with the following exceptions:
    initials remain capitalised
    surname spellings such as MacNay-Smith, O'Brien and Van Der Heiden are preserved
    - see C<surname_prefs.txt> for user defined exceptions
A complete definition of the capitalising rules can be found by studying
the C<case_surname> function.
The method returns the entire cased name as text.
=head2 case_all_reversed
$correct_casing = $name->case_all_reversed;
lib/Lingua/EN/NameParse.pm view on Meta::CPAN
=head2 case_surname
$correct_casing = case_surname("DE SILVA-MACNAY" [,$lc_prefix]);
C<case_surname> is a stand-alone function that does not require a name
object. The input is a text string. An optional input argument controls the
casing rules for prefix portions of a surname, as described above in the
C<lc_prefix> section.
The output is a string converted to the correct casing for surnames.
See C<surname_prefs.txt> for user defined exceptions.
This function is useful when you know you are dealing only with names that
do not contain initials, such as "Mr John Jones". It is much faster than the
C<case_all> method, but it does not understand context and cannot detect
errors in strings that are not personal names.
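The basic casing step can be illustrated with a minimal sketch. It is not the module's C<case_surname>: the helper name C<simple_case_surname> is hypothetical, and it covers only the lowercase-then-capitalise-on-word-boundaries step, with no handling of C<lc_prefix>, C<surname_prefs.txt> or special spellings.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; the real case_surname
# applies further rules that are not reproduced here.
sub simple_case_surname
{
    my ($surname) = @_;
    return '' unless defined $surname && length $surname;

    # Lowercase everything first
    $surname = lc $surname;

    # Uppercase the first word character after every word boundary.
    # A boundary occurs at the start of the string and after a space,
    # hyphen or apostrophe.
    $surname =~ s/\b(\w)/\u$1/g;

    return $surname;
}

print simple_case_surname("D'ANGELO"), "\n";
print simple_case_surname("smith-JONES"), "\n";
```

A word-boundary rule on its own also capitalises the letter after the apostrophe in a possessive such as French's, so a production routine needs extra rules on top of this step.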
=head2 surname_prefs.txt
Some surnames can have more than one form of valid capitalisation, such as
MacQuarie or Macquarie. Where the user wants to specify one form as the default,
a text file called surname_prefs.txt should be created and placed in the same
location as the NameParse module. The text file should contain one surname per
line, in the capitalised form you want, such as:
    Macquarie
    MacHado
NameParse will still operate if the file does not exist.
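The lookup this file enables can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the in-memory string C<$prefs_text> stands in for a real C<surname_prefs.txt>, and the names C<%surname_preferences> (a lexical here) and C<preferred_surname> are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of surname_prefs.txt
my $prefs_text = "Macquarie\nMacHado\n";

# Lower-cased surname => preferred capitalisation
my %surname_preferences;

# Read the "file" through an in-memory filehandle, one surname per line
open(my $fh, '<', \$prefs_text) or die "Cannot open in-memory file: $!";
while (my $line = <$fh>)
{
    chomp $line;
    next unless length $line;    # ignore blank lines
    $surname_preferences{lc $line} = $line;
}
close($fh);

# Return the preferred form, or undef when the user has not listed one
sub preferred_surname
{
    my ($surname) = @_;
    return $surname_preferences{lc $surname};
}

print preferred_surname('MACQUARIE') // '(no preference)', "\n";
print preferred_surname('smith')     // '(no preference)', "\n";
```

Keying the hash on the lower-cased surname makes the comparison case insensitive, so input in any casing maps to the single form listed in the file.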
=head2 salutation
lib/Lingua/EN/NameParse.pm view on Meta::CPAN
$component_value .= ',';
}
push(@cased_name_reversed,$component_value);
}
}
}
return(join(' ',@cased_name_reversed));
}
#-------------------------------------------------------------------------------
# The user may specify their own preferred spelling for surnames.
# These should be placed in a text file called surname_prefs.txt
# in the same location as the module itself.
BEGIN
{
    # Obtain the full path to NameParse module, defined in the %INC hash.
    my $prefs_file_location = $INC{"Lingua/EN/NameParse.pm"};
    # Now substitute the name of the preferences file
    $prefs_file_location =~ s/NameParse\.pm$/surname_prefs.txt/;
    # Three-argument open with a lexical filehandle; a missing file is not an error
    if ( open(my $prefs_fh, '<', $prefs_file_location) )
    {
        while ( my $name = <$prefs_fh> )
        {
            chomp($name);
            # Skip blank lines so that an empty key is never stored
            next unless length($name);
            # Build hash, lower case name is key for case insensitive
            # comparison, while value holds the actual capitalisation
            $Lingua::EN::surname_preferences{lc($name)} = $name;
        }
        close($prefs_fh);
lib/Lingua/EN/NameParse.pm view on Meta::CPAN
sub case_surname
{
    my ($surname,$lc_prefix) = @_;
    unless ($surname)
    {
        return('');
    }
    # If the user has specified a preferred capitalisation for this
    # surname in the surname_prefs.txt, it should be returned now.
    if ($Lingua::EN::surname_preferences{lc($surname)} )
    {
        return($Lingua::EN::surname_preferences{lc($surname)});
    }
    # Lowercase everything
    $surname = lc($surname);
    # Now uppercase first letter of every word. By checking on word boundaries,
    # we will account for apostrophes (D'Angelo) and hyphenated names