Lingua-EN-NameParse


Changes  view on Meta::CPAN

    Applied correct capitalization to single possessive word, such as French's
    Added Dalle, dela and dall' to list of Italian surname prefixes
    Added San to list of Spanish surname prefixes
    Detect reserved words, such as Pty Ltd, in the pre-parse stage


1.16 24 Sep 2001:
    Minor additions to README file

1.15 25 Jul 2001:
    Added more complete list to surname_prefs.txt

    Allowed for a surname prefix of Dell', as in Dell'Arte

    Added case_all_reversed method to return name in the format of
    surname, initials and/or given_names, useful for alphabetical sorting

    Length of given name set to at least 2 characters for name type John_A_Smith
    Names such as "Al B Jones" now parse correctly

    Length of given name set to at least 2 characters for name type John_Adam_Smith

MANIFEST  view on Meta::CPAN

MANIFEST
README
LICENCE
Changes
Makefile.PL
examples/demo.pl
lib/Lingua/EN/NameParse.pm
lib/Lingua/EN/NameParse/Grammar.pm
surname_prefs.txt
t/main.t
t/rules.t
t/pod.t
t/pod-coverage.t
META.yml
META.json

lib/Lingua/EN/NameParse.pm  view on Meta::CPAN


=head2 case_all

    $correct_casing = $name->case_all;

The C<case_all> method converts the first letter of each component to
upper case and the remainder to lower case, with the following exceptions:

   initials remain capitalised
   surname spellings such as MacNay-Smith, O'Brien and Van Der Heiden are preserved
   - see C<surname_prefs.txt> for user defined exceptions

A complete definition of the capitalising rules can be found by studying
the case_surname function.

The method returns the entire cased name as text.

=head2 case_all_reversed

    $correct_casing = $name->case_all_reversed;

lib/Lingua/EN/NameParse.pm  view on Meta::CPAN

=head2 case_surname

   $correct_casing = case_surname("DE SILVA-MACNAY" [,$lc_prefix]);

C<case_surname> is a standalone function that does not require a name
object. The input is a text string. An optional input argument controls the
casing rules for prefix portions of a surname, as described above in the
C<lc_prefix> section.

The output is a string converted to the correct casing for surnames.
See C<surname_prefs.txt> for user defined exceptions.

This function is useful when you know you are only dealing with names that
contain no initials, such as "Mr John Jones". It is much faster than the
C<case_all> method, but it does not understand context and cannot detect
errors in strings that are not personal names.


=head2 surname_prefs.txt

Some surnames can have more than one form of valid capitalisation, such as
MacQuarie or Macquarie. Where the user wants to specify one form as the default,
a text file called surname_prefs.txt should be created and placed in the same
directory as the NameParse module. The text file should contain one surname per
line, in the capitalised form you want, such as:

   Macquarie
   MacHado

NameParse will still operate if the file does not exist.
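
The lookup described above can be sketched in a few lines of Perl. This is an illustrative sketch only, not the module's own code: the in-memory file contents, the C<%surname_preferences> lexical and the C<preferred_surname> helper are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of surname_prefs.txt.
my $prefs_text = "Macquarie\nMacHado\n";

# Read from an in-memory filehandle so the sketch needs no file on disk.
open(my $fh, '<', \$prefs_text) or die "Cannot open in-memory file: $!";

my %surname_preferences;
while ( my $name = <$fh> )
{
    chomp($name);
    next unless length($name);    # skip blank lines
    # Lower case key gives a case insensitive lookup,
    # the value keeps the preferred capitalisation.
    $surname_preferences{lc($name)} = $name;
}
close($fh);

# Hypothetical helper: return the preferred form, or undef if none is listed.
sub preferred_surname
{
    my ($surname) = @_;
    return $surname_preferences{lc($surname)};
}

print preferred_surname('MACQUARIE'), "\n";
print preferred_surname('machado'), "\n";
```

Because the key is always lower case, any input casing of a listed surname maps to the single form given in the file. Surnames that are not listed return undef, so a caller can fall back to its general casing rules.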

=head2 salutation

lib/Lingua/EN/NameParse.pm  view on Meta::CPAN

                    $component_value .= ',';
                }
                push(@cased_name_reversed,$component_value);
            }
        }
    }
    return(join(' ',@cased_name_reversed));
}
#-------------------------------------------------------------------------------
# The user may specify their own preferred spelling for surnames.
# These should be placed in a text file called surname_prefs.txt
# in the same location as the module itself.

BEGIN
{
   # Obtain the full path to NameParse module, defined in the %INC hash.
   my $prefs_file_location = $INC{"Lingua/EN/NameParse.pm"};
   # Now substitute the name of the preferences file
   $prefs_file_location =~ s/NameParse\.pm$/surname_prefs.txt/;

   # Three-argument open with a lexical filehandle avoids mode injection
   # through the file name and keeps the handle scoped to this block.
   if ( open(my $preferences_fh, '<', $prefs_file_location) )
   {
      while ( my $name = <$preferences_fh> )
      {
         chomp($name);
         # Build hash, lower case name is key for case insensitive
         # comparison, while value holds the actual capitalisation
         $Lingua::EN::surname_preferences{lc($name)} = $name;
      }
      close($preferences_fh);

lib/Lingua/EN/NameParse.pm  view on Meta::CPAN

sub case_surname
{
    my ($surname,$lc_prefix) = @_;

    unless ($surname)
    {
        return('');
    }

    # If the user has specified a preferred capitalisation for this
    # surname in the surname_prefs.txt, it should be returned now.
    if ($Lingua::EN::surname_preferences{lc($surname)} )
    {
        return($Lingua::EN::surname_preferences{lc($surname)});
    }

    # Lowercase everything
    $surname = lc($surname);

    # Now uppercase first letter of every word. By checking on word boundaries,
    # we will account for apostrophes (D'Angelo) and hyphenated names
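
The word-boundary step described in the comment above can be illustrated with a small standalone sketch. It assumes a single substitution on C<\b(\w)> and uses a hypothetical helper named C<capitalise_words>. It is not the module's C<case_surname>, which applies further rules not shown in this excerpt.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
sub capitalise_words
{
    my ($text) = @_;

    # Lowercase everything first
    $text = lc($text);

    # Uppercase any word character that follows a word boundary. An
    # apostrophe or hyphen is a non-word character, so the letter after
    # it also starts a new "word".
    $text =~ s/\b(\w)/\u$1/g;

    return $text;
}

print capitalise_words("D'ANGELO"), "\n";      # D'Angelo
print capitalise_words("SMITH-JONES"), "\n";   # Smith-Jones
print capitalise_words("FRENCH'S"), "\n";      # French'S
```

The last call shows the limit of this simple rule: the letter after a possessive apostrophe is capitalised as well, so a complete implementation needs a follow-up correction for forms such as French's.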


