AI-MicroStructure
view release on metacpan or search on metacpan
bin/micro-dict view on Meta::CPAN
stop=$(perl -MAI::MicroStructure::WordBlacklist -E "my \$s=AI::MicroStructure::WordBlacklist::getStopWords('de'); my @s = keys %\$s; print join('|',@s);")
IFS=$'\n';
$cmd $1 | tr A-Z a-z | # Convert to lowercase.
tr ' ' '_' | # New: change spaces to newlines.
#tr -cd '\012[a-z][0-9]' | # Get rid of everything
#+ non-alphanumeric (in orig. script).
tr -c '\012a-z' '\012' | # Rather than deleting non-alpha
egrep -v '^#' | # Delete lines starting with hashmark.
egrep -v "^[ ]*([A-Za-z][A-Za-z]|[A-Za-z])$" | egrep -v "^$" | egrep -v -i "^ (denkbarer|ganze|bez|ver�ffentlichtes|uns�gliches|ungew�hnliche|vollstaendig|erstem|Inf.|titel|unsaeglichem|beforehand|denkbares|yours|contains|gedurft|seithe...
stop=$(perl -MAI::MicroStructure::WordBlacklist -E "my \$s=AI::MicroStructure::WordBlacklist::getStopWords('de'); my @s = keys %\$s; print join('|',@s);")
cat /tmp/micro-dict.tmp | sort -n | egrep -v "^.*.[\ ].*.[1-9][\:][\ ][\ ]($stop)";
#if [ ! "$(echo "$stop" | egrep -i zzzzzzzzzzzz)" ]; then echo cool; fi
lib/AI/MicroStructure/WordBlacklist.pm view on Meta::CPAN
use strict;
use warnings;
use Exporter;
our @ISA = qw(Exporter);
our %EXPORT_TAGS = ( 'all' => [ qw( getStopWords ) ] );
our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
sub getStopWordsSmall{
my @search = ("a","a's","able","about","above","according","accordingly","across","actually","after","afterwards","again","against","ain't","all","allow","allows","almost","alone","along","already","also","although","always","am","among","amongst","a...
return @search;
}
sub getStopWords {
if ( @_ and $_[0] eq 'UTF-8' ) {
# adding U0 causes the result to be flagged as UTF-8
my %stoplist = map { ( pack("U0a*", $_), 1 ) } qw(
a able about above according accordingly across actually after afterwards again against aint all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywhere a...
b be became because become becomes becoming been before beforehand behind being believe below beside besides best better between beyond both brief but by
c came can cannot cant cant cause causes certain certainly changes clearly cmon co com come comes concerning consequently consider considering contain containing contains corresponding could couldnt course cs currently
d definitely described despite did didnt different do does doesnt doing done dont down downwards during
e each edu eg eight either else elsewhere enough entirely especially et etc even ever every everybody everyone everything everywhere ex exactly example except
f far few fifth first five followed following follows for former formerly forth four from further furthermore
g get gets getting given gives go goes going gone got gotten greetings
h had hadnt happens hardly has hasnt have havent having he hello help hence her here hereafter hereby herein heres hereupon hers herself hes hi him himself his hither hopefully how howbeit however
i id ie if ignored ill im immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isnt it itd itll its its itself ive
j just k keep keeps kept know known knows
l last lately later latter latterly least less lest let lets like liked likely little look looking looks ltd
m mainly many may maybe me mean meanwhile merely might more moreover most mostly much must my myself
n name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere
o obviously of off often oh ok okay old on once one ones only onto or other others otherwise ought our ours ourselves out outside over overall own
p particular particularly per perhaps placed please plus possible presumably probably provides
q que quite qv r rather rd re really reasonably regarding regardless regards relatively respectively right
s said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several shall she should shouldnt since six so some somebody somehow someone something sometime sometimes some...
t take taken tell tends th than thank thanks thanx that thats thats the their theirs them themselves then thence there thereafter thereby therefore therein theres theres thereupon these they theyd theyll theyre theyve think third this thorough thorou...
u un under unfortunately unless unlikely until unto up upon us use used useful uses using usually uucp
v value various very via viz vs
w want wants was wasnt way we wed welcome well well went were were werent weve what whatever whats when whence whenever where whereafter whereas whereby wherein wheres whereupon wherever whether which while whither who whoever whole whom whos whose w...
lib/AI/MicroStructure/WordBlacklist.pm view on Meta::CPAN
example except f far few fifth first five followed following follows
for former formerly forth four from further furthermore g get gets
getting given gives go goes going gone got gotten greetings h had
hadn't happens hardly has hasn't have haven't having he he's hello
help hence her here here's hereafter hereby herein hereupon hers
herself hi him himself his hither hopefully how howbeit however
i i'd i'll i'm i've ie if ignored immediate in inasmuch inc indeed
indicate indicated indicates inner insofar instead into inward is
isn't it it'd it'll it's its itself j just k keep keeps kept know
knows known l last lately later latter latterly least less lest let
let's like liked likely little look looking looks ltd m mainly many
may maybe me mean meanwhile merely might more moreover most
mostly much must my myself n name namely nd near nearly necessary
need needs neither never nevertheless new next nine no nobody non
none noone nor normally not nothing novel now nowhere o obviously of
off often oh ok okay old on once one ones only onto or other others otherwise ought
our ours ourselves out outside over overall own
p particular particularly per perhaps placed please plus possible presumably probably
provides q que quite qv r rather rd re really reasonably regarding regardless regards
relatively respectively right s said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven
several shall she should shouldn't since six
lib/AI/MicroStructure/WordBlacklist.pm view on Meta::CPAN
a able about above according accordingly across actually after afterwards again against aint all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywhere a...
b be became because become becomes becoming been before beforehand behind being believe below beside besides best better between beyond both brief but by
c came can cannot cant cant cause causes certain certainly changes clearly cmon co com come comes concerning consequently consider considering contain containing contains corresponding could couldnt course cs currently
d definitely described despite did didnt different do does doesnt doing done dont down downwards during
e each edu eg eight either else elsewhere enough entirely especially et etc even ever every everybody everyone everything everywhere ex exactly example except
f far few fifth first five followed following follows for former formerly forth four from further furthermore
g get gets getting given gives go goes going gone got gotten greetings
h had hadnt happens hardly has hasnt have havent having he hello help hence her here hereafter hereby herein heres hereupon hers herself hes hi him himself his hither hopefully how howbeit however
i id ie if ignored ill im immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isnt it itd itll its its itself ive
j just k keep keeps kept know known knows
l last lately later latter latterly least less lest let lets like liked likely little look looking looks ltd
m mainly many may maybe me mean meanwhile merely might more moreover most mostly much must my myself
n name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere
o obviously of off often oh ok okay old on once one ones only onto or other others otherwise ought our ours ourselves out outside over overall own
p particular particularly per perhaps placed please plus possible presumably probably provides
q que quite qv r rather rd re really reasonably regarding regardless regards relatively respectively right
s said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several shall she should shouldnt since six so some somebody somehow someone something sometime sometimes some...
t take taken tell tends th than thank thanks thanx that thats thats the their theirs them themselves then thence there thereafter thereby therefore therein theres theres thereupon these they theyd theyll theyre theyve think third this thorough thorou...
u un under unfortunately unless unlikely until unto up upon us use used useful uses using usually uucp
v value various very via viz vs
w want wants was wasnt way we wed welcome well well went were were werent weve what whatever whats when whence whenever where whereafter whereas whereby wherein wheres whereupon wherever whether which while whither who whoever whole whom whos whose w...
lib/AI/MicroStructure/WordBlacklist.pm view on Meta::CPAN
i'd you'd he'd she'd we'd they'd i'll you'll he'll she'll we'll
they'll isn't aren't wasn't weren't hasn't haven't hadn't
doesn't don't didn't won't wouldn't shan't shouldn't can't
cannot couldn't mustn't let's that's who's what's here's
there's when's where's why's how's a an the and but if or
because as until while of at by for with about against between
into through during before after above below to from up down in
out on off over under again further then once here there when
where why how all any both each few more most other some such
no nor not only own same so than too very
a a's able about above according accordingly across actually after afterwards again against ain't all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywh...
qqq rrr sss ttt uuu vvv www xxx yyy zzz .... unsere ihrer uns wurde wer gegen diesem bis nur wieder unserem einer war man bei wir einen vom einem unter jeder werden wie als durch zum hat vor unseres email bel ihnen unseren bzw lieber uft kommen nicht...
anderweitigen anderweitiger anderweitiges anerkannt anerkannte anerkannter anerkanntes anfangen anfing angefangen angesetze angesetzt angesetzten angesetzter ans anscheinend ansetzen ansonst ansonsten anstatt anstelle arbeiten auch auf aufgehört aufg...
bessere besserem besseren besserer besseres bestehen besteht bestenfalls bestimmt bestimmte bestimmtem bestimmten bestimmter bestimmtes betraechtlich betraechtliche betraechtlichem betraechtlichen betraechtlicher betraechtliches betreffend betreffend...
diesseitiges diesseits dinge dir direkt direkte direkten direkter doch doppelt dort dorther dorthin dran drauf drei dreißig drin dritte drueber drum drunter drüber du dunklen durch durchaus durchweg durchwegs durfte durften dürfen dürfte eben ebenfal...
entsprechender entsprechendes entweder er ergo ergänze ergänzen ergänzte ergänzten erhalten erhielt erhielten erhält erneut erst erste erstem ersten erster erstere ersterem ersteren ersterer ersteres erstes eröffne eröffnen eröffnet eröffnete eröffne...
häufige häufigem häufigen häufiger häufigere häufigeren häufigerer häufigeres höchst höchstens ich igitt ihm ihn ihnen ihr ihre ihrem ihren ihrer ihres ihretwegen im immer immerhin immerwaehrend immerwaehrende immerwaehrendem immerwaehrenden immerwae...
jeglichen jeglicher jegliches jemals jemand jene jenem jenen jener jenes jenseitig jenseitigem jenseitiger jenseits jetzt jährig jährige jährigem jährigen jähriges kaeumlich kam kann kannst kaum kein keine keinem keinen keiner keinerlei keines keines...
naechste naemlich nahm naturgemaess naturgemaeß naturgemäss naturgemäß natürlich neben nebenan nehmen nein neu neue neuem neuen neuer neuerdings neuerlich neuerliche neuerlichem neuerlicher neuerliches neues neulich neun nicht nichts nichtsdestotrotz...
seine seinem seinen seiner seines seit seitdem seite seiten seither selbe selben selber selbst selbstredend selbstredende selbstredendem selbstredenden selbstredender selbstredendes seltsamerweise senke senken senkt senkte senkten setzen setzt setzte...
unmaßgeblichem unmaßgeblichen unmaßgeblicher unmaßgebliches unmoeglich unmoegliche unmoeglichem unmoeglichen unmoeglicher unmoegliches unmöglich unmögliche unmöglichen unmöglicher unnötig uns unsaeglich unsaegliche unsaeglichem unsaeglichen unsaeglic...
( run in 0.294 second using v1.01-cache-2.11-cpan-64827b87656 )