App-wdq
. join( ' ] [ ',
cName('MODE'), cName('OPTIONS'),
cName('REQUEST') . ' | ' . cName('< REQUEST_FILE') )
. " ]\n",
-sections => ['SYNOPSIS'],
);
$help =~ s/\n\n -/\n -/gm;
$help =~ s/^ / /mg;
}
print $help;
exit;
}
if ( $OPT{man} ) {
my $module = $OPT{color} ? 'Pod::Text::Color' : 'Pod::Text';
# may fail if pure script installed by hand
eval "require $module; require App::wdq"; ## no critic
$module->new->parse_from_file( $INC{'App/wdq.pm'} // $0 );
exit;
}
# default SPARQL endpoint
$OPT{api} //= 'https://query.wikidata.org/bigdata/namespace/wdq/sparql';
# add default prefixes by default
$OPT{'default-prefixes'} //= 1;
# include header in output
$OPT{header} //= $OPT{H} ? 0 : 1;
# limit given as single digit option
foreach ( grep { $OPT{$_} } 1 .. 9 ) {
$OPT{limit} = $_ if !$OPT{limit} or $OPT{limit} > $_;
}
# validate language and set default value if missing
$OPT{language} //= do { my $l = $ENV{LANG} // 'en'; $l =~ s/_.*//; $l };
$OPT{language} = lc( $OPT{language} );
if ( grep { $_ !~ $LANGUAGE_PATTERN } split ",", $OPT{language} ) {
warning("invalid language(s): $OPT{language}");
exit 1;
}
# disable all requests
if ( $OPT{N} ) {
$OPT{'no-mediawiki'} = 1;
$OPT{'no-execute'} = 1;
}
my $MODE = !@ARGV ? 'query' : do {
my $arg = $ARGV[0];
$arg =~ s/^\s*|\s*$//g;
if ( $arg =~ /^(query|lookup|p?search)$/ ) {
lc( shift @ARGV );
}
elsif ( $arg =~ $ENTITY_PATTERN or $arg =~ $SITELINK_PATTERN ) {
'lookup';
}
else {
my $guess = () = $arg =~ /[a-z]+:[^\s]/gi;
$guess += () = $arg =~ /<[^>]+>/g;
$guess += () = $arg =~ /[?\$][^\s]/g;
if ( $guess > 2 ) {
warning("ignoring additional command line argument")
if $OPT{query} or @ARGV > 1;
'query';
}
else {
'search';
}
}
};
# default output format
if ( ( $OPT{format} // '' ) =~ /\{[^}]+\}/ ) {
$OPT{pretty} = $OPT{format};
$OPT{format} = 'text';
}
else {
$OPT{format} =
lc( $OPT{format} // ( $MODE =~ /^p?search$/ ? 'text' : 'simple' ) );
if ( $OPT{format} eq 'text' ) {
$OPT{pretty} //=
$OPT{count}
? "{count|style=v}"
: "{id|style=i}{label|style=v|pre=: }"
. "{alias|pre= (|post=)|style=v}"
. "{description|length=78|pre=\n }";
$OPT{ids} //= 1;
}
}
# require only if actually needed
require RDF::Query;
# monkey-patch RDF::Query to keep minimum required version at Ubuntu 14.04 LTS
require version;
if ( version->parse($RDF::Query::VERSION) < version->parse('2.915_01') ) {
require RDF::Query::Parser::SPARQL;
*RDF::Query::Node::Resource::as_sparql = sub {
my $self = shift;
my $context = shift || {};
my $uri = $self->uri_value;
my $ns = $context->{namespaces} || {};
my %ns = %$ns;
foreach my $k ( keys %ns ) {
no warnings 'uninitialized';
if ( $k eq '__DEFAULT__' ) {
$k = '';
}
my $v = $ns{$k};
if ( index( $uri, $v ) == 0 ) {
my $local = substr( $uri, length($v) );
if ( $local =~ $RDF::Query::Parser::SPARQL::r_PN_LOCAL ) {
my $qname = join( ':', $k, $local );
return $qname;
}
}
}
'<' . URI->new( encode_utf8( $self->uri_value ) )->canonical . '>';
}
}
wdq psearch -g es parte # search property 'parte' in Spanish
wdq P361 Q544 # lookup properties and items
wdq '?c wdt:P361 wd:Q544' # query parts of the solar system
See the manual for details or get help via C<wdq help>:
wdq help options # list and explain command line options
wdq help modes # list and explain request modes
wdq help output # explain output control
wdq help formats # list and explain output formats
wdq help ontology # show Wikidata ontology in a nutshell
wdq help prefixes # list RDF prefixes allowed in queries
wdq help version # show version of wdq
=head1 DESCRIPTION
The command line script C<wdq>, included in the CPAN module L<App::wdq>, provides a
tool to access the L<Wikidata Query Service|https://query.wikidata.org/>. It
supports formulation and execution of L<SPARQL SELECT
queries|http://www.w3.org/TR/sparql11-query/#select> to extract selected
information from Wikidata or other Wikibase instances.
=head1 INSTALLATION
Perl should already be installed on most operating systems. Otherwise,
L<get Perl|https://www.perl.org/get.html>.
=head2 FROM CPAN
Install sources from CPAN including all dependencies:
cpanm App::wdq
First L<install cpanm|https://github.com/miyagawa/cpanminus/#installation> if
missing. If installation of C<App::wdq> fails, try the cpanm option C<--notest> or
install the dependencies as packages as described below.
=head2 PREBUILT PACKAGES
Install dependencies as prebuilt packages for your operating system:
# Debian based systems e.g. Ubuntu (>= 14.04)
sudo apt-get install libhttp-tiny-perl librdf-query-perl
# Windows/ActiveState
ppm install HTTP-Tiny
ppm install RDF-Query
Then install C<wdq> from CPAN as described above or copy the script to some
place in your C<$PATH>:
wget https://github.com/nichtich/wdq/raw/master/script/wdq
chmod +x wdq
The latter method will not install this documentation.
=head1 MODES
Request mode C<query> (default), C<lookup>, C<search>, or C<psearch> can be
set explicitly via the first argument; otherwise it is guessed from the arguments.
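The guessing step can be sketched as follows. This is a simplified illustration modelled on the script's mode selection, not a copy of it: the helper name C<guess_mode> is hypothetical, and the two patterns are assumed stand-ins for the script's own C<$ENTITY_PATTERN> and C<$SITELINK_PATTERN>, which are defined elsewhere and may differ.

```perl
use strict;
use warnings;

# Assumed, simplified patterns; the script's real patterns may differ.
my $ENTITY_PATTERN   = qr/^[PQ][0-9]+$/i;
my $SITELINK_PATTERN = qr{^https?://[a-z-]+\.wikipedia\.org/wiki/.+}i;

# Hypothetical helper: guess the request mode from the first argument.
sub guess_mode {
    my ($arg) = @_;
    return 'query' unless defined $arg;    # no argument: default mode
    $arg =~ s/^\s+|\s+$//g;

    # An explicit mode name wins.
    return lc($arg) if $arg =~ /^(?:query|lookup|p?search)$/;

    # Entity ids and Wikimedia project URLs are looked up.
    return 'lookup'
      if $arg =~ $ENTITY_PATTERN or $arg =~ $SITELINK_PATTERN;

    # Count SPARQL-like tokens: prefixed names, <IRIs>, and variables.
    my $guess = () = $arg =~ /[a-z]+:[^\s]/gi;
    $guess += () = $arg =~ /<[^>]+>/g;
    $guess += () = $arg =~ /[?\$][^\s]/g;

    # More than two such tokens looks like a query, otherwise a search.
    return $guess > 2 ? 'query' : 'search';
}

print guess_mode($_), "\n" for 'Q1', '?c wdt:P361 wd:Q544', 'Pippi';
```

Under these assumptions a bare entity id selects C<lookup>, a triple pattern with a variable and two prefixed names selects C<query>, and a plain word falls back to C<search>.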
=head2 query
Read SPARQL query from STDIN, option C<--query|-q>, or argument. Namespace
definitions and C<SELECT> clause are added if missing.
wdq '?c wdt:P361 wd:Q544'
wdq '{ ?c wdt:P361 wd:Q544 }' # equivalent
wdq 'SELECT * WHERE { ?c wdt:P361 wd:Q544 }' # equivalent
wdq < queryfile
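To illustrate why the three forms above are equivalent, here is a minimal sketch of such an expansion. The helper C<expand_query> is hypothetical and is not the script's actual implementation; it only handles C<SELECT> queries and knows just the two prefixes used in the examples.

```perl
use strict;
use warnings;

# Two Wikidata prefixes used in the examples; a real run would know more.
my %PREFIXES = (
    wd  => 'http://www.wikidata.org/entity/',
    wdt => 'http://www.wikidata.org/prop/direct/',
);

# Hypothetical helper, not the script's own code.
sub expand_query {
    my ($query) = @_;
    $query =~ s/^\s+|\s+$//g;

    # Add braces and a SELECT clause if the query has no SELECT yet.
    if ( $query !~ /\bSELECT\b/i ) {
        $query = "{ $query }" unless $query =~ /^\{.*\}$/s;
        $query = "SELECT * WHERE $query";
    }

    # Prepend PREFIX declarations for namespaces used but not declared.
    my $prolog = '';
    for my $p ( sort keys %PREFIXES ) {
        next unless $query =~ /\b\Q$p\E:/;
        next if $query =~ /PREFIX\s+\Q$p\E:/i;
        $prolog .= "PREFIX $p: <$PREFIXES{$p}>\n";
    }
    return $prolog . $query;
}

print expand_query('?c wdt:P361 wd:Q544'), "\n";
```

With this sketch, all three example queries expand to the same text: the two C<PREFIX> lines followed by C<SELECT * WHERE { ?c wdt:P361 wd:Q544 }>.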
=head2 lookup
Read Wikidata entity ids, URLs, or Wikimedia project URLs from STDIN or
arguments. Result fields are C<label>, C<description>, and C<id>:
wdq Q1
wdq lookup Q1 # equivalent
echo Q1 | wdq lookup # equivalent
wdq http://de.wikipedia.org/wiki/Universum # same result
=encoding utf8
=head2 search / psearch
Search for items or properties. Result fields are C<label>, C<id>,
C<description>, and possibly matched C<alias>. Search and result language is
read from environment or option C<--language>/C<-g>:
wdq search -g sv Pippi Långstrump
Default output format in search mode is C<text>.
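How the language is taken from the environment can be sketched like this. The helper C<pick_language> is hypothetical, and the validation pattern is an assumed stand-in for the script's own C<$LANGUAGE_PATTERN>, which is defined elsewhere; the fallback to C<en> and the stripping of the territory suffix follow the script's option handling.

```perl
use strict;
use warnings;

# Assumed pattern; the script's real $LANGUAGE_PATTERN may differ.
my $LANGUAGE_PATTERN = qr/^[a-z]{2,8}(?:-[a-z0-9]+)*$/;

# Hypothetical helper: choose the language from the option or from LANG.
sub pick_language {
    my ( $option, $env_lang ) = @_;

    # Fall back to the environment, then to English.
    my $language = $option // do {
        my $l = $env_lang // 'en';
        $l =~ s/_.*//;    # "sv_SE.UTF-8" becomes "sv"
        $l;
    };
    $language = lc $language;

    # Several languages may be given as a comma-separated list.
    my @invalid = grep { $_ !~ $LANGUAGE_PATTERN } split /,/, $language;
    die "invalid language(s): $language\n" if @invalid;

    return $language;
}

print pick_language( undef,   'sv_SE.UTF-8' ), "\n";
print pick_language( 'DE,en', 'sv_SE.UTF-8' ), "\n";
```

An explicit C<--language> value takes precedence over the environment, and values are lowercased before validation.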
=head1 OPTIONS
=over
=item --query|-q QUERY
Query or query file (C<-> for STDIN as default)
=item --format|-f FORMAT|TEMPLATE
Output format or string template. Call C<wdq help formats> for details.
=item --export EXPORTER
Use a L<Catmandu> exporter as output format.
=item --no-header|-H
Exclude header in CSV output or other exporter.
=item --enumerate|-e
Enumerate results by adding a counter variable C<n>.
=item --limit INTEGER
Add or override a LIMIT clause to limit the number of results. Single-digit