Astro-SIMBAD-Client
lib/Astro/SIMBAD/Client.pm view on Meta::CPAN
Eventually the SOAP code will be removed. In the meantime all tests are
skipped unless C<ASTRO_SIMBAD_CLIENT_USE_SOAP> is true, and are marked
TODO. Support of SOAP by this module will be on a best-effort basis;
that is, if I can make it work without a huge amount of work I will --
otherwise SOAP will become unsupported.
=head1 DESCRIPTION
This package implements several query interfaces to version 4 of the
SIMBAD on-line astronomical database, as documented at
L<http://simbad.u-strasbg.fr/simbad4.htx>. B<This package will not work
with SIMBAD version 3.> Its primary purpose is to obtain SIMBAD data,
though some rudimentary parsing functionality also exists.
There are three ways to access these data.
- URL queries are essentially page scrapers, but their use is
documented, and output is available as HTML, text, or VOTable. URL
queries are implemented by the url_query() method.
- Scripts may be submitted using the script() or script_file() methods.
The former takes as its argument the text of the script, the latter
takes a file name.
- Queries may be made using the web services (SOAP) interface. The
query() method implements this, and queryObjectByBib,
queryObjectByCoord, and queryObjectById have been provided as
convenience methods. As of version 0.027_01, SOAP queries are
deprecated. See the L<NOTICE|/NOTICE> section above for the deprecation
schedule.
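The access paths listed above can be sketched as follows. This is a minimal sketch, not a tested recipe: the argument list passed to url_query(), the script text, and the SIMBAD_DEMO_LIVE environment variable are assumptions made for illustration. The variable exists only so that the example contacts the server when explicitly asked to.

```perl
use strict;
use warnings;

# Hypothetical helper that exercises the non-deprecated access paths once.
sub demo_access_paths {
    require Astro::SIMBAD::Client;
    my $simbad = Astro::SIMBAD::Client->new();

    # 1. URL query: assumed arguments (query type, then parameter pairs).
    my $by_url = $simbad->url_query( id => Ident => 'Arcturus' );

    # 2. Script query: the argument is the text of the script.
    my $by_script = $simbad->script( <<'EOD' );
format object "%IDLIST(NAME|1)\n"
query id Arcturus
EOD

    # 3. SOAP query (deprecated), shown for completeness but not run:
    # my $by_soap = $simbad->query( id => 'Arcturus' );

    return ( $by_url, $by_script );
}

# Only contact the server when explicitly asked to.
if ( $ENV{SIMBAD_DEMO_LIVE} ) {
    print "$_\n" for grep { defined } demo_access_paths();
}
```

script_file() would be called the same way as script(), but with the name of a file holding the script text.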
Astro::SIMBAD::Client is object-oriented. The object supplies not only
the URL scheme and SIMBAD server name, but also the default format and
output type for URL and web service queries.
A simple command line client application is also provided, as are
various examples in the F<eg> directory.
=head2 Methods
The following methods should be considered public:
=over 4
=cut
package Astro::SIMBAD::Client;
# We require Perl 5.008 because of MailTools, used by SOAP::Lite.
# Otherwise it would be 5.006 because of 'our'.
use 5.008;
use strict;
use warnings;
use Carp;
use LWP::UserAgent;
use LWP::Protocol;
use HTTP::Request::Common qw{POST};
use Scalar::Util 1.01 qw{looks_like_number};
use URI::Escape ();
# use XML::DoubleEncodedEntities;
# use Astro::SIMBAD::Client::WSQueryInterfaceService;
use constant HAVE_DOUBLE_ENCODED => do {
    local $@ = undef;
    eval { ## no critic (RequireCheckingReturnValueOfEval)
        require XML::DoubleEncodedEntities;
        1;
    };
};
use constant ARRAY_REF => ref [];
use constant CODE_REF => ref sub {};
my $have_time_hires;
BEGIN {
    $have_time_hires = eval {
        require Time::HiRes;
        Time::HiRes->import (qw{time sleep});
        1;
    };
    *_escape_uri = URI::Escape->can( 'uri_escape_utf8' )
        || URI::Escape->can( 'uri_escape' )
        || sub { return $_[0] };
}
our $VERSION = '0.048';
our @CARP_NOT = qw{Astro::SIMBAD::Client::WSQueryInterfaceService};
# TODO replace this with s///r if we ever get to the point where we
# require Perl 5.13.2 or greater.
sub _strip_returns {
    my ( $data ) = @_;
    $data =~ s/ \n //smxg;
    return $data;
}
use constant FORMAT_TXT_SIMPLE_BASIC => _strip_returns( <<'EOD' );
---\n
name: %IDLIST(NAME|1)\n
type: %OTYPE\n
long: %OTYPELIST\n
ra: %COO(d;A)\n
dec: %COO(d;D)\n
plx: %PLX(V)\n
pmra: %PM(A)\n
pmdec: %PM(D)\n
radial: %RV(V)\n
redshift: %RV(Z)\n
spec: %SP(S)\n
bmag: %FLUXLIST(B)[%flux(F)]\n
vmag: %FLUXLIST(V)[%flux(F)]\n
ident: %IDLIST[%*,]
EOD
use constant FORMAT_TXT_YAML_BASIC => _strip_returns( <<'EOD' );
---\n
lib/Astro/SIMBAD/Client.pm view on Meta::CPAN
warn "Debug - Parsed to:\n", YAML::Dump( \@rslt ), ' ';
};
return wantarray ? @rslt : \@rslt;
} else {
$debug
and warn "Debug - No parser for $arg{parser}";
return $rslt;
}
}
}
=item $value = $simbad->script_file ($filename);
This method submits the given script file to SIMBAD, returning the
result of the script. Unlike script(), the argument is the name of the
file containing the script, not the text of the script. However, if a
parser for 'script' has been specified, it will be applied to the
output.
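Unless the verbatim attribute is set, the method body below discards everything up to and including the C<::data::> marker in the response. The following standalone sketch shows that step in isolation. The helper name strip_to_data() and the sample response text are made up for illustration; real server output contains more sections and detail.

```perl
use strict;
use warnings;
use Carp;

# Illustrative response text; a real SIMBAD response will differ.
my $response = join "\n",
    '::script' . ':' x 40,
    '',
    'query id Arcturus',
    '',
    '::data' . ':' x 40,
    '',
    'name: NAME Arcturus',
    '';

# Hypothetical helper: drop everything through the ::data:: marker
# and the whitespace that follows it.
sub strip_to_data {
    my ( $text ) = @_;
    $text =~ s/.*?::data:+\s*//sm
        or croak 'No ::data:: section found';
    return $text;
}

print strip_to_data( $response );
```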
=cut
sub script_file {
    my ( $self, $file ) = @_;
    my $url = $self->__build_url( 'simbad/sim-script' );
    my $rqst = POST $url,
        Content_Type => 'form-data',
        Content => [
            submit => 'submit file',
            scriptFile => [$file, undef],
            # May need to specify Content_Type => application/octet-stream.
        ];
    my $resp = $self->_retrieve( $rqst );
    my $rslt = $resp->content or return;
    unless ($self->get ('verbatim')) {
        $rslt =~ s/.*?::data:+\s*//sm or croak $rslt;
    }
    if (my $parser = $self->_get_parser ('script')) {
        ## $rslt =~ s/.*?::data:+.?$//sm or croak $rslt;
        ## $rslt =~ s/\s+//sm;
        my @rslt = $parser->($rslt);
        return wantarray ? @rslt : \@rslt;
    } else {
        return $rslt;
    }
}
=item $simbad->set ($name => $value ...);
This method sets the value of the given L<attributes|/Attributes>. More
than one name/value pair may be specified. If called as a static method,
it sets the default value of the attribute.
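The object/static distinction can be illustrated with a stripped-down, standalone sketch of the dispatch-table pattern used in the code below. Demo::Settings is a hypothetical package, and its new() and get() are simplified stand-ins, not the module's own implementations.

```perl
use strict;
use warnings;

package Demo::Settings;    # hypothetical package, not part of this distribution

use Carp;
use Scalar::Util qw{looks_like_number};

my %static = ( delay => 3, debug => 0 );    # class-wide defaults

# Per-attribute validation, called as ( $self, $name, $value ).
my %transform = (
    delay => sub {
        ( looks_like_number( $_[2] ) && $_[2] >= 0 )
            or croak "Attribute '$_[1]' must be a non-negative number";
        return $_[2] + 0;
    },
);

my %mutator;
foreach my $key ( keys %static ) {
    $transform{$key} ||= sub { return $_[2] };
    $mutator{$key} ||= sub {
        # Objects hold their own copy; class calls update the defaults.
        my $hash = ref $_[0] ? $_[0] : \%static;
        return $hash->{ $_[1] } = $transform{ $_[1] }->( @_ );
    };
}

sub new {
    my ( $class ) = @_;
    return bless { %static }, $class;
}

sub get {
    my ( $self, $name ) = @_;
    my $hash = ref $self ? $self : \%static;
    return $hash->{$name};
}

sub set {
    my ( $self, @args ) = @_;
    while ( @args ) {
        my $name = shift @args;
        exists $mutator{$name}
            or croak "Error - Attribute '$name' is unknown";
        $mutator{$name}->( $self, $name, shift @args );
    }
    return $self;
}

package main;

my $obj = Demo::Settings->new();
$obj->set( delay => 1.5, debug => 1 );    # two pairs in one call
Demo::Settings->set( delay => 10 );       # static call changes the default
my $later = Demo::Settings->new();        # picks up the new default
```

In this sketch the existing object keeps its own delay of 1.5, while objects created after the static call start with the new default.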
=cut
{ # Begin local symbol block.
    my $ckpn = sub {
        (looks_like_number ($_[2]) && $_[2] >= 0)
            or croak "Attribute '$_[1]' must be a non-negative number";
        +$_[2];
    };
    my %mutator = (
        format => \&_set_hash,
        parser => \&_set_hash,
        scheme => \&_set_scheme,
        url_args => \&_set_hash,
    );
    my %transform = (
        delay => ($have_time_hires ?
            $ckpn :
            sub {+sprintf '%d', $ckpn->(@_) + .5}),
        format => sub {
            ## my ( $self, $name, $val, $key ) = @_;
            my ( $self, undef, $val ) = @_; # Name and key unused
            if ($val !~ m/\W/ && (my $code = eval {
                        $self->_get_coderef ($val)})) {
                $val = $code->();
            }
            $val;
        },
        parser => sub {
            ## my ( $self, $name, $val, $key ) = @_;
            my ( $self, undef, $val ) = @_; # Name and key unused
            if (!ref $val) {
                unless ($val =~ m/::/) {
                    my $pkg = $self->_parse_subroutine_name ($val);
                    $val = $pkg . '::' . $val;
                }
                $self->_get_coderef ($val); # Just to see if we can.
            } elsif ( CODE_REF ne ref $val ) {
                croak "Error - $_[1] value must be scalar or code reference";
            }
            $val;
        },
    );
    foreach my $key (keys %static) {
        $transform{$key} ||= sub {$_[2]};
        $mutator{$key} ||= sub {
            my $hash = ref $_[0] ? $_[0] : \%static;
            $hash->{$_[1]} = $transform{$_[1]}->(@_)
        };
    }
    sub set {
        my ($self, @args) = @_;
        croak "Error - First argument must be an @{[__PACKAGE__]} object"
            unless eval {$self->isa(__PACKAGE__)};
        while (@args) {
            my $name = shift @args;
            croak "Error - Attribute '$name' is unknown"
                unless exists $mutator{$name};
            $mutator{$name}->($self, $name, shift @args);
        }
        return $self;
    }
lib/Astro/SIMBAD/Client.pm view on Meta::CPAN
 $simbad->set (format => 'txt=%MAIN_ID\n');
does the same thing as the previous example. Specifying the key name
without an = sign deletes the key (e.g. set (format => 'txt')).
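The key=value convention can be illustrated with a small standalone sketch. The helper apply_key_spec() is hypothetical and is not the module's internal _set_hash(); it only mirrors the behavior described here, where a string containing an = sign sets a key and a bare key name deletes it.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a 'key=value' or bare 'key' string to a hash.
sub apply_key_spec {
    my ( $hash, $spec ) = @_;

    # Split on the first = only, so the value may itself contain = signs.
    my ( $key, $value ) = split /=/, $spec, 2;
    if ( defined $value ) {
        $hash->{$key} = $value;
    } else {
        delete $hash->{$key};
    }
    return $hash;
}

my %format = ( vo => 'placeholder' );             # illustrative starting state
apply_key_spec( \%format, 'txt=%MAIN_ID\n' );     # sets the 'txt' key
apply_key_spec( \%format, 'vo' );                 # deletes the 'vo' key
```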
The Astro::SIMBAD::Client class has the following attributes:
=over
=item autoload
This Boolean attribute determines whether setting the parser should
attempt to autoload its package.
The default is 1 (i.e. true).
=item debug
This integer attribute turns on debug output. It is unsupported in the
sense that the author makes no claim about what will happen if it is
non-zero.
The default value is 0.
=item delay
This numeric attribute sets the minimum delay in seconds between
requests, so as not to overload the SIMBAD server. If
L<Time::HiRes|Time::HiRes> can be loaded, you can set delays in
fractions of a second; otherwise the delays will be rounded to the
nearest second.
Delays are from the time of the last request to the server, no matter
which object issued the request. The delay can be set to 0, but not to a
negative number.
The default is 3.
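The validation and rounding rule can be sketched in isolation. The helper normalize_delay() is hypothetical; its second argument stands in for the module's internal check of whether Time::HiRes could be loaded.

```perl
use strict;
use warnings;
use Carp;
use Scalar::Util qw{looks_like_number};

# Hypothetical helper. $have_hires says whether Time::HiRes is available.
sub normalize_delay {
    my ( $value, $have_hires ) = @_;
    ( looks_like_number( $value ) && $value >= 0 )
        or croak 'Delay must be a non-negative number';

    # Without Time::HiRes, add .5 and truncate: round to the nearest second.
    return $have_hires
        ? $value + 0
        : sprintf( '%d', $value + .5 ) + 0;
}

# Whole-second rounding: 0 -> 0, 0.4 -> 0, 1.5 -> 2, 2.6 -> 3
printf "%s -> %s\n", $_, normalize_delay( $_, 0 ) for 0, 0.4, 1.5, 2.6;
```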
=item emulate_soap_queries
If this Boolean attribute is true, the methods that would normally use
the SOAP interface (that is, C<query()> and friends) use the script
interface instead.
The purpose of this attribute is to give the user a way to manage the
deprecation and ultimate removal of the SOAP interface from the SIMBAD
servers. It may go away once that interface disappears, but it will be
put through a deprecation cycle.
The default is false, but will become true once the University of
Strasbourg shuts down its SOAP server.
=item format
This attribute holds the default format for a given query()
output type. It is specified as a reference to a hash. See
L<http://simbad.u-strasbg.fr/guide/sim-fscript.htx> for how to specify
formats for each output type. Output type 'script' is used to specify a
format for the script() method.
The format can be specified either literally, or as a subroutine name or
code reference. A string is assumed to be a subroutine name if it looks
like one (i.e. matches (\w+::)*\w+), and if the given subroutine is
actually defined. If no namespace is specified, all namespaces in the
call tree are checked. If a code reference or subroutine name is
specified, that code is executed, and the result becomes the format.
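The resolution rule can be sketched as follows. This is a simplified, standalone illustration: resolve_format() and the Demo::Formats package are hypothetical, and the sketch only handles fully qualified subroutine names, so it does not reproduce the search through the namespaces in the call tree.

```perl
use strict;
use warnings;

package Demo::Formats;    # hypothetical package holding a format routine

sub main_id { return '%MAIN_ID\n' }

package main;

# Hypothetical helper: turn a format specification into format text.
sub resolve_format {
    my ( $val ) = @_;

    # A code reference is simply executed.
    return $val->() if 'CODE' eq ref $val;

    # A string that looks like a qualified subroutine name is executed
    # only if that subroutine is actually defined.
    if ( $val =~ m/ \A (?: \w+ :: )+ \w+ \z /smx ) {
        no strict 'refs';
        return &{$val}() if defined &{$val};
    }

    # Anything else is taken literally.
    return $val;
}

print resolve_format( 'Demo::Formats::main_id' ), "\n";
print resolve_format( '%COO(A D)' ), "\n";
```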
The following formats are defined in this module:
 FORMAT_TXT_SIMPLE_BASIC -
   a simple-to-parse text format providing basic information;
 FORMAT_TXT_YAML_BASIC -
   pseudo-YAML (parsable by YAML::Load) providing basic info;
 FORMAT_VO_BASIC -
   VOTable field names providing basic information.
The FORMAT_TXT_YAML_BASIC format attempts to provide data structured
similarly to the output of L<Astro::SIMBAD>, though
Astro::SIMBAD::Client does not bless the output into any class.
A simple way to examine these formats is (e.g.)
 use Astro::SIMBAD::Client;
 print Astro::SIMBAD::Client->FORMAT_TXT_YAML_BASIC;
Before a format is actually used it is preprocessed in a manner
depending on its intended output type. For 'vo' formats, leading and
trailing whitespace are stripped. For 'txt' and 'script' formats, line
breaks are stripped.
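The preprocessing rules can be sketched as below. The helper preprocess_format() is hypothetical, and the format strings passed to it are placeholders rather than the module's actual formats.

```perl
use strict;
use warnings;

# Hypothetical helper applying the per-type preprocessing described above.
sub preprocess_format {
    my ( $type, $format ) = @_;
    if ( $type eq 'vo' ) {
        # 'vo': strip leading and trailing whitespace.
        $format =~ s/ \A \s+ | \s+ \z //smxg;
    } else {
        # 'txt' and 'script': strip line breaks.
        $format =~ s/ \n //smxg;
    }
    return $format;
}

# Placeholder formats, for illustration only.
my $vo  = preprocess_format( 'vo',  "  field_one,field_two\n" );
my $txt = preprocess_format( 'txt', "name: %MAIN_ID\\n\ntype: %OTYPE\\n\n" );
```

Note that in the 'txt' case only real line breaks are removed; the literal backslash-n sequences, which SIMBAD interprets as newlines in its output, are left in place.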
The default specifies formats for output types 'txt' and 'vo'. The
'txt' default is FORMAT_TXT_YAML_BASIC; the 'vo' default is
FORMAT_VO_BASIC.
There is no way to specify a default format for the 'script_file'
method.
=item parser
This attribute specifies the parser for a given output type. The actual
value is a hash reference; the keys are valid output types, and the
values are as described below.
Parsers may be specified either by a code reference or by the name of
a subroutine. If a name is given and it is not qualified by a package
name, the calling package is assumed. The parser must be defined, and
must take as its sole argument the text to be parsed.
If the parser for a given output type is defined, query results of that
type will be passed to the parser, and the result returned. Otherwise
the query results will be returned verbatim.
The output types are anything legal for the query() method (i.e. 'txt'
and 'vo' at the moment), plus 'script' for a script parser. All default
to '', meaning no parser is used.
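As an illustration of the parser contract (one argument in, parsed data out), here is a minimal sketch of a parser for output shaped like FORMAT_TXT_SIMPLE_BASIC. The subroutine name parse_simple() and the sample text are assumptions made for this example; it is not the parser shipped with the module.

```perl
use strict;
use warnings;

# Hypothetical parser: takes the text to parse, returns one hash
# reference per object. A line of '---' starts a new object.
sub parse_simple {
    my ( $text ) = @_;
    my @objects;
    foreach my $line ( split /\n/, $text ) {
        if ( $line =~ m/ \A --- \s* \z /smx ) {
            push @objects, {};
            next;
        }
        my ( $key, $value ) = $line =~ m/ \A (\w+) : \s* (.*?) \s* \z /smx
            or next;    # skip lines that are not 'key: value'
        @objects or push @objects, {};
        $objects[-1]{$key} = $value;
    }
    return @objects;
}

# Illustrative input only; real query output carries more fields.
my $sample = "---\nname: NAME Arcturus\ntype: Star\n---\nname: NAME Sirius\n";
my @parsed = parse_simple( $sample );
print scalar @parsed, " objects parsed\n";
```

A routine like this could then be registered for the 'txt' output type through the parser attribute, either as a code reference or by name.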
=item post
This Boolean attribute specifies that url_query() data should be
acquired using a POST request. If false, a GET request is used.