Apache-Tika

lib/Apache/Tika.pm

	# Perform request
	my $response = $self->{ua}->$method(
		$self->{url} . '/' . $path,
		%$headers,
		Content => $bodyBytes
	);

	# Check for errors
	# TODO

	return decode_utf8($response->decoded_content(charset => 'none'));
}

sub meta {
	my ($self, $bytes, $contentType) = @_;
	my $meta = $self->_request(
		'put',
		'meta',
		{
			'Accept' => 'application/json',
			$contentType? ('Content-type' => $contentType) : ()

lib/Apache/Tika.pod

 open my $fh, '<:raw', '/local/file.pdf' or die "Cannot open file: $!";
 my $pdf = do { local $/; <$fh> };
 close $fh;

 my $meta = $tika->meta($pdf);
 my $text = $tika->tika($pdf);

 # Extract text from a website
 my $response = LWP::UserAgent->new->get('http://some.web.site');
 $text = $tika->tika(
  $response->decoded_content('charset' => 'none'),
  $response->headers->header('content-type')
 );

=head1 DESCRIPTION

This module provides support for the Apache Tika API by sending requests to a Tika server.

=head1 CONSTRUCTOR

=over 4

lib/Apache/Tika.pod


=item $tika->tika($bytes, $contentType)

=item $tika->detect_stream($bytes)

=item $tika->language_stream($bytes)

=back

The $bytes parameter is always required and must contain the data to send to the server.
The $contentType parameter is optional, but if you know the content type of $bytes (e.g. "text/html; charset=iso-8"), you can pass it to improve the Tika response.
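
The optional $contentType behaviour described above can be tried without a running server. The following is only a minimal sketch: build_headers is a hypothetical helper, not part of Apache::Tika, and it simply mirrors the conditional header pattern visible in the meta excerpt, where the Content-type header is added only when a value is supplied.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Apache::Tika): builds the request
# headers, adding Content-type only when the caller supplies one.
sub build_headers {
	my ($accept, $contentType) = @_;
	my %headers = (
		'Accept' => $accept,
		# The ternary yields an empty list when $contentType is false,
		# so no Content-type key is created at all.
		$contentType ? ('Content-type' => $contentType) : (),
	);
	return \%headers;
}

my $with    = build_headers('application/json', 'text/html; charset=utf-8');
my $without = build_headers('application/json');

print join(', ', sort keys %$with), "\n";     # prints "Accept, Content-type"
print join(', ', sort keys %$without), "\n";  # prints "Accept"
```

When the content type is omitted, no Content-type header is sent at all, which presumably leaves the server to work out the type from the bytes themselves.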

=head1 SEE ALSO

L<Apache Tika|http://wiki.apache.org/tika/TikaJAXRS>


=cut


