Catmandu-HOCR

t/html/ossa.html

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="" xml:lang="" xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Image: </title>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
    <meta content="ABBYY FineReader Engine 11" name="ocr-system"/>
    <meta content="ocr_page ocr_carea ocr_par ocr_line ocrx_word" name="ocr-capabilities"/>
  </head>
  <body>
    <div class="ocr_page" id="Page1" title="image 'image.jpg'; bbox 0 0 2904 3316; ppageno 0">
      <p class="ocr_par" dir="ltr" id="Page1_Block1" lang="fr" title="bbox 634 502 2242 707">

t/html/strong.html

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>
</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
  <meta name='ocr-system' content='tesseract 3.04.00' />
  <meta name='ocr-capabilities' content='ocr_page ocr_carea ocr_par ocr_line ocrx_word'/>
</head>
<body>


