Alvis-Convert
t/test-data/to-split/29.xml
<?xml version="1.0" encoding="UTF-8"?>
<documentCollection xmlns="http://alvis.info/enriched/" version="1.1">
<documentRecord id="A4AFC8E9BD3073A4EFADEB400B80D54A" xmlns="http://alvis.info/enriched/">
<acquisition>
<acquisitionData>
<modifiedDate>1146649940912</modifiedDate>
<httpServer>Apache/1.3.34 (Unix) mod_fastcgi/2.4.2 mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.4.2 FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7i</httpServer>
<urls>
<url>http://www.searchenginejournal.com/?p=3363</url>
</urls>
</acquisitionData>
<canonicalDocument>
<section>Yahoo’s YPN Says No to MySpace Traffic If you use MySpace profiles, blogs, comments, and mailings to spam or influence the teenie boppers over at MySpace to clickover to your website and that MySpace traffic is a major source of yo...
<metaData>
<meta name="title">Yahoo’s YPN Says No to MySpace Traffic</meta>
<meta name="dc:type">text/html; charset=utf-8</meta>
</metaData>
<links>
<outlinks>
<link type="a">
<anchorText>Jen Slegg</anchorText>
<location>http://www.jensense.com/archives/2006/05/myspacecom_and.html</location>
</link>
<link type="a">
<anchorText>Problogger.net</anchorText>
<location>http://www.problogger.net/archives/2006/05/03/yahoo-publisher-network-terminates-more-publisher-accounts/</location>
</link>
</outlinks>
</links>
</acquisition>
<linguisticAnalysis>
<semantic_unit_level>
<semantic_unit><named_entity><form>Yahoo</form><named_entity_type>comp</named_entity_type></named_entity></semantic_unit>
<semantic_unit><named_entity><form>Google</form><named_entity_type>comp</named_entity_type></named_entity></semantic_unit>
<semantic_unit><named_entity><form>Yahoo Search Marketing</form><named_entity_type>soft</named_entity_type></named_entity></semantic_unit>
<semantic_unit><named_entity><form>Yahoo Publisher Network</form><named_entity_type>soft</named_entity_type></named_entity></semantic_unit>
<semantic_unit><named_entity><form>Google AdSense</form><named_entity_type>soft</named_entity_type></named_entity></semantic_unit>
</semantic_unit_level>
</linguisticAnalysis>
</documentRecord>
<documentRecord id="A62EEF2D8BE45A8D097087B515598C68" xmlns="http://alvis.info/enriched/">
<acquisition>
<acquisitionData>
<modifiedDate>1148355445154</modifiedDate>
<httpServer>Apache/1.3.34 (Unix) DAV/1.0.3 mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.4.1 FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7a</httpServer>
<urls>
<url>http://battellemedia.com/archives/002584.php</url>
</urls>
</acquisitionData>
<canonicalDocument>
<section>Two items of very related interest today: 1. Wired News Releases Full Text of AT&amp;T NSA Document (Slashdot). 2. Gonzales Says Publishing Leaks Is A Crime (Also Slashdot) Thank God for outlets like Wired. And best of luck.</section...
<metaData>
<meta name="title">Wired News: Will the US Sue?</meta>
<meta name="dc:type">text/html</meta>
</metaData>
<links>
<outlinks>
<link type="a">
<anchorText>Gonzales Says Publishing Leaks Is A Crime</anchorText>
<location>http://yro.slashdot.org/article.pl?sid=06/05/22/1039257&amp;from=rss</location>
</link>
<link type="a">
<anchorText>Wired News Releases Full Text of AT&amp;T NSA Document</anchorText>
<location>http://yro.slashdot.org/article.pl?sid=06/05/22/132206</location>
</link>
</outlinks>
</links>
</acquisition>
<linguisticAnalysis>
<semantic_unit_level>
</semantic_unit_level>
</linguisticAnalysis>
</documentRecord>
<documentRecord id="FF2C88E89A1DDFE4F8CD4845EEC285E3" xmlns="http://alvis.info/enriched/">
<acquisition>
<acquisitionData>
<modifiedDate>1142938329956</modifiedDate>
<httpServer>Apache</httpServer>
<urls>
<url>http://searchenginewatch.com/searchday/article.php/3592876</url>
</urls>
</acquisitionData>
<canonicalDocument>
<section>At long last, Google has launched its own Google Finance service. For years, those seeking specialty financial information via Google have been sent to competitors such as Yahoo and MSN. Now Google's providing financial information di...
<metaData>
<meta name="title">Google Launches Google Finance</meta>
<meta name="dc:type">text/html</meta>
</metaData>
<links>
<outlinks>
<link type="a">
<anchorText>
wrote</anchorText>
<location>http://searchenginewatch.com/_subscribers/articles/article.php/3353401</location>
</link>
<link type="a">
<anchorText>Google Groups</anchorText>
<location>http://groups.google.com/</location>
</link>
<link type="a">
<anchorText>
Forrester</anchorText>
<location>http://searchenginewatch.com/_subscribers/updates/article.php/3326461#forrester</location>
</link>
<link type="a">
<anchorText>
enhancements</anchorText>
<location>http://searchenginewatch.com/searchday/article.php/2160891</location>
</link>
<link type="a">