Alvis-Convert

t/test-data/to-split/29.xml

        </urls>
      </acquisitionData>
      <canonicalDocument>        
        <section>This morning I described what is Google Co-op, but I also promised I would try to implement an example for this site. Well, we have implemented phase one of Google Co-op subscription links for this site. You can subscribe to the coop...
      <metaData>
        <meta name="title">Dynamic Implementation of Google Co-op for Search Engine Roundtable</meta>
        <meta name="dc:date">Thu, 11 May 2006 19:35:25 GMT</meta>
        <meta name="dc:type">text/html</meta>
      </metaData>
      <links>
        <outlinks>
          <link type="a">
            <anchorText>subscribe</anchorText>
            <location>http://www.google.com/coop/trust/add?user=015090516856763095929&amp;continue=http://www.google.com/coop/profile?user=015090516856763095929&amp;sig=Y_aOf96WG5HGmgVEImc3p144xnXGY=</location>
          </link>
          <link type="a">
            <location>http://www.google.com/coop/trust/add?user=015090516856763095929&amp;continue=http://www.google.com/coop/profile?user=015090516856763095929&amp;sig=Y_aOf96WG5HGmgVEImc3p144xnXGY=</location>
          </link>
          <link type="a">
            <anchorText>Google AdSense</anchorText>
            <location>http://www.google.com/search?q=Google+AdSense</location>
          </link>
          <link type="a">
            <anchorText>SER Categories</anchorText>
            <location>http://www.seroundtable.com/archives.html#category</location>
          </link>
          <link type="a">
            <anchorText>what is Google Co-op</anchorText>
            <location>http://www.seroundtable.com/archives/003796.html</location>
          </link>
          <link type="a">
            <anchorText>by clicking here</anchorText>
            <location>http://www.google.com/coop/profile?user=015090516856763095929</location>
          </link>
          <link type="a">
            <anchorText>Link Building</anchorText>
            <location>http://www.google.com/search?q=Link+Building</location>
          </link>
        </outlinks>
      </links>
    </acquisition>
  <linguisticAnalysis>
    <semantic_unit_level>
      <semantic_unit><named_entity><form>Google</form><named_entity_type>comp</named_entity_type></named_entity></semantic_unit>
      <semantic_unit><named_entity><form>Google</form><named_entity_type>soft</named_entity_type></named_entity></semantic_unit>
      <semantic_unit><named_entity><form>Google AdSense</form><named_entity_type>soft</named_entity_type></named_entity></semantic_unit>
    </semantic_unit_level>
  </linguisticAnalysis>

  </documentRecord>
<documentRecord id="57E3FF55199853DF2777EF6B8DC24516" xmlns="http://alvis.info/enriched/">
    <acquisition>
      <acquisitionData>
        <modifiedDate>1149969689989</modifiedDate>
        <httpServer>Apache</httpServer>
        <urls>
          <url>http://searchenginewatch.com/searchday/article.php/3612406</url>
        </urls>
      </acquisitionData>
      <canonicalDocument>        
        <section>Links to the week's topics from search engine forums across the web. What Top 5 Skills Would You Study to Become a Better SEO? Search Engine Watch Forums "What skills would you put on your Matrix 'must have' list for your career path...
      <metaData>
        <meta name="title">Search Engine Forums Spotlight</meta>
        <meta name="dc:type">text/html</meta>
      </metaData>
      <links>
        <outlinks>
          <link type="a">
            <anchorText>June 2006: Start of the Traditional Summer Slump</anchorText>
            <location>http://www.webmasterworld.com/forum89/14428.htm</location>
          </link>
          <link type="a">
            <anchorText>Google Goes to Congress to Block IAPs Charging for Faster Data</anchorText>
            <location>http://www.webmasterworld.com/forum86/4531.htm</location>
          </link>
          <link type="a">
            <anchorText>Search Engine Guide</anchorText>
            <location>http://www.searchengineguide.com/</location>
          </link>
          <link type="a">
            <anchorText>Does Citing Sources Help Rankings?</anchorText>
            <location>http://www.v7n.com/forums/google-forum/31501-does-citing-sources-help-rankings.html</location>
          </link>
          <link type="a">
            <anchorText>Cache Problems Growing for Directories?</anchorText>
            <location>http://forums.searchenginewatch.com/showthread.php?threadid=11916</location>
          </link>
          <link type="a">
            <anchorText>What Top 5 Skills Would You Study to Become a Better SEO?</anchorText>
            <location>http://forums.searchenginewatch.com/showthread.php?t=11945</location>
          </link>
          <link type="a">
            <anchorText>Google Office Continued: Spreadsheet Application Launched</anchorText>
            <location>http://www.cre8asiteforums.com/forums/index.php?showtopic=37455</location>
          </link>
          <link type="a">
            <anchorText>Separate Page for PPC?</anchorText>
            <location>http://www.webproworld.com/viewtopic.php?t=64119</location>
          </link>
          <link type="a">
            <anchorText>Brin Says Google Compromised Principles</anchorText>
            <location>http://www.webmasterworld.com/forum86/4529.htm</location>
          </link>
          <link type="a">
            <anchorText>Is Reciprocal Linking Dead</anchorText>
            <location>http://www.highrankings.com/forum/index.php?showtopic=22885</location>
          </link>
          <link type="a">
            <anchorText>Google Browser Sync For FireFox</anchorText>
            <location>http://www.webmasterworld.com/forum30/34677.htm</location>
          </link>
        </outlinks>
      </links>
    </acquisition>
  <linguisticAnalysis>
    <semantic_unit_level>
      <semantic_unit><named_entity><form>John McCain</form><named_entity_type>person</named_entity_type></named_entity></semantic_unit>
      <semantic_unit><named_entity><form>Sergey Brin</form><named_entity_type>person</named_entity_type></named_entity></semantic_unit>
      <semantic_unit><named_entity><form>Brin</form><named_entity_type>person</named_entity_type></named_entity></semantic_unit>
      <semantic_unit><named_entity><form>Google Inc</form><named_entity_type>comp</named_entity_type></named_entity></semantic_unit>
      <semantic_unit><named_entity><form>Reciprocal</form><named_entity_type>comp</named_entity_type></named_entity></semantic_unit>


