Daizu

 view release on metacpan or  search on metacpan

lib/Daizu/File.pm  view on Meta::CPAN


=cut

sub file_at_path
{
    my ($self, $path) = @_;
    my $wc_id = $self->{wc_id};

    # The 'daizu' scheme is used for XInclude expansion, so might as well
    # use it here too.  It doesn't really matter, we just want to get the
    # URI module to resolve the relative path for us.
    my $base = 'daizu:///' . url_encode($self->{path});
    $base .= '/' if $self->{is_dir};

    $path =~ s!/\./!/!g;
    my $abs_path = URI->new($path)->abs($base)->path;
    $abs_path =~ s!^/!!;
    $abs_path = url_decode($abs_path);

    my $wc = Daizu::Wc->new($self->{cms}, $self->{wc_id});
    return $wc->file_at_path($abs_path);
}

=item $file-E<gt>issued_at

Return a L<DateTime> object for the publication date and time of the file.
All files have an 'issued' date, either specified explicitly in a
C<dcterms:issued> property, or determined from the time at which the file
was first committed into the Subversion repository (which is assumed to
be about the time it was first published).

=cut

sub issued_at
{
    my ($self) = @_;
    return parse_db_datetime($self->{issued_at});
}

=item $file-E<gt>modified_at

Return a L<DateTime> object for the last-updated date and time of the file.
This is always defined.  The value is either specified explicitly in a
C<dcterms:modified> property, or determined from the time of the last commit
which modified or renamed the file.

=cut

sub modified_at
{
    my ($self) = @_;
    return parse_db_datetime($self->{modified_at});
}

=item $file-E<gt>property($name)

Return the value of the Subversion property C<$name> on this file, or
undef if there is no such property.

The value is assumed to be text, so leading and trailing whitespace is
trimmed off, and the value is decoded as UTF-8.  If the value exists
but contains only whitespace then undef is returned.

=cut

sub property
{
    my ($self, $name) = @_;
    my $value = db_select($self->{db}, 'wc_property',
        { file_id => $self->{id}, name => $name },
        'value',
    );
    $value = trim_with_empty_null($value);
    return decode('UTF-8', $value, Encode::FB_CROAK);
}

=item $file-E<gt>most_specific_property($name)

Return the value of the Subversion property C<$name> on this file, or on its
closest ancestor if it has no such property.  Therefore properties on
subdirectories will override those of their parent directories.  Returns
undef if there is no property of this name on the file or any of its
ancestors.  Properties whose values are empty or contain only whitespace
are ignored.

The value is assumed to be text, so leading and trailing whitespace is
trimmed off, and the value is decoded as UTF-8.

=cut

sub most_specific_property
{
    my ($file, $name) = @_;

    while (defined $file) {
        my $value = $file->property($name);
        if (defined $value && $value =~ /\S/) {
            $value = trim($value);
            return decode('UTF-8', $value, Encode::FB_CROAK);
        }
        $file = $file->parent;
    }

    return undef;
}

=item $file-E<gt>least_specific_property($name)

Return the value of the Subversion property C<$name> on this file, or on its
most distant ancestor if it has no such property.  Therefore the return
value is the 'top level' value for this property.  For example, if you ask
for the C<dc:title> property then you might get the title of the website
of which C<$file> is a part.  Returns undef if there is no property of
this name on the file or any of its ancestors.  Properties whose values are
empty or contain only whitespace are ignored.

The value is assumed to be text, so leading and trailing whitespace is
trimmed off, and the value is decoded as UTF-8.

=cut

sub least_specific_property
{
    my ($file, $name) = @_;

    my $best;
    while (defined $file) {
        my $value = $file->property($name);
        $best = trim($value)
            if defined $value && $value =~ /\S/;
        $file = $file->parent;
    }

    return decode('UTF-8', $best, Encode::FB_CROAK);
}

=item $file-E<gt>homepage_file

Return the file which most probably represents the 'homepage' of the
website on which C<$file> will be published.  This will be the file
closest to the top level of the filesystem hierarchy which has a
C<daizu:url> property set.

It is possible for this to return C<$file> itself if there is nothing
above it with a URL.  Returns undef if not even C<$file> has a URL
set, in which case it can't have a homepage because it won't be published
itself.

=cut

sub homepage_file
{
    my ($file) = @_;

    my $best;
    while (defined $file) {
        $best = $file if defined $file->{custom_url};
        $file = $file->parent;
    }

    return $best;
}

=item $file-E<gt>title

Return the title of C<$file>, as a decoded Perl text string, or undef
if the file doesn't have a title.  The title is taken from the file's
C<dc:title> property.

=cut

sub title { shift->{title} }

=item $file-E<gt>short_title

Return the 'short-title' of C<$file>, as a decoded Perl text string, or undef
if the file doesn't have a title.  The title is taken from the file's
C<dc:title> property.

=cut

sub short_title { shift->{short_title} }

=item $file-E<gt>description

Return the description/summary of C<$file>, as a decoded Perl text string,
or undef if the file doesn't have a description.  The value is taken from
the file's C<dc:description> property.

=cut

sub description { shift->{description} }

=item $file-E<gt>generator

Create and return a generator object for the file C<$file>.
Figures out which generator class to use,
by looking at the C<daizu:generator> property for the file, and if
necessary its ancestors.  The class is loaded automatically.
It also knows to use L<Daizu::Gen> if no generator specification is found.

Returns the new object, which should support the API of class L<Daizu::Gen>.

=cut

sub generator
{
    my ($self) = @_;
    return $self->{generator_obj} if exists $self->{generator_obj};
    my $cms = $self->{cms};

    my $root_file = $self;
    $root_file = Daizu::File->new($cms, $self->{root_file_id})
        if defined $self->{root_file_id};
    my $generator = instantiate_generator($cms, $self->{generator},
                                          $root_file);

    $self->{generator_obj} = $generator;
    return $generator;
}

=item $file-E<gt>update_loaded_article_in_db

Updates the cached article content for C<$file> in the database.
This includes the finished XML version of the content,
the article pages URL, the extra URLs, and the extra templates.
Does nothing if the file isn't an article.
Fails if it is but there are no plugins able to load it.

This normally happens automatically when a file's content is updated,
and can also be triggered manually from the C<daizu> program with
the 

This is where article loader plugins set with the L<Daizu> method
L<add_article_loader()|Daizu/$cms-E<gt>add_article_loader($mime_type, $path, $object, $method)>
are invoked.

Doesn't return anything.

=cut

sub update_loaded_article_in_db
{
    my ($self) = @_;
    return unless $self->{article};
    return transactionally($self->{db}, \&_update_loaded_article_in_db_txn,

lib/Daizu/File.pm  view on Meta::CPAN

    }

    return @{$self->{article_extra_urls}};
}

=item $file-E<gt>article_extra_templates

Returns a list of the extra templates which should be included
in the article's 'extras' column.  Returns nothing for files which
aren't articles.

=cut

sub article_extra_templates
{
    my ($self) = @_;
    return unless $self->{article};

    if (!exists $self->{article_extra_templates}) {
        my $sth = $self->{db}->prepare(q{
            select filename
            from wc_article_extra_template
            where file_id = ?
        });
        $sth->execute($self->{id});

        my @extra;
        while (my ($filename) = $sth->fetchrow_array) {
            push @extra, $filename
        }

        $self->{article_extra_templates} = \@extra;
    }

    return @{$self->{article_extra_templates}};
}

=item $file-E<gt>tags

Return a reference to an array of tags which have been applied to this article.
These come ultimately from the C<daizu:tags> property, although it is
loaded into the database tables C<tag> and C<wc_file_tag> when the
working copy is updated.  The tags are returned sorted by canonical tag
name.

Each item of the returned array is a hashref containing the following values:

=over

=item tag

The canonical tag name, as used as the primary key in the C<tag> table.

=item original_spelling

The spelling used for to name the tag in the C<daizu:tags> property of
this file.

=back

Both of these values are provided as text strings, decoded from UTF-8.

=cut

sub tags
{
    my ($self) = @_;

    my $sth = $self->{db}->prepare(q{
        select t.tag, ft.original_spelling
        from tag t
        inner join wc_file_tag ft on ft.tag = t.tag
        where ft.file_id = ?
        order by t.tag
    });
    $sth->execute($self->{id});

    my @tags;
    while (my $row = $sth->fetchrow_hashref) {
        $row->{$_} = decode('UTF-8', $row->{$_}, Encode::FB_CROAK)
            for qw( title short_title description );
        push @tags, { %$row };
    }

    return \@tags;
}

=item $file-E<gt>article_snippet

Return an L<XML::LibXML::Document> object representing the part of
an article which comes before the fold, or before the first page break
(whichever comes first).  If there are no fold markers or page breaks
in the article, returns the complete article content.

=cut

sub article_snippet
{
    my ($self) = @_;
    return $self->{snippet_doc} if exists $self->{snippet_doc};

    my $whole_doc = $self->article_doc;
    $self->_find_content_marks;
    my $fold = $self->{fold};
    return $whole_doc
        unless defined $fold;

    my $snippet_doc = XML::LibXML::Document->new('1.0', 'UTF-8');
    my $body = $snippet_doc->createElementNS('http://www.w3.org/1999/xhtml',
                                             'body');
    $snippet_doc->setDocumentElement($body);

    my $elem = $whole_doc->documentElement->firstChild;
    while (defined $elem && !$elem->isSameNode($fold)) {
        $body->appendChild($elem->cloneNode(1));
        $elem = $elem->nextSibling;
    }

    return $self->{snippet_doc} = $snippet_doc;
}

=item $file-E<gt>article_snippet_html4

Returns a chunk of S<HTML 4> markup for the article's content, just as the
L<article_content_html4() method|/$file-E<gt>article_content_html4([$page_num])>
does, except that this only returns the content up to the fold or first
page break, if the article has any of those.

This also sets an internal flag called C<snippet_is_not_whole_article> to
true if the content returned represents a truncated version of the article's
content (that is, there was a fold mark or page break found).

=cut

sub article_snippet_html4
{
    my ($self) = @_;
    my $snippet_doc = $self->article_snippet;
    $self->{snippet_is_not_whole_article} = 1
        unless $snippet_doc->isSameNode($self->article_doc);

    # This is going to be shown on the homepage or something, so links won't
    # be relative to the output page's URL.
    absolutify_links($snippet_doc, $self->permalink);

    # TODO - this could be more efficient if we passed in the fold position.
    return dom_body_to_html4($snippet_doc);
}

=item $file-E<gt>authors

Returns information about the author or authors credited with creating
the file.  The return value is a reference to an array of zero or more
references to hashes.
Each one contains the following keys:

=over

=item id

The ID number of the entry in the database's C<person> table.

=item username

The username, as specified in the C<daizu:author> property, decoded
into a Perl text string.  Always defined.

=item name

Full name of the author, as a Perl text string.  Always defined.

=item email

Email address as a binary string, or undef.

=item uri

A URL associated with the author, probably their own website, or undef.

=back

The authors are returned in the same order that they are specified in
the C<daizu:author> property.

Note that because of the way the standard property loader works, directories
are not considered to have authors.  If a directory has a C<daizu:author>
property, that will just affect all the files within it.

=cut

sub authors
{
    my ($self) = @_;
    my $db = $self->{db};

    # Build a PostgreSQL regular expression which will be used to select
    # all the 'person_info' records with a path which applies to the file,
    # in order to select the most specific one (with the longest path).
    my @path = map { pgregex_escape($_) }
               split '/', $self->{path};
    my $path_regex = '^(' . join('(/', @path) . '$' . ('|$)' x @path);

    my $sth = $db->prepare(q{
        select person_id
        from file_author
        where file_id = ?
        order by pos
    });
    $sth->execute($self->{id});

    my @author;
    while (my ($id) = $sth->fetchrow_array) {
        my $info = $db->selectrow_hashref(q{
            select p.id, p.username, i.name, i.email, i.uri
            from person p
            inner join person_info i on i.person_id = p.id
            where p.id = ?
              and i.path ~ ?
            order by length(i.path) desc
        }, undef, $id, $path_regex);
        croak "no 'person_info' record for user $id at path '$self->{path}'"
            unless defined $info;
        for (qw( username name )) {
            $info->{$_} = decode('UTF-8', $info->{$_}, Encode::FB_CROAK);
        }



( run in 0.695 second using v1.01-cache-2.11-cpan-39bf76dae61 )