Embedix-ECD

ECD.pm  view on Meta::CPAN


    $opt{indent} = 0 unless(defined($opt{indent}));
    $opt{sw}     = $opt{shiftwidth} || $default{shiftwidth};
    $opt{space}  = " " x $opt{indent} . indent($self->getDepth, $opt{sw});
    $opt{space2} = $opt{space} . " " x $opt{sw};
    $opt{order}  = \@attribute_order;

    return \%opt;
}

# render the attributes of a node
# It's rare for me to nest this much.
#_______________________________________
sub attributeToString {
    my $self = shift;
    my $opt  = shift;
    my ($sw, $space2) = map { $opt->{$_} } qw(sw space2);
    return join '', map {
        my $name  = $_;
        # $value rather than $a, which perl reserves for sort()
        my $value = $self->getAttribute($name);
        if (!defined($value)) {
            # an unset attribute renders as nothing
            "";
        } elsif (!ref($value)) {
            # a scalar attribute
            $space2 . uc($name) . "=" . "$value\n";
        } elsif (scalar(@$value)) {
            # an aggregate attribute
            $space2 . "<" . uc($name) . ">\n" .
            join('', map { $space2 . " " x $sw . "$_\n" } @$value) .
            $space2 . "</" . uc($name) . ">\n";
        } else {
            # an empty aggregate attribute
            "";
        }
    } @{$opt->{order}};
}

# render $self in ECD format
# Embedix::ECD itself doesn't have a textual representation 
# but its subclasses should.
#_______________________________________
sub toString {
    my $self = shift;
    return join('', map { $_->toString(@_) } $self->getChildren());
}

1;

__END__

=head1 NAME

Embedix::ECD - Embedix Component Descriptions as objects

=head1 SYNOPSIS

instantiate from a file

    my $ecd       = Embedix::ECD->newFromFile('busybox.ecd');
    my $other_ecd = Embedix::ECD->newFromFile('tinylogin.ecd');

access nodes

    my $busybox = $ecd->System->Utilities->busybox;

build from scratch

    my $server = Embedix::ECD::Group->new(name => 'Server');
    my $www    = Embedix::ECD::Group->new(name => 'WWW');
    my $apache = Embedix::ECD::Component->new (
        name   => 'apache',
        srpm   => 'apache',
        prompt => 'Include apache web server?',
        help   => 'The most popular http server on the internet',
    );
    $ecd->addChild($server);
    $ecd->Server->addChild($www);
    $ecd->Server->WWW->addChild($apache);

get/set attributes

    my $srpm = $busybox->srpm();

    $busybox->help('i am busybox of borg -- unix will be assimilated.');

    $busybox->requires([
        'libc.so.6',
        'ld-linux.so.2',
        'skellinux',
    ]);

combine Embedix::ECD objects together

    $ecd->mergeWith($other_ecd);

print as text

    print $ecd->toString;

print as XML

    use Embedix::ECD::XMLv1 qw(xml_from_cons);

    print $ecd->toXML(shiftwidth => 4, dtd => 'yes');

    my $cons = Embedix::ECD->consFromFile('minicom.ecd');
    print xml_from_cons($cons);

=head1 REQUIRES

=over 4

=item Parse::RecDescent

for the ECD parser

=item Data::Dumper

for debugging

=item Tie::IxHash

for preserving the insertion order of children while retaining
C<O(1)> named access (at the expense of memory).

=item Pod::Usage

C<bin/ecd2xml> uses this to generate its help message.

=back
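
The trade-off noted for Tie::IxHash can be pictured without the module.
The following is only a sketch in plain perl and not how Embedix::ECD
actually stores its children: an array keeps insertion order while a
hash gives named access.  C<OrderedChildren> and its methods are
hypothetical names made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an insertion-ordered hash.
package OrderedChildren;

sub new { return bless { order => [], index => {} }, shift }

# Remember each name once, in arrival order, and index the value by name.
sub add {
    my ($self, $name, $value) = @_;
    push @{ $self->{order} }, $name unless exists $self->{index}{$name};
    $self->{index}{$name} = $value;
    return $self;
}

# Constant-time lookup by name.
sub get { my ($self, $name) = @_; return $self->{index}{$name} }

# Names in the order they were added.
sub names { my $self = shift; return @{ $self->{order} } }

package main;

my $children = OrderedChildren->new;
$children->add(System => 'group')
         ->add(Server => 'group')
         ->add(busybox => 'component');

print join(',', $children->names), "\n";
print $children->get('busybox'), "\n";
```

Keeping both the array and the hash is where the extra memory goes.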

=head1 DESCRIPTION

Embedix::ECD allows one to represent ECD files as a tree of perl
objects.  One can construct objects by parsing an ECD file, or one can
build an ECD object from scratch by combining instances of Embedix::ECD
and its subclasses.  These objects can then be turned back into ECD
files via the C<toString()> method.

ECD stands for Embedix Component Description, and its purpose is to
contain meta-data regarding packages (aka components) in the Embedix
distribution.  ECD files contain much of the same data a .spec file does
for an RPM.  A major difference however is that ECD files do not contain
building instructions whereas .spec files do.  Another major difference
between .spec files and ECD files is the structure.  ECD files are
hierarchically structured whereas .spec files are comparatively flat.

The ECD format reminds me of the syntax for Apache configuration files.
Items are tag-delimited (like in XML) and attributes are found between
these tags.  Comments are lines that match /^\s*#/, that is, lines whose
first non-whitespace character is "#".  Unlike Apache configurations,
attribute names and values are separated by an "=" sign, whereas in
Apache the first token is the attribute name and everything after that
(sans leading whitespace) up to the end of the line is the attribute's
value.  Also unlike Apache configurations, attributes may themselves be
enclosed in tags, whereas in Apache tags are used only to describe
nodes.

ECD files look like pseudo-XML with shell-styled comments.
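
The description above can be turned into a rough line classifier.  This
is a sketch only and is not the Parse::RecDescent grammar the module
uses; C<classify_line> is a hypothetical helper, and the sample lines
are assembled from fragments shown elsewhere in this document.

```perl
use strict;
use warnings;

# Hypothetical helper: label one line of ECD-like text.
sub classify_line {
    my ($line) = @_;
    return 'comment'   if $line =~ /^\s*#/;              # shell-style comment
    return 'blank'     if $line =~ /^\s*$/;
    return 'close_tag' if $line =~ m{^\s*</\w+>\s*$};     # e.g. </REQUIRES>
    return 'open_tag'  if $line =~ /^\s*<\w+[^>]*>\s*$/;  # e.g. <REQUIRES>
    return 'attribute' if $line =~ /^\s*\w+\s*=(?!=)/;    # NAME=value
    return 'value';                                       # a line inside an aggregate
}

my @sample = (
    '# what this option provides',
    'PROVIDES=sed',
    '<REQUIRES>',
    '    keep-bb-init',
    '</REQUIRES>',
);

printf "%-10s %s\n", classify_line($_), $_ for @sample;
```

A real parser also has to track nesting, which a line-by-line pass like
this one does not attempt.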

=head1 METHODS

=head2 Constructors

There are two types of constructors provided by this class.  The first
kind of constructor begins with "new" and returns an Embedix::ECD
object.  There is another kind of constructor that begins with "cons"
and returns the syntax tree as nested arrayrefs.

I realized that creating an object of the syntax tree takes a long time
(especially for long ECD files).  I also realized that sometimes, the
simple nested arrayref is useful enough on its own.  It also has the
nice property of retaining comments whereas the object constructor
disposes of comments.  I thought if ECD files were ever to be translated
into XML, it'd be nice to be able to keep the comments.  These factors
convinced me that it would be useful to have these 2 kinds of
constructors.

=over 4

=item new(key => $value, ...)

This returns an Embedix::ECD object.  It can be initialized with named
parameters which represent the attributes the object should have.  The
set of valid attributes is described under L</Attributes>.

    $system     = Embedix::ECD::Group->new(name => 'System');
    $utilities  = Embedix::ECD::Group->new(name => 'Utilities');
    $busybox    = Embedix::ECD::Component->new(

        name    => 'busybox',
        type    => 'bool',
        value   => 0,
        srpm    => 'busybox',

        static_size     => 3006,
        min_dynamic_size=> 0,
        storage_size    => 4408,
        startup_time    => 0,

        keeplist        => [ '/bin/busybox' ],
        requiresexpr    => [
            '(libc.so.6 == "y") &&',
            '(ld-linux.so.2 == "y") &&',
            '(skellinux == "y") &&',
            '(  (Misc-utilities == "y")',
            '|| (File-compression-utilities == "y")',
            '|| (Network-utilities == "y")',
            '|| (Process-utilities == "y")',
            '|| (Directory-utilities == "y")',
            '|| (User-info-utilities == "y")',
            '|| (Disk-info-utilities == "y")',
            '|| (Screen-utilities == "y")',
            '|| (System-utilities == "y")',
            '|| (File-manipulation-utilities == "y") )',
        ],
                            
    );

=back

The following 5 constructors rely on a Parse::RecDescent parser.  When
they encounter a syntax error they will C<die>, so be sure to wrap them
in an C<eval> block.

=over 4

=item newFromCons($cons)

This returns an Embedix::ECD object from a nested arrayref.

    $ecd = Embedix::ECD->newFromCons($cons)

=item newFromString($string)

This returns an Embedix::ECD object from a string in ECD format.

    $ecd = Embedix::ECD->newFromString($string)

=item newFromFile($filename)

This returns an Embedix::ECD object from an ECD file.

    $ecd = Embedix::ECD->newFromFile($filename)

=item consFromString($string)

This returns a nested arrayref from a string in ECD format.

    $cons = Embedix::ECD->consFromString($string)

=item consFromFile($filename)

This returns a nested arrayref from an ECD file.

    $cons = Embedix::ECD->consFromFile($filename)

=back
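
Since these constructors C<die> on a syntax error, the calling pattern
might look like the sketch below.  To keep it runnable without any ECD
files, C<parse_or_die> is a hypothetical stand-in for a parsing
constructor, and its error message is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor that dies on a syntax error.
sub parse_or_die {
    my ($text) = @_;
    die "1: was expecting a closing tag\n" unless $text =~ m{</\w+>\s*\z};
    return { source => $text };
}

# Returns (result, undef) on success and (undef, message) on failure.
sub try_parse {
    my ($text) = @_;
    my $result;
    # The trailing 1 makes eval return true only if nothing died.
    my $ok = eval { $result = parse_or_die($text); 1 };
    return $ok ? ($result, undef) : (undef, $@ || 'unknown error');
}

my ($good, $err1) = try_parse("<REQUIRES>\n    inittab\n</REQUIRES>\n");
my ($bad,  $err2) = try_parse("<REQUIRES>\n    inittab\n");

print defined $good ? "parsed\n" : "failed: $err1";
print defined $bad  ? "parsed\n" : "failed: $err2";
```

With the real module, the body of the C<eval> would instead call one of
the constructors listed above, for example
C<< Embedix::ECD->newFromFile($filename) >>.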

(This next constructor is an anomaly.)

        LET ( $VALUE = "n" ) )
    </IF>

The distinction is in the opening tag.  The autovar's opening tag
contains a second string, which is the node's name.  The if's opening
tag contains nothing else, which means that it is an attribute of the
node that contains it.

There are 5 (not 4) types of nodes.

=over 4

=item the root node | Embedix::ECD

This node is implicit but very real.  When invoking any of the
constructors that begin with "newFrom", one will get back an
Embedix::ECD object within which the rest of the ECD data will be
contained.

=item Group | Embedix::ECD::Group

Groups establish a hierarchy of components under meaningful subheadings
such as "Server/WWW" or "System/Utilities".  Their main use is as
containers of other nodes.

=item Component | Embedix::ECD::Component

A component node represents a package in the Embedix distribution.

=item Option | Embedix::ECD::Option

An option node is almost always contained under a component node.  The
purpose of an option is to provide a point of configurability for a
package.

=item Autovar | Embedix::ECD::Autovar

What exactly is this?

=back

=head2 Accessing Child Nodes

The following are accessor methods for child nodes.

=over 4

=item getChild($name)

This returns a child node with the given $name or undef if no
such child exists.

    $child_ecd = $ecd->getChild($name)

=item n($name)

C<n()> is an alias for C<getChild()>.  "n" stands for "node" and is a
lot easier to type than "getChild".

    $ecd->n('System')
        ->n('Utilities')
        ->n('busybox')
        ->n('long-ass-option-name-with-redundant-information');

=item addChild($obj)

This adds a child to the current node.

    $ecd->addChild($obj)

=item delChild($obj) or delChild($name)

This deletes a child from the current node.
The child may either be specified by an object or by its name.

    $ecd->delChild($obj);     # by object
    $ecd->delChild($name);    # or by name

=item getChildren

This returns a list of all child nodes.

    @child_ecd = $ecd->getChildren()

=item hasChildren

This returns true if the current node has child nodes.

    $ecd->hasChildren()

=back

=head2 Accessing Child Nodes via AUTOLOAD

The name of a node can be used as a method.  This is what makes it
possible to say something like:

    my $busybox = $ecd->System->Utilities->busybox;

and get back the Embedix::ECD::Component object that contains the
information for the busybox package.  "System", "Utilities", and
"busybox" are not predefined methods in Embedix::ECD or any of its
subclasses, so they are delegated to the AUTOLOAD method.  The AUTOLOAD
method will try to find a child with the same name as the undefined
method and it will return it if found.

I have not yet decided whether the AUTOLOAD should die when a child is
not found.  Currently undef is returned in this situation.

One annoyance is that many nodes have names with "-" in them.  These
cannot be AUTOLOADed, because method names may not have a "-" in perl.
When accessing such nodes, use the C<getChild()> method.
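
The mechanism can be sketched with a toy class.  This is an
illustration under assumptions, not the module's own AUTOLOAD: the
C<Node> package and its hash layout are hypothetical, and only the
behaviour described above (return the child, or undef when there is
none) is reproduced.

```perl
use strict;
use warnings;

package Node;   # hypothetical stand-in, not Embedix::ECD itself

our $AUTOLOAD;

sub new {
    my ($class, %attr) = @_;
    return bless { name => $attr{name}, child => {} }, $class;
}

sub addChild {
    my ($self, $node) = @_;
    $self->{child}{ $node->{name} } = $node;
    return $self;
}

sub getChild {
    my ($self, $name) = @_;
    return $self->{child}{$name};
}

# Any method perl cannot find ends up here; treat its name as a child name.
sub AUTOLOAD {
    my $self = shift;
    (my $name = $AUTOLOAD) =~ s/.*:://;   # strip the package qualifier
    return if $name eq 'DESTROY';         # destruction is not a lookup
    return $self->getChild($name);        # undef when there is no such child
}

package main;

my $root   = Node->new(name => 'root');
my $system = Node->new(name => 'System');
my $option = Node->new(name => 'keep-bb-sed');
$root->addChild($system);
$system->addChild($option);

# found through AUTOLOAD
print $root->System->{name}, "\n";

# a name containing "-" has to go through getChild()
print $root->System->getChild('keep-bb-sed')->{name}, "\n";

# a missing child yields undef rather than an exception
print defined $root->Missing ? "found\n" : "undef\n";
```

Because a missing child yields undef, a typo in the middle of a chain
shows up as "Can't call method ... on an undefined value" at the next
arrow, not at the misspelled name.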

=head2 Attributes

If nodes are objects, then attributes are a node's instance variables.
All attributes may be single-valued or aggregate-valued.  Single-valued
attributes are non-reference scalar values, and aggregate attributes are
non-reference scalar values enclosed within an arrayref.

A single valued attribute:

    my $bbsed = $busybox->n('Misc-utilities')->n('keep-bb-sed');
    $bbsed->provides('sed');

The same attribute as an aggregate:

    $bbsed->provides([ 'sed' ]);

Semantically, these are equivalent.  The main difference one will notice
is cosmetic.  When the C<toString()> method is called, the single-valued
one will look like:

    PROVIDES=sed

and the aggregate-valued one will look like:

    <PROVIDES>
        sed
    </PROVIDES>

Again, these two expressions mean the same thing.  An aggregate of one
is interpreted just as if it were a single value.
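
The two renderings can be reproduced with a small stand-alone function.
It is a sketch that mirrors the behaviour described here, and
C<render_attribute> with its C<$indent> and C<$sw> parameters is a
hypothetical name and signature, not part of the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper that mirrors the two renderings described above.
sub render_attribute {
    my ($name, $value, $indent, $sw) = @_;
    my $pad = ' ' x $indent;

    # unset attribute: nothing at all
    return '' unless defined $value;

    # single value: NAME=value
    return $pad . uc($name) . "=$value\n" unless ref $value;

    # empty aggregate: nothing at all
    return '' unless @$value;

    # aggregate: one value per line between an opening and a closing tag
    return $pad . '<' . uc($name) . ">\n"
         . join('', map { $pad . ' ' x $sw . "$_\n" } @$value)
         . $pad . '</' . uc($name) . ">\n";
}

print render_attribute(provides => 'sed',     4, 4);
print render_attribute(provides => ['sed'],   4, 4);
print render_attribute(requires => ['inittab', '/bin/sh'], 4, 4);
```

The first call prints the C<PROVIDES=sed> form and the second prints the
tagged form, even though both carry the same single value.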

Aggregates become useful when an attribute needs to have a list of values.

    $busybox->n('compile-time-features')->n('enable-bb-feature-use-inittab')->requires ([
        'keep-bb-init',
        'inittab',
        '/bin/sh',
    ]);

This will be rendered by C<toString()> as

    <REQUIRES>
        keep-bb-init
        inittab
        /bin/sh
    </REQUIRES>

There are accessors for attributes that work like your typical perl
getters and setters.  That is, when called without a parameter, the
method behaves as a getter.  When called I<with> a parameter, the method
behaves as a setter and the value of the parameter is assigned to the
attribute.

getter:

    my $name = $busybox->name();

setter:

    $busybox->name('busybox');
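
One common way to build such combined getters and setters is to generate
them from a list of attribute names.  The sketch below assumes that
approach purely for illustration; the C<Thing> class is hypothetical
and nothing here is taken from the module's implementation.

```perl
use strict;
use warnings;

package Thing;   # hypothetical class for illustration only

sub new { return bless { attribute => {} }, shift }

sub getAttribute {
    my ($self, $name) = @_;
    return $self->{attribute}{$name};
}

sub setAttribute {
    my ($self, $name, $value) = @_;
    return $self->{attribute}{$name} = $value;
}

# Generate one combined getter/setter per attribute name.
for my $name (qw(name srpm help)) {
    no strict 'refs';
    *{"Thing::$name"} = sub {
        my $self = shift;
        return @_ ? $self->setAttribute($name, shift)   # called with a parameter
                  : $self->getAttribute($name);         # called without one
    };
}

package main;

my $thing = Thing->new;
$thing->name('busybox');
print $thing->name(), "\n";
print defined $thing->srpm() ? "set\n" : "not set\n";
```

An accessor written this way cannot set an attribute back to "no value"
by passing nothing, since a call with no parameter is always a read.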

=head2 Accessors For Single-Valued Attributes

These are accessors for attributes that are typically single-valued.

=over 4

=item name

This is the name of the node.

    $ecd->name()

=item type

This is the type of the node.  This is usually (always?) seen in the
context of an option and it can contain values such as "bool", "int",
"int.hex", "string", and "tridep".

    $ecd->type()

=item value

This is the value of a node which must be something appropriate for its
type.

    $ecd->value()

=item default_value

This is the value taken by the node if value is not defined.

    $ecd->default_value()

=item range

For the numerical types, it may be desirable to limit the range of
values that may be assigned such that C<value()> will always be
meaningful.  The use of this attribute has only been observed in
linux.ecd.

    $ecd->range()

=item help

This often contains prose regarding the current node.  I think it would
be nice if it were possible to use an alternative form of mark-up
language inside these sections.  (HTML, for instance).

    $ecd->help()

=item prompt

The value in prompt is used in TargetWizard to pose a question to the
user regarding whether he/she wants to enable an option or not.

    $ecd->prompt()

=item license

This is the license that the software falls under.  Usually, there is
only one license, but once in a while you may get a dual-licensed
package.  In that case, it's OK to give it an arrayref with multiple
licenses in it.

    $ecd->license()

=item srpm

This contains the name of the source RPM sans version information and
the file extension.  This attribute almost always has the same value as
C<name()>.

    $ecd->srpm()

=item specpatch

This attribute is only meaningful within the context of a component.
Specpatches are applied to .spec files just prior to the building of a
component.  They are often used to configure the compilation of a
component.  The busybox package provides a good example of this in
action.

    $ecd->specpatch()

=item static_size

This is the sum of .text, .data, and .bss for an option and/or component.

    $ecd->static_size()

=item eval_static_size

If static_size contains a mathematical expression, this method
evaluates it.

    ($size, $give_or_take) = $ecd->eval_static_size;

=item min_dynamic_size

The very least a program will C<malloc()> during its execution.

    $ecd->min_dynamic_size()

=item eval_min_dynamic_size

If min_dynamic_size contains a mathematical expression, this method
evaluates it.

    ($size, $give_or_take) = $ecd->eval_min_dynamic_size;

=item storage_size

This is the amount of space this component and/or option would consume on
a filesystem.

    $ecd->storage_size()

=item eval_storage_size

If storage_size contains a mathematical expression, this method
evaluates it.

    ($size, $give_or_take) = $ecd->eval_storage_size;

=item startup_time

The amount of time (in what unit?) from the moment a program is executed
up to the point when the program becomes useful.

    $ecd->startup_time()

=item eval_startup_time

If startup_time contains a mathematical expression, this method
evaluates it.

    ($size, $give_or_take) = $ecd->eval_startup_time;

=back

=head2 Accessing Aggregate Attributes

The following are attributes that frequently contain aggregate values.
When setting attributes with aggregate values, enclose the values within
an arrayref.

=over 4

=item requiresexpr

This contains a C-like expression describing node dependencies.

    $ecd->requiresexpr()

=item if

I didn't know if using a keyword as a method name would be legal, but
apparently it is.  I also wonder if more than one 'if' statement is
allowed per node.

    $ecd->if()

=item conflicts

This is used to explicitly specify a node that conflicts with the
current node.  My first thought is that this is just another way
to say C<provides>.

    $ecd->conflicts()

=item build_vars

This specifies a list of transformations that can be applied to a .spec
file prior to building.

    $ecd->build_vars()

=item provides

This is a list of symbolic names that a node is said to be able to
provide.  For example, grep in busybox provides grep.  GNU/grep also
provides grep.  According to TargetWizard, these two cannot coexist on
the same instance of an Embedix distribution, because they both provide
grep.

    $ecd->provides()

=item requires

This is a list of libraries, files, provides, and other nodes required
by the current node.

    $ecd->requires()

=item keeplist

This is a list of files and directories provided by a component or
option.

    $ecd->keeplist()

=item choicelist

This is used for options in the kernel.

    $ecd->choicelist()

=item trideps

This is used for options in the kernel.  

    $ecd->trideps()

=back

=head2 Accessors That Take Named Attributes

The most general kind of accessor takes the name of an attribute as a
parameter and gets or sets it.

=over 4

=item getAttribute($name)

This gets the attribute called $name.

    $val = $ecd->getAttribute($name)

=item setAttribute($name, $value)

This sets the attribute called $name to $value.

    $ecd->setAttribute($name, $value)

=back

=head2 Utility Methods

=over 4

=item toString(indent => 0, shiftwidth => 4)


=item getNodeClass

This returns the node class (i.e. Group, Component, Option, or Autovar) of
an Embedix::ECD object.  It differs from the B<ref()> operator in that
the string "Embedix::ECD::" is omitted from the returned value.

    $name = $ecd->getNodeClass()

=item getFormatOptions(@opt);

This is used internally by implementations of C<toString()> to compute
and return spacing information based on the formatting parameters passed
to it.

    $opt_hash_ref = $ecd->getFormatOptions(@opt);

=item attributeToString($opt_hash_ref);

This is used internally by implementations of C<toString()> to render a
node's attributes.

    $string = $ecd->attributeToString($opt_hash_ref);

=back
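
The relationship between C<ref()> and C<getNodeClass()> can be shown
with a few lines of perl.  This is a sketch of the described behaviour
only; C<node_class> is a hypothetical helper, and the object is an
empty hash blessed into the class name so that the example runs without
the module installed.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the common prefix from an object's class name.
sub node_class {
    my ($obj) = @_;
    (my $class = ref $obj) =~ s/^Embedix::ECD:://;
    return $class;
}

# bless does not require the package to be loaded.
my $option = bless {}, 'Embedix::ECD::Option';

print ref($option), "\n";
print node_class($option), "\n";
```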

=head1 CLASS VARIABLES

You shouldn't be touching these.  This is just here for your
information.

=over 4

=item Embedix::ECD::__grammar

This scalar contains the grammar for ECD files.

=item Embedix::ECD::__parser

This contains an instance of Parse::RecDescent.

=back

=head1 DIAGNOSTICS

=over 4

=item $line: was expecting $TAGNAME, but found $CRAP instead.

This error occurs whenever an imbalanced tag is found.

=item $line: $ATTRIBUTE not allowed in $NODE_TYPE

not implemented

=back

=head1 BUGS

This parser becomes exponentially slower as the size of ECD data
increases.  busybox.ecd takes 30 seconds to parse.
Don't even try to parse linux.ecd -- it will sit there for hours
just sucking CPU before it ultimately fails and gives you back
nothing.  I don't know if there's anything I can do about it.

I have noticed that XML::Parser (which wraps around the C library,
expat) is 60 times faster than my Parse::RecDescent-based parser
when reading busybox.ecd.  I really want to take advantage of this.

=head1 COPYRIGHT

Copyright (c) 2000,2001 John BEPPU.  All rights reserved.  This program is
free software; you can redistribute it and/or modify it under the same
terms as Perl itself.

=head1 AUTHOR

John BEPPU <beppu@lineo.com>

=head1 SEE ALSO

=over 4

=item related libraries and programs

C<ecdlib.py(3)>, C<config2ecd(1)>, C<tw(1)>

=item related perl modules

Embedix::ECD::XMLv1(3pm)

=item CML2

The Configuration Menu Language is a constraint-based language
developed by Eric Raymond in an attempt to simplify the process of
configuring the Linux kernel.

    http://www.tuxedo.org/~esr/kbuild/

=item CDL

The Component Description Language was developed by Cygnus to support
configurable compilation for the eCos operating system.

    http://sourceware.cygnus.com/ecos/

=item the latest version

    http://opensource.lineo.com/cgi-bin/cvsweb/pm/Embedix/ECD/

=back

=cut

# $Id: ECD.pm,v 1.9 2001/02/21 21:04:58 beppu Exp $


