Data-CTable


                  mac     "\x0D"        CR      13       "\015"       ^M
                  unix    "\x0A"        LF      10       "\012"       ^J

        See the section LINE ENDINGS, below, for accessor methods and
        conversion utilities that help you get/set this parameter in either
        symbolic format or string format as you prefer.

    _FDelimiter ||= undef;
        _FDelimiter is the field delimiter between field names in the header
        row (if any) and also between fields in the body of the file. If
        undef, read() will try to guess whether it is tab "\t" or comma
        ",", and set this parameter accordingly. If there is only one
        field in the file, then comma is assumed by read() and will be used
        by write().

        To guess the delimiter, the program looks for the first comma or tab
        character in the header row (if present) or in the first record.
        Whichever character is found first is assumed to be the delimiter.
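
        The guessing rule above can be sketched in a few lines of plain
        Perl. This is only an illustration, not the module's actual code:
        guess_delimiter() is a hypothetical helper name, and the fallback
        to comma mirrors the single-field case described earlier.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::CTable's API): returns the
# first comma or tab found in $line, or "," when neither occurs.
sub guess_delimiter {
    my ($line) = @_;
    return $line =~ /([,\t])/ ? $1 : ",";
}

# The tab appears before the comma here, so tab is chosen.
my $header = "First\tLast, Jr.\tEmail";
my $delim  = guess_delimiter($header);
printf "Guessed delimiter: %s\n", $delim eq "\t" ? "tab" : "comma";
```

        Note that a comma inside the first field of a tab-delimited file
        would fool this rule, which is one reason to set _FDelimiter
        explicitly when the format is known.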

        If you don't want the program to guess, or you have a data file
        format that uses a custom delimiter, specify the delimiter
        explicitly in the object or when calling read() or make a subclass
        that initializes this value differently. On write(), this will
        default to comma if it is empty or undef.

    _QuoteFields = undef unless exists
        _QuoteFields controls how field values are quoted by write() when
        writing the table to a delimited text file.

        An undef value (the default) means "auto" -- each field is checked
        individually and if it contains either the _FDelimiter character or
        a double-quote character, the field value will be surrounded by
        double-quotes as it is written to the file. This method is slower to
        write but faster to read, and may make the output easier for humans
        to read.

        A true value means always put double-quotes around every field
        value. This mode is faster to write but slower to read.

        A zero value means never to use double-quotes around field values
        and not to check for the need to use them. This method is the
        fastest to read and write. You may use it when you are certain that
        your data can't contain any special characters. However, if you're
        wrong, this mode will produce a corrupted file in the event that one
        of the fields does contain the active delimiter (such as comma or
        tab) or a quote.
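
        As a rough sketch of the three modes, the following plain-Perl
        helper quotes a single field value. quote_field() is a
        hypothetical name, and doubling embedded double-quotes is an
        assumption (the common CSV convention); the text above does not
        say how write() escapes them.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the three _QuoteFields modes.
# Assumption: embedded double-quotes are escaped by doubling them.
sub quote_field {
    my ($value, $delim, $mode) = @_;

    # Zero: never quote and never check.
    return $value if defined $mode && !$mode;

    # undef ("auto"): quote only when the delimiter or a quote is present.
    if (!defined $mode) {
        my $needs_quotes = index($value, $delim) >= 0
                        || index($value, '"')    >= 0;
        return $value unless $needs_quotes;
    }

    # True, or auto mode with a special character: always quote.
    (my $escaped = $value) =~ s/"/""/g;
    return qq{"$escaped"};
}

# Print the same three values under each mode.
for my $mode (undef, 1, 0) {
    my @out = map { quote_field($_, ",", $mode) } ("plain", "a,b", 'say "hi"');
    print join(",", @out), "\n";
}
```

        In the last line printed (mode 0) the value a,b is no longer
        distinguishable from two separate fields, which is the corruption
        the warning above describes.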

    _HeaderRow = 1 unless exists
        _HeaderRow is a boolean that says whether to expect a header row in
        data files. The default is true: a header row is required. If false,
        _FieldList MUST be present before calling read() or an error will be
        generated. In this latter case, _FieldList will be assumed to give
        the correct names of the fields in the file, in order, before the
        file is read. In other words, the object expects that either a) it
        can get the field names from the file's header row or b) you will
        supply them before read() opens the file.
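
        The choice between the two sources of field names might be
        sketched as follows. resolve_field_names() is a hypothetical
        helper, not the module's API, and the naive split ignores
        quoting for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper: $header_row plays the role of _HeaderRow and
# $field_list the role of _FieldList. Returns an array ref of names.
sub resolve_field_names {
    my ($header_row, $field_list, $first_line, $delim) = @_;

    # a) Take the names from the file's header row.
    if ($header_row) {
        return [ split /\Q$delim\E/, $first_line, -1 ];
    }

    # b) No header row: the caller must have supplied the names.
    die "_FieldList must be set before reading a file without a header row\n"
        unless $field_list && @$field_list;
    return $field_list;
}

my $names = resolve_field_names(0, [qw(First Last)], "Ann,Lee", ",");
print join(" ", @$names), "\n";
```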

  Encoding of return characters within fields

    _ReturnMap = 1 unless exists
        _ReturnMap says that returns embedded in fields should be decoded on
        read() and encoded again on write(). The industry-standard encoding
        for embedded returns is ^K (ascii 11 -- but see next setting to
        change it). This defaults to true but can be turned off if you want
        data untouched by read(). This setting has no effect on data files
        where no fields contain embedded returns. However, it is vital to
        leave this option ON when writing any data file whose fields could
        contain embedded returns -- if you have such data and call write()
        with _ReturnMap turned off, the resulting file will be an invalid
        Merge/CSV file and might not be re-readable.

        When these fields are decoded on read(), encoded returns are
        converted to "\n" in memory, whatever its interpretation may be on
        the current platform (\x0A on Unix or DOS; \x0D on MacPerl).
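
        A minimal sketch of the decoding step, assuming the default ^K
        encoding. decode_returns() is a hypothetical helper name used
        only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each encoded return with "\n".
# $encoding stands in for the encoding character and defaults to "\x0B" (^K).
sub decode_returns {
    my ($field, $encoding) = @_;
    $encoding = "\x0B" unless defined $encoding && length $encoding;
    $field =~ s/\Q$encoding\E/\n/g;
    return $field;
}

my $decoded = decode_returns("line one\x0Bline two");
print $decoded eq "line one\nline two" ? "decoded\n" : "unexpected\n";
```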

        IMPORTANT NOTE: When these fields are encoded by write(), any
        occurrence of the current _LineEnding being used to write the file
        is searched and encoded FIRST, and THEN, any occurrence of "\n" is
        also searched and encoded. For example, if using mac line endings
        (^M) to write a file on a Unix machine, any ^M characters in fields
        will be encoded, and then any "\n" (^J) characters will ALSO be
        encoded. This may not be what you want, so be sure you know how your
        data is encoded in cases where your field values might contain any
        ^J and/or ^M characters.
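
        The two-pass order described above can be sketched like this.
        encode_returns() is a hypothetical helper, not the module's own
        routine, and the ^K default is taken from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper: encode returns in one field value before writing.
sub encode_returns {
    my ($field, $line_ending, $encoding) = @_;
    $encoding = "\x0B" unless defined $encoding && length $encoding;

    # Pass 1: the line ending that will be used to write the file.
    $field =~ s/\Q$line_ending\E/$encoding/g;

    # Pass 2: the current platform's "\n".
    $field =~ s/\n/$encoding/g;

    return $field;
}

# Writing with mac line endings (^M): the ^M and the "\n" are both encoded.
my $encoded = encode_returns("one\x0Dtwo\nthree", "\x0D");
print $encoded eq "one\x0Btwo\x0Bthree" ? "both encoded\n" : "unexpected\n";
```

        Because both passes map to the same character, the distinction
        between the original ^M and ^J is lost once the file is written.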

        IMPORTANT NOTE: If you turn _ReturnMap off, fields with returns in
        them will still be double-quoted correctly. Some parsers of tab- or
        comma-delimited files are able to support reading such files.
        HOWEVER, the parser in this module's read() method DOES NOT
        currently support reading files in which a single field value
        appears to span multiple lines in the file. If you have a need to
        read such a file, you may need to write your own parser as a
        subclass of this module.

    _ReturnEncoding ||= "\x0B";
        This is the default encoding to assume when embedding return
        characters within fields. The industry standard is "\x0B" (ascii 11
        / octal \013 / ^K) so you should probably not ever change this
        setting.

        When fields are encoded on write(), "\n" is converted to this
        value. Note that different platforms use different ascii values for
        "\n", which is another good reason to leave the _ReturnMap
        feature enabled when calling write().

        To summarize: this module likes to assume, and you should too, that
        returns in data files on disk are encoded as "\x0B", but once loaded
        into memory, they are encoded as the current platform's value of
        "\n".

    _MacRomanMap = undef unless exists
        Data::CTable assumes by default that you want field data in memory
        to be in the ISO 8859-1 character set (the standard for Latin 1
        Roman characters on Unix and Windows in the English and Western
        European languages -- and also the default encoding for HTML Web
        pages).

        _MacRomanMap controls the module's optional mapping of Roman
        characters from Mac format on disk to ISO format in memory when
        reading and writing data files. These settings are recognized:

                undef   ## Auto: Read/write Mac chars if using Mac line endings  
                1       ## On:   Assume Mac char set in all fields
                0       ## Off:  Don't do any character mapping at all

        The default setting is undef, which enables "Auto" mode: files found
        to contain Mac line endings will be assumed to contain Mac
        upper-ASCII characters and will be mapped to ISO on read(); and
        files to be written with Mac line endings will be mapped back from ISO
        to Mac format on write().
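
        The three settings can be sketched as a small decision function
        plus a conversion. Both helper names are hypothetical, and the
        core Encode module's "MacRoman" table is used here only as a
        stand-in for whatever mapping the module applies internally.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper mirroring the three _MacRomanMap settings.
sub should_map_mac_chars {
    my ($mac_roman_map, $line_ending) = @_;
    if (!defined $mac_roman_map) {
        # Auto: map only when Mac line endings (CR) are in use.
        return $line_ending eq "\x0D" ? 1 : 0;
    }
    return $mac_roman_map ? 1 : 0;    # 1 = On, 0 = Off
}

# Assumption: Encode's MacRoman table approximates the module's mapping.
sub mac_to_iso {
    my ($bytes) = @_;
    return encode("iso-8859-1", decode("MacRoman", $bytes));
}

# 0x8E is e-acute in Mac Roman; ISO 8859-1 uses 0xE9 for the same letter.
my $field = "caf\x8E";
$field = mac_to_iso($field) if should_map_mac_chars(undef, "\x0D");
printf "%vX\n", $field;
```

        Mac Roman characters with no ISO 8859-1 equivalent come out of
        this Encode-based sketch as a substitution character.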

        If your data uses any non-Latin-1 character sets, or binary data, or


