XS-libdwarf

libdwarf-code-0.11.1/doc/libdwarf.dox  view on Meta::CPAN


    For a simple example of this
    @see jitreader

    But the @e libdwarf feature can be used in a wide variety of ways.

    For example, the DWARF data could be kept in
    simple files of bytes on the internet or on the
    local network. Or, if files can be written locally,
    each section could be kept in a simple stream
    of bytes in the local file system.

    Another example is a non-standard file system,
    or file format, with the intent of obfuscating
    the file or the DWARF.

    For this to work the code generator must generate
    standard DWARF.

    Overall the idea is a simple one: You write a
    small handful of functions and supply function
    pointers and code implementing the functions.
    These are part of your application or library,
    not part of @e libdwarf.

    You set up a little bit of data with that
    code (all described below) and then you
    have essentially written the dwarf_init_path
    equivalent: you can access compilation units,
    line tables, and so on, and the standard @e libdwarf
    function calls work.

    Data you need to create involves these types.
    What follows describes how to fill them in and
    how to make them work for you.

    @code
    typedef struct Dwarf_Obj_Access_Interface_a_s
        Dwarf_Obj_Access_Interface_a;
    struct Dwarf_Obj_Access_Interface_a_s {
        void*                             ai_object;
        const Dwarf_Obj_Access_Methods_a *ai_methods;
    };

    typedef struct Dwarf_Obj_Access_Methods_a_s
        Dwarf_Obj_Access_Methods_a;
    struct Dwarf_Obj_Access_Methods_a_s {
        int    (*om_get_section_info)(void* obj,
            Dwarf_Unsigned section_index,
            Dwarf_Obj_Access_Section_a* return_section,
            int* error);
        Dwarf_Small      (*om_get_byte_order)(void* obj);
        Dwarf_Small      (*om_get_length_size)(void* obj);
        Dwarf_Small      (*om_get_pointer_size)(void* obj);
        Dwarf_Unsigned   (*om_get_filesize)(void* obj);

        Dwarf_Unsigned   (*om_get_section_count)(void* obj);
        int              (*om_load_section)(void* obj,
            Dwarf_Unsigned section_index,
            Dwarf_Small** return_data, int* error);
        int              (*om_relocate_a_section)(void* obj,
            Dwarf_Unsigned section_index,
            Dwarf_Debug dbg,
            int* error);
    };

    typedef struct Dwarf_Obj_Access_Section_a_s
        Dwarf_Obj_Access_Section_a;
    struct Dwarf_Obj_Access_Section_a_s {
        const char*    as_name;
        Dwarf_Unsigned as_type;
        Dwarf_Unsigned as_flags;
        Dwarf_Addr     as_addr;
        Dwarf_Unsigned as_offset;
        Dwarf_Unsigned as_size;
        Dwarf_Unsigned as_link;
        Dwarf_Unsigned as_info;
        Dwarf_Unsigned as_addralign;
        Dwarf_Unsigned as_entrysize;
    };
    @endcode

    @b Dwarf_Obj_Access_Section_a:
    Your implementation of a @b om_get_section_info
    must fill in a few fields for @e libdwarf.
    The fields here are
    standard Elf, but for most you can just use
    the value zero.  We assume here you will not be
    doing relocations at runtime.

    @b as_name: Here you set a section name via
    the pointer.  The section names must be names
    as defined in the DWARF standard, so if such do
    not appear in your data you have to create the
    strings yourself.

    @b as_type: Fill in zero.

    @b as_flags: Fill in zero.

    @b as_addr: Fill in the address, in local memory,
    where the bytes of the section are.

    @b as_offset: Fill in zero.

    @b as_size: Fill in the size, in bytes,
    of the section you are telling @e libdwarf about.

    @b as_link: Fill in zero.

    @b as_info: Fill in zero.

    @b as_addralign: Fill in zero.

    @b as_entrysize: Fill in one (1).
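
    As a minimal sketch of the field rules above, the following
    shows one way an om_get_section_info implementation might
    fill in the section record for sections already held in memory.
    The typedefs are stand-ins so the fragment compiles by itself
    (real code would include libdwarf.h instead), and
    My_Section, My_Object, and my_get_section_info are
    hypothetical names invented for this illustration.
    Returning DW_DLV_NO_ENTRY for an out-of-range index is
    a choice made by this sketch.

```c
#include <stddef.h>
#include <stdint.h>

// Stand-ins that mirror libdwarf.h so this sketch compiles alone.
// Real code would include libdwarf.h instead.
typedef unsigned long long Dwarf_Unsigned;
typedef unsigned long long Dwarf_Addr;
#define DW_DLV_NO_ENTRY (-1)
#define DW_DLV_OK       0

typedef struct Dwarf_Obj_Access_Section_a_s {
    const char*    as_name;
    Dwarf_Unsigned as_type;
    Dwarf_Unsigned as_flags;
    Dwarf_Addr     as_addr;
    Dwarf_Unsigned as_offset;
    Dwarf_Unsigned as_size;
    Dwarf_Unsigned as_link;
    Dwarf_Unsigned as_info;
    Dwarf_Unsigned as_addralign;
    Dwarf_Unsigned as_entrysize;
} Dwarf_Obj_Access_Section_a;

// Hypothetical application data: one record per in-memory section.
typedef struct My_Section {
    const char          *name;  // DWARF-standard name, e.g. ".debug_info"
    const unsigned char *bytes; // where the section content lives
    size_t               size;  // content length in bytes
} My_Section;

typedef struct My_Object {
    const My_Section *sections;
    Dwarf_Unsigned    count;
} My_Object;

// Hypothetical om_get_section_info implementation.
int my_get_section_info(void *obj, Dwarf_Unsigned section_index,
    Dwarf_Obj_Access_Section_a *return_section, int *error)
{
    const My_Object  *my = (const My_Object *)obj;
    const My_Section *s = 0;

    (void)error; // this sketch has no error path
    if (section_index >= my->count) {
        return DW_DLV_NO_ENTRY; // choice made by this sketch
    }
    s = &my->sections[section_index];
    return_section->as_name      = s->name;
    return_section->as_type      = 0;
    return_section->as_flags     = 0;
    return_section->as_addr      = (Dwarf_Addr)(uintptr_t)s->bytes;
    return_section->as_offset    = 0;
    return_section->as_size      = s->size;
    return_section->as_link      = 0;
    return_section->as_info      = 0;
    return_section->as_addralign = 0;
    return_section->as_entrysize = 1;
    return DW_DLV_OK;
}
```

    Only as_name, as_addr, and as_size carry real information
    here; every other field follows the fixed values listed above.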

    @b Dwarf_Obj_Access_Methods_a_s:
    The functions we need to access object data
    from @e libdwarf are declared here.

    In these function pointer declarations
    'void *obj' is intended to be a pointer (the ai_object field in
    Dwarf_Obj_Access_Interface_a_s) that hides the
    library-specific and object-specific data that
    makes it possible to handle multiple object
    formats and multiple libraries.  It is not
    required that one handles multiple such in a
    single @e libdwarf archive/shared-library
    (but not ruled out either).  See
    dwarf_elf_object_access_internals_t and
    dwarf_elf_access.c for an example.

    Usually the struct @b Dwarf_Obj_Access_Methods_a_s is
    statically defined
    and the function pointers are set at
    compile time.

    The om_get_filesize member was added on September 4, 2021.
    Its position is NOT at the end of the list.
    All member names now have the om_ prefix.
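
    The following sketch illustrates a statically defined
    method table of the kind described above. It is not code
    from @e libdwarf: the typedefs are stand-ins so the fragment
    compiles by itself, and My_Object and every my_ name are
    hypothetical. Designated initializers are used so that the
    position of om_get_filesize in the struct does not matter.
    Leaving om_relocate_a_section null (no runtime relocation)
    is an assumption of this sketch.

```c
#include <stddef.h>

// Stand-ins that mirror libdwarf.h so this sketch compiles alone.
typedef unsigned long long Dwarf_Unsigned;
typedef unsigned char      Dwarf_Small;
typedef struct Dwarf_Debug_s *Dwarf_Debug;
typedef struct Dwarf_Obj_Access_Section_a_s Dwarf_Obj_Access_Section_a;
#define DW_DLV_NO_ENTRY (-1)
#define DW_DLV_OK       0

typedef struct Dwarf_Obj_Access_Methods_a_s {
    int (*om_get_section_info)(void*, Dwarf_Unsigned,
        Dwarf_Obj_Access_Section_a*, int*);
    Dwarf_Small    (*om_get_byte_order)(void*);
    Dwarf_Small    (*om_get_length_size)(void*);
    Dwarf_Small    (*om_get_pointer_size)(void*);
    Dwarf_Unsigned (*om_get_filesize)(void*);
    Dwarf_Unsigned (*om_get_section_count)(void*);
    int (*om_load_section)(void*, Dwarf_Unsigned, Dwarf_Small**, int*);
    int (*om_relocate_a_section)(void*, Dwarf_Unsigned, Dwarf_Debug, int*);
} Dwarf_Obj_Access_Methods_a;

typedef struct Dwarf_Obj_Access_Interface_a_s {
    void*                             ai_object;
    const Dwarf_Obj_Access_Methods_a *ai_methods;
} Dwarf_Obj_Access_Interface_a;

// Hypothetical application object: sections already in memory.
typedef struct My_Object {
    Dwarf_Small  **section_bytes;
    Dwarf_Unsigned section_count;
    Dwarf_Unsigned total_size;
    Dwarf_Small    byte_order;   // endianness value libdwarf expects
    Dwarf_Small    length_size;  // size of a length field, in bytes
    Dwarf_Small    pointer_size; // target address size, in bytes
} My_Object;

static int my_section_info(void *o, Dwarf_Unsigned i,
    Dwarf_Obj_Access_Section_a *out, int *err)
{
    // Stub: filling in *out is omitted; only the wiring is shown.
    (void)o; (void)i; (void)out; (void)err;
    return DW_DLV_NO_ENTRY;
}
static Dwarf_Small my_byte_order(void *o)
{ return ((My_Object *)o)->byte_order; }
static Dwarf_Small my_length_size(void *o)
{ return ((My_Object *)o)->length_size; }
static Dwarf_Small my_pointer_size(void *o)
{ return ((My_Object *)o)->pointer_size; }
static Dwarf_Unsigned my_filesize(void *o)
{ return ((My_Object *)o)->total_size; }
static Dwarf_Unsigned my_section_count(void *o)
{ return ((My_Object *)o)->section_count; }

static int my_load_section(void *o, Dwarf_Unsigned i,
    Dwarf_Small **data, int *err)
{
    My_Object *my = (My_Object *)o;
    (void)err;
    if (i >= my->section_count) {
        return DW_DLV_NO_ENTRY;
    }
    *data = my->section_bytes[i];
    return DW_DLV_OK;
}

// Statically defined: the pointers are fixed at compile time.
// om_relocate_a_section is left null (assumption of this sketch).
static const Dwarf_Obj_Access_Methods_a my_methods = {
    .om_get_section_info  = my_section_info,
    .om_get_byte_order    = my_byte_order,
    .om_get_length_size   = my_length_size,
    .om_get_pointer_size  = my_pointer_size,
    .om_get_filesize      = my_filesize,
    .om_get_section_count = my_section_count,
    .om_load_section      = my_load_section
};

// Pair the opaque object pointer with the shared method table.
Dwarf_Obj_Access_Interface_a my_make_interface(My_Object *obj)
{
    Dwarf_Obj_Access_Interface_a ia;
    ia.ai_object  = obj;
    ia.ai_methods = &my_methods;
    return ia;
}
```

    In a real program the resulting interface struct would then
    be handed to the @e libdwarf object-init call
    (dwarf_object_init_b() in recent releases; see libdwarf.h
    for the exact declaration).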

    @section dwsec_sectiongroup Section Groups: Split Dwarf, COMDAT groups

    A typical executable or shared object is unlikely
    to have any section groups, and in that case
    what follows is irrelevant and unimportant.

    @b COMDAT groups are defined by the Elf ABI and 
    enable compilers and linkers

    @see dwarf_sec_group_sizes
    @see dwarf_sec_group_map

    If an object file has multiple groups
    @e libdwarf will not reveal contents of more
    than the single requested group with a given
    dwarf_init_path() call.
    One must pass in another groupnumber
    to another dwarf_init_path(), meaning initialize
    a new Dwarf_Debug,  to get @e libdwarf to
    access that group.

    When opening a Dwarf_Debug the following applies:

    If DW_GROUPNUMBER_ANY is passed in @e libdwarf will
    choose either of DW_GROUPNUMBER_BASE(1) or
    DW_GROUPNUMBER_DWO (2) depending on the object
    content. If both groups one and two are in the
    object @e libdwarf will choose DW_GROUPNUMBER_BASE.

    If DW_GROUPNUMBER_BASE is passed in @e libdwarf
    will choose it if non-split DWARF is in the object, else
    the init call will return DW_DLV_NO_ENTRY.

    If DW_GROUPNUMBER_DWO is passed in @e libdwarf
    will choose it if .dwo sections are in the object, else
    the init call will return DW_DLV_NO_ENTRY.

    If a groupnumber greater than two is passed in
    @e libdwarf accepts it, whether any sections
    corresponding to that groupnumber exist or not.
    If the groupnumber is not an actual group
    the init call will return DW_DLV_NO_ENTRY.
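
    The selection rules above can be summarized by a small
    model. This is only a sketch of the documented behavior,
    not the library's implementation. Object_Groups and
    model_choose_group are hypothetical names, COMDAT groups
    are assumed to be numbered contiguously from 3, and the
    result for DW_GROUPNUMBER_ANY when neither group one nor
    group two is present is not stated in the text, so the
    sketch simply returns DW_DLV_NO_ENTRY in that case.

```c
// Values as defined in libdwarf.h, repeated so the sketch
// compiles alone.
#define DW_DLV_NO_ENTRY     (-1)
#define DW_DLV_OK           0
#define DW_GROUPNUMBER_ANY  0
#define DW_GROUPNUMBER_BASE 1
#define DW_GROUPNUMBER_DWO  2

// Hypothetical summary of what an object file contains.
typedef struct Object_Groups {
    int      has_base;   // non-split DWARF sections are present
    int      has_dwo;    // .dwo sections are present
    unsigned max_comdat; // highest COMDAT group number, 0 if none
} Object_Groups;

// Model of which group an init call would settle on.
int model_choose_group(const Object_Groups *og, unsigned requested,
    unsigned *chosen)
{
    switch (requested) {
    case DW_GROUPNUMBER_ANY:
        if (og->has_base) {
            *chosen = DW_GROUPNUMBER_BASE; // BASE wins if both exist
            return DW_DLV_OK;
        }
        if (og->has_dwo) {
            *chosen = DW_GROUPNUMBER_DWO;
            return DW_DLV_OK;
        }
        return DW_DLV_NO_ENTRY; // assumption: not stated in the text
    case DW_GROUPNUMBER_BASE:
        if (!og->has_base) {
            return DW_DLV_NO_ENTRY;
        }
        *chosen = DW_GROUPNUMBER_BASE;
        return DW_DLV_OK;
    case DW_GROUPNUMBER_DWO:
        if (!og->has_dwo) {
            return DW_DLV_NO_ENTRY;
        }
        *chosen = DW_GROUPNUMBER_DWO;
        return DW_DLV_OK;
    default:
        // Any number above two is accepted as a request, but it
        // only succeeds if such a group actually exists.
        if (requested <= og->max_comdat) {
            *chosen = requested;
            return DW_DLV_OK;
        }
        return DW_DLV_NO_ENTRY;
    }
}
```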

    For information on groups, running "dwarfdump -i"
    on an object file will show all section group
    information @b unless the object file is a simple
    standard object with no .dwo sections and no
    COMDAT groups (in which case the output will be
    silent on groups). Look for <b> Section Groups
    data </b> in the dwarfdump output.  The groups
    information appears very early in the
    dwarfdump output.

    Sections that are part of an Elf COMDAT GROUP are
    assigned a group number > 2.  There can be many
    such COMDAT groups in an object file (but none
    in an executable or shared object).  Each such
    COMDAT group will have a small set of sections
    in it and each section in such a group will be
    assigned the same group number by @e libdwarf.

    Sections that are in a .dwp or .dwo object file
    are assigned to DW_GROUPNUMBER_DWO.

    Sections that are not in a .dwp package file,
    are not .dwo sections, and are not in a COMDAT group
    are assigned DW_GROUPNUMBER_BASE.

    At least one compiler relies on relocations to
    identify COMDAT groups, but the compiler authors
    do not publicly document how this works so we
    ignore such (these COMDAT groups will result in
    @e libdwarf returning DW_DLV_ERROR).

    Popular compilers and tools are using such
    sections. There is no detailed documentation that
    we can find (so far) on how the COMDAT section
    groups are used, so @e libdwarf is based on
    observations of what compilers generate.

    @section dwsec_separatedebug  Details on separate DWARF object access

    There are, at present, three distinct approaches
    in use to put DWARF information into separate
    objects to significantly shrink the size of
    the executable. All of them involve identifying
    a separate file.

    Split Dwarf is one method. It defines the attribute
    @b DW_AT_dwo_name, which (if present) holds
    a file-system-appropriate
    name of the split object containing most of the DWARF.

    The second is macOS dSYM.  It is a convention of placing
    the DWARF-containing object (separate from the
    object containing code) in a specific subdirectory
    tree.

    The third involves GNU debuglink and GNU
    debug_id. These are two distinct ways (outside
    of DWARF) to provide
    names of alternative DWARF-containing objects
    elsewhere in a file system.

    If one initializes a Dwarf_Debug object with
    dwarf_init_path() or dwarf_init_path_dl()
    appropriately, @e libdwarf will automatically
    open the alternate dSYM or
    debuglink/debug_id object, that is, the object with
    most of the DWARF.

    @see https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

    @e libdwarf provides means to automatically read
    the alternate object (in place of the one named
    in the init call) or to suppress that and read
    the named object file.

    @code
    int dwarf_init_path(const char * dw_path,
    char *            dw_true_path_out_buffer,
    unsigned int      dw_true_path_bufferlen,
    unsigned int      dw_groupnumber,
    Dwarf_Handler     dw_errhand,
    Dwarf_Ptr         dw_errarg,
    Dwarf_Debug*      dw_dbg,
    Dwarf_Error*      dw_error);

    int dwarf_init_path_dl(const char *dw_path,

    section.
    An example of use is in doc/checkexamples.c (see examplev).

    <b>Changes 0.9.2 to 0.10.1</b>

    Released 01 July 2024
    (Release 0.10.0 was missing a CMakeLists.txt file
    and is withdrawn).

    Added API function
    dwarf_get_locdesc_entry_e() to allow dwarfdump
    to report some data from .debug_loclists more
    completely -- it reports a byte length of each
    loclist item. This is of little interest to anyone,
    surely.  dwarf_get_locdesc_entry_d() is still
    what you should be using.

    dwarf_debug_addr_table() now supports reading
    the DWARF4 GNU extension .debug_addr table.

    A heuristic sanity check for PE object files was too conservative
    in limiting VirtualSize to 200MB.  A library user has
    an exe with .debug_info size of over 200MB.
    Increased the limit to be 2000MB and changed the names of the
    errors for the three heuristic checks to include _HEURISTIC_ so it is
    easier to know the kind of error/failure it is.

    When doing a shared-library build with cmake we were not emitting
    the correct .so version names nor setting SONAME with the
    correct version name. This long-standing mistake is now fixed.

    <b>Changes 0.9.1 to 0.9.2</b>

    Version 0.9.2 released 2 April 2024

    Vulnerabilities DW202402-001, DW202402-002, DW202402-003,
    and DW202403-001 could crash @e libdwarf given
    a carefully corrupted (fuzzed) DWARF object file.
    Now the library returns an error
    for these corruptions.

    DW_CFA_high_user (in dwarf.h) was a misspelling.
    Added the correct spelling DW_CFA_hi_user and
    a comment on the incorrect spelling.

    <b>Changes 0.9.0 to 0.9.1</b>

    Version 0.9.1 released 27 January 2024

    The abbreviation code type returned by
    dwarf_die_abbrev_code() changed from <b>int</b>
    to <b>Dwarf_Unsigned</b> as abbrev codes are
    not constrained by the DWARF Standard.

    The section count returned by dwarf_get_section_count()
    is now of type <b>Dwarf_Unsigned</b>. The previous type
    of <b>int</b> never made sense in @e libdwarf.
    Callers will, in practice, see the same value as before.

    All type-warnings issued by MSVC have been fixed.

    Problems reading Mach-O (Apple) relocatable
    object files have been fixed.

    Each of the build systems available now has an option
    which eliminates @e libdwarf references to the
    object section decompression libraries.
    See the respective READMEs.

    <b>Changes 0.8.0 to 0.9.0</b>

    Version 0.9.0 released 8 December 2023

    Added functions (rarely needed) for callers
    with special requirements.
    Added dwarf_get_section_info_by_name_a() and
    dwarf_get_section_info_by_index_a() which add
    dw_section_flags pointer argument to return
    the object section file flags (whose meaning
    depends entirely on the object file format),
    and dw_section_offset pointer argument to return
    the object-relevant offset of the section
    (here too the meaning depends on the object format).
    Also added  dwarf_machine_architecture() which returns
    a few top level data items about the object
    @e libdwarf has opened, including the 'machine' and 'flags'
    from object headers (all supported object types).

    This release adds new library functions
    dwarf_next_cu_header_e()
    and dwarf_siblingof_c().
    Used exactly as documented, dwarf_next_cu_header_d()
    and dwarf_siblingof_b() work fine and continue to
    be supported for the foreseeable future. However,
    they are easy to misuse, as the requirement that
    dwarf_siblingof_b() be called immediately after
    a successful call to dwarf_next_cu_header_d()
    was never stated and that dependency was impossible
    to enforce. The dependency was an API mistake
    made in 1992.

    So dwarf_next_cu_header_e() now returns the
    compilation-unit DIE as well as header
    data, and dwarf_siblingof_c() is not needed
    except to traverse sibling DIEs
    (the compilation-unit DIE by definition has no siblings).

    Changes were required to support Mach-O (Apple)
    universal binaries,
    which were not readable by earlier versions of the library.

    We have new library functions
    dwarf_init_path_a(),
    dwarf_init_path_dl_a(), and
    dwarf_get_universalbinary_count().

    The first two allow a caller to specify which
    (numbering from zero) object file to
    report on by adding a new argument dw_universalnumber.
    Passing zero as the dw_universalnumber argument
    is always safe.

    as a default since there is no dw_universalnumber
    argument possible.

    For improved performance in reading Fde data
    when iterating through all usable pc values
    we add dwarf_get_fde_info_for_all_regs3_b(), which
    returns the next pc value with actual frame data.
    We retain dwarf_get_fde_info_for_all_regs3() so
    existing code need not change.

    <b>Changes 0.7.0 to 0.8.0</b>

    v0.8.0 released 2023-09-20

    New functions dwarf_get_fde_info_for_reg3_c(),
    dwarf_get_fde_info_for_cfa_reg3_c() are defined.
    The advantage of the new versions is they correctly
    type the dw_offset argument return value
    as Dwarf_Signed instead of the earlier and incorrect type
    Dwarf_Unsigned.

    The original functions dwarf_get_fde_info_for_reg3_b() and
    dwarf_get_fde_info_for_cfa_reg3_b()
    continue to exist and work for compatibility with
    the previous release.
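
    To illustrate why the signedness matters, the sketch below
    formats the same offset through both types. It is only an
    illustration under stated assumptions: the typedefs are
    stand-ins for the @e libdwarf types, the helper names are
    hypothetical, and the offset of -8 is an invented example
    of the negative offsets frame rules commonly use.

```c
#include <stdio.h>

// Stand-ins that mirror the libdwarf typedefs.
typedef unsigned long long Dwarf_Unsigned;
typedef long long          Dwarf_Signed;

// Hypothetical helper: a correctly typed offset prints with
// its sign, for example "cfa-8".
int format_signed_offset(char *buf, size_t len, Dwarf_Signed off)
{
    return snprintf(buf, len, "cfa%+lld", off);
}

// Hypothetical helper: the same bits seen through the unsigned
// type appear as a very large positive number.
int format_unsigned_offset(char *buf, size_t len, Dwarf_Unsigned off)
{
    return snprintf(buf, len, "cfa+%llu", off);
}
```

    With an offset of -8 the first helper produces "cfa-8",
    while the second, given the same bit pattern, produces
    "cfa+18446744073709551608".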

    For all open() calls for which the O_CLOEXEC flag exists
    we now add that flag to the open() call.
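
    A minimal sketch of that pattern, assuming a POSIX-style
    open(), follows. The helper name open_readonly_cloexec is
    hypothetical and is not a @e libdwarf function. The flag
    is added only where the platform headers define it.

```c
#include <fcntl.h>

// Hypothetical helper: open a file read-only and, where the
// platform defines O_CLOEXEC, ask that the descriptor be closed
// automatically in any exec'd child process.
int open_readonly_cloexec(const char *path)
{
    int flags = O_RDONLY;

#ifdef O_CLOEXEC
    flags |= O_CLOEXEC;
#endif
    return open(path, flags);
}
```

    The preprocessor guard keeps the code buildable on systems
    whose headers do not provide the flag.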

    Vulnerabilities involving reading
    corrupt object files (created by fuzzing)
    have been fixed:
    DW202308-001 (ossfuzz 59576),
    DW202307-001 (ossfuzz 60506),
    DW202306-011 (ossfuzz 59950),
    DW202306-009 (ossfuzz 59755),
    DW202306-006 (ossfuzz 59727),
    DW202306-005 (ossfuzz 59717),
    DW202306-004 (ossfuzz 59695),
    DW202306-002 (ossfuzz 59519),
    DW202306-001 (ossfuzz 59597),
    DW202305-010 (ossfuzz 59478),
    DW202305-009 (ossfuzz 56451),
    DW202305-008 (ossfuzz 56451),
    DW202305-007 (ossfuzz 56474),
    DW202305-006 (ossfuzz 56472),
    DW202305-005 (ossfuzz 56462),
    DW202305-004 (ossfuzz 56446).

    <b>Changes 0.6.0 to 0.7.0</b>

    v0.7.0 released 2023-05-20

    Elf section counts can exceed 16 bits
    (on linux see <b>man 5 elf</b>)
    so some function prototype members
    of struct <b>Dwarf_Obj_Access_Methods_a_s</b>
    changed.
    Specifically, om_get_section_info()
    om_load_section(), and
    om_relocate_a_section()
    now pass section indexes as Dwarf_Unsigned
    instead of Dwarf_Half.
    Without this change executables/objects
    with more than 64K sections cannot
    be read by @e libdwarf.  This is unlikely
    to affect your code since for most users
    @e libdwarf takes care of this and dwarfdump
    is aware of this change.
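
    The effect of the narrower type can be seen in a small
    sketch. The typedefs are stand-ins for the @e libdwarf
    types (Dwarf_Half being a 16-bit unsigned type), and the
    function name is hypothetical.

```c
// Stand-ins that mirror the libdwarf typedefs.
typedef unsigned long long Dwarf_Unsigned;
typedef unsigned short     Dwarf_Half;

// Hypothetical helper: what a callback taking a Dwarf_Half
// index would actually receive for a given section index.
Dwarf_Unsigned index_seen_through_half(Dwarf_Unsigned section_index)
{
    Dwarf_Half narrowed = (Dwarf_Half)section_index;

    return narrowed;
}
```

    For section index 70000 the narrowed value is 4464
    (70000 minus 65536), so the wrong section would be
    addressed; indexes below 65536 pass through unchanged.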

    Two functions have been removed from libdwarf.h
    and the library: dwarf_dnames_abbrev_by_code()
    and dwarf_dnames_abbrev_form_by_index().

    dwarf_dnames_abbrev_by_code() is slow and pointless.
    Use either dwarf_dnames_name() or
    dwarf_dnames_abbrevtable() instead, depending
    on what you want to accomplish.

    dwarf_dnames_abbrev_form_by_index() is not needed,
    was difficult to call due to argument list
    requirements, and never worked.

    <b>Changes 0.5.0 to 0.6.0</b>

    v0.6.0 released 2023-02-20

    The dealloc required by dwarf_offset_list()
    was wrong. The call could crash @e libdwarf
    on systems with 32-bit pointers.
    The new and proper dealloc (for all
    pointer sizes) is
    dwarf_dealloc(dbg,offsetlistptr,DW_DLA_UARRAY);

    A memory leak from dwarf_load_loclists()
    and dwarf_load_rnglists() is fixed and the
    libdwarf-regressiontests error that hid the leak
    has also been fixed.

    A <b>compatibility</b> change affects callers of
    dwarf_dietype_offset(), which on success returns
    the offset of the target of the DW_AT_type attribute
    (if such exists in the Dwarf_Die).  Added a pointer
    argument so the function can (when
    appropriate) return a FALSE value through it,
    indicating the offset refers to the DWARF4 .debug_types
    section, rather than the TRUE value returned when .debug_info
    is the section the offset refers to.
    If anyone was using this function it would fail
    badly (while pretending success)
    with a DWARF4 DW_FORM_ref_sig8 on a DW_AT_type
    attribute from the Dwarf_Die argument.  One will likely
    encounter DWARF4 content so a single correct function
    seemed necessary. New regression tests will ensure
    this will continue to work.

    A <b>compatibility</b> change affects callers of
    dwarf_get_pubtypes().  If an application reads
    .debug_pubtypes there is a <b>compatibility
    break</b>. Such applications must be recompiled
    with latest @e libdwarf, change Dwarf_Type
    declarations to use Dwarf_Global, and can only


