XS-libdwarf
libdwarf-code-0.11.1/doc/libdwarf.dox view on Meta::CPAN
For a simple example of this
@see jitreader
But the @e libdwarf feature can be used in a wide variety of ways.
For example, the DWARF data could be kept in
simple files of bytes on the internet. Or on the
local net. Or if files can be written locally
each section could be kept in a simple stream
of bytes in the local file system.
Another example is a non-standard file system,
or file format, with the intent of obfuscating
the file or the DWARF.
For this to work the code generator must generate
standard DWARF.
Overall the idea is a simple one: You write a
small handful of functions and supply function
pointers and code implementing the functions.
These are part of your application or library,
not part of @e libdwarf.
You set up a little bit of data with that
code (all described below) and then you
have essentially written the dwarf_init_path
equivalent and you can access compilation units,
line tables etc and the standard @e libdwarf
function calls work.
Data you need to create involves these types.
What follows describes how to fill them in and
how to make them work for you.
@code
typedef struct Dwarf_Obj_Access_Interface_a_s
Dwarf_Obj_Access_Interface_a;
struct Dwarf_Obj_Access_Interface_a_s {
void* ai_object;
const Dwarf_Obj_Access_Methods_a *ai_methods;
};
typedef struct Dwarf_Obj_Access_Methods_a_s
Dwarf_Obj_Access_Methods_a;
struct Dwarf_Obj_Access_Methods_a_s {
int (*om_get_section_info)(void* obj,
Dwarf_Unsigned section_index,
Dwarf_Obj_Access_Section_a* return_section,
int* error);
Dwarf_Small (*om_get_byte_order)(void* obj);
Dwarf_Small (*om_get_length_size)(void* obj);
Dwarf_Small (*om_get_pointer_size)(void* obj);
Dwarf_Unsigned (*om_get_filesize)(void* obj);
Dwarf_Unsigned (*om_get_section_count)(void* obj);
int (*om_load_section)(void* obj,
Dwarf_Unsigned section_index,
Dwarf_Small** return_data, int* error);
int (*om_relocate_a_section)(void* obj,
Dwarf_Unsigned section_index,
Dwarf_Debug dbg,
int* error);
};
typedef struct Dwarf_Obj_Access_Section_a_s
Dwarf_Obj_Access_Section_a;
struct Dwarf_Obj_Access_Section_a_s {
const char* as_name;
Dwarf_Unsigned as_type;
Dwarf_Unsigned as_flags;
Dwarf_Addr as_addr;
Dwarf_Unsigned as_offset;
Dwarf_Unsigned as_size;
Dwarf_Unsigned as_link;
Dwarf_Unsigned as_info;
Dwarf_Unsigned as_addralign;
Dwarf_Unsigned as_entrysize;
};
@endcode
@b Dwarf_Obj_Access_Section_a:
Your implementation of a @b om_get_section_info
must fill in a few fields for @e libdwarf.
The fields here are
standard Elf, but for most you can just use
the value zero. We assume here you will not be
doing relocations at runtime.
@b as_name: Here you set a section name via
the pointer. The section names must be names
as defined in the DWARF standard, so if such do
not appear in your data you have to create the
strings yourself.
@b as_type: Fill in zero.
@b as_flags: Fill in zero.
@b as_addr: Fill in the address, in local memory,
where the bytes of the section are.
@b as_offset: Fill in zero.
@b as_size: Fill in the size, in bytes,
of the section you are telling @e libdwarf about.
@b as_link: Fill in zero.
@b as_info: Fill in zero.
@b as_addralign: Fill in zero.
@b as_entrysize: Fill in one(1).
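The field rules above can be sketched as a candidate
@b om_get_section_info function. This is only an illustration:
the typedefs are stand-ins so the sketch compiles without
@e libdwarf installed (real code includes libdwarf.h), and
my_section, my_object, and my_get_section_info are hypothetical
names that are not part of @e libdwarf.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in declarations so this sketch compiles on its own.
   In a real program include libdwarf.h instead. */
typedef unsigned long long Dwarf_Unsigned;
typedef unsigned long long Dwarf_Addr;
typedef unsigned char      Dwarf_Small;
#define DW_DLV_NO_ENTRY -1
#define DW_DLV_OK        0

typedef struct Dwarf_Obj_Access_Section_a_s {
    const char*    as_name;
    Dwarf_Unsigned as_type;
    Dwarf_Unsigned as_flags;
    Dwarf_Addr     as_addr;
    Dwarf_Unsigned as_offset;
    Dwarf_Unsigned as_size;
    Dwarf_Unsigned as_link;
    Dwarf_Unsigned as_info;
    Dwarf_Unsigned as_addralign;
    Dwarf_Unsigned as_entrysize;
} Dwarf_Obj_Access_Section_a;

/* Hypothetical application-side bookkeeping: one entry per section. */
struct my_section {
    const char        *name;  /* DWARF-standard name, e.g. ".debug_info" */
    const Dwarf_Small *bytes; /* section contents in local memory */
    Dwarf_Unsigned     size;  /* number of bytes at 'bytes' */
};

struct my_object {
    const struct my_section *sections;
    Dwarf_Unsigned           count;
};

/* Candidate for the om_get_section_info slot. */
static int
my_get_section_info(void *obj, Dwarf_Unsigned section_index,
    Dwarf_Obj_Access_Section_a *return_section, int *error)
{
    const struct my_object  *mo = (const struct my_object *)obj;
    const struct my_section *s = 0;

    assert(mo && return_section && error);
    *error = 0;
    if (section_index >= mo->count) {
        return DW_DLV_NO_ENTRY;
    }
    s = &mo->sections[section_index];
    return_section->as_name      = s->name;
    return_section->as_type      = 0;
    return_section->as_flags     = 0;
    return_section->as_addr      = (Dwarf_Addr)(uintptr_t)s->bytes;
    return_section->as_offset    = 0;
    return_section->as_size      = s->size;
    return_section->as_link      = 0;
    return_section->as_info      = 0;
    return_section->as_addralign = 0;
    return_section->as_entrysize = 1;
    return DW_DLV_OK;
}
```

Returning DW_DLV_NO_ENTRY for an out-of-range index is an
assumption of this sketch, as is the layout of the
application-side table.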
@b Dwarf_Obj_Access_Methods_a_s:
The functions we need to access object data
from @e libdwarf are declared here.
In these function pointer declarations
'void *obj' is intended to be a pointer (the ai_object field in
Dwarf_Obj_Access_Interface_a_s) that hides the
library-specific and object-specific data that
makes it possible to handle multiple object
formats and multiple libraries. It is not
required that one handles multiple such in a
single @e libdwarf archive/shared-library
(but not ruled out either). See
dwarf_elf_object_access_internals_t and
dwarf_elf_access.c for an example.
Usually the struct @b Dwarf_Obj_Access_Methods_a_s is
statically defined
and the function pointers are set at
compile time.
The om_get_filesize member is new September 4, 2021.
Its position is NOT at the end of the list.
The member names all now have om_ prefix.
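As an illustration of a statically defined method table, here is
a minimal sketch for a hypothetical object holding a single
.debug_info section in memory. The typedefs are stand-ins so the
sketch compiles without @e libdwarf (real code includes
libdwarf.h). Every my_ name, MY_END_LITTLE, the length size of 4,
and the pointer size of 8 are assumptions made for the example.
Designated initializers are used so that the position of
om_get_filesize in the struct cannot be gotten wrong.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins so the sketch compiles alone; real code includes libdwarf.h. */
typedef unsigned long long Dwarf_Unsigned;
typedef unsigned long long Dwarf_Addr;
typedef unsigned char      Dwarf_Small;
typedef struct Dwarf_Debug_s *Dwarf_Debug;
#define DW_DLV_NO_ENTRY -1
#define DW_DLV_OK        0
#define MY_END_LITTLE    2 /* assumption: same value as DW_END_little */

typedef struct Dwarf_Obj_Access_Section_a_s {
    const char*    as_name;
    Dwarf_Unsigned as_type, as_flags;
    Dwarf_Addr     as_addr;
    Dwarf_Unsigned as_offset, as_size, as_link, as_info;
    Dwarf_Unsigned as_addralign, as_entrysize;
} Dwarf_Obj_Access_Section_a;

typedef struct Dwarf_Obj_Access_Methods_a_s {
    int (*om_get_section_info)(void*, Dwarf_Unsigned,
        Dwarf_Obj_Access_Section_a*, int*);
    Dwarf_Small    (*om_get_byte_order)(void*);
    Dwarf_Small    (*om_get_length_size)(void*);
    Dwarf_Small    (*om_get_pointer_size)(void*);
    Dwarf_Unsigned (*om_get_filesize)(void*);
    Dwarf_Unsigned (*om_get_section_count)(void*);
    int (*om_load_section)(void*, Dwarf_Unsigned, Dwarf_Small**, int*);
    int (*om_relocate_a_section)(void*, Dwarf_Unsigned, Dwarf_Debug, int*);
} Dwarf_Obj_Access_Methods_a;

typedef struct Dwarf_Obj_Access_Interface_a_s {
    void *ai_object;
    const Dwarf_Obj_Access_Methods_a *ai_methods;
} Dwarf_Obj_Access_Interface_a;

/* Hypothetical object: one section held in memory. */
struct my_object {
    Dwarf_Small   *bytes;
    Dwarf_Unsigned size;
};

static int my_section_info(void *obj, Dwarf_Unsigned section_index,
    Dwarf_Obj_Access_Section_a *return_section, int *error)
{
    struct my_object *mo = (struct my_object *)obj;
    *error = 0;
    if (section_index != 0) {
        return DW_DLV_NO_ENTRY;
    }
    memset(return_section, 0, sizeof(*return_section));
    return_section->as_name      = ".debug_info";
    return_section->as_addr      = (Dwarf_Addr)(uintptr_t)mo->bytes;
    return_section->as_size      = mo->size;
    return_section->as_entrysize = 1;
    return DW_DLV_OK;
}
static Dwarf_Small my_byte_order(void *obj)   { (void)obj; return MY_END_LITTLE; }
static Dwarf_Small my_length_size(void *obj)  { (void)obj; return 4; }
static Dwarf_Small my_pointer_size(void *obj) { (void)obj; return 8; }
static Dwarf_Unsigned my_filesize(void *obj)
{
    return ((struct my_object *)obj)->size; /* assumption: only one section */
}
static Dwarf_Unsigned my_section_count(void *obj) { (void)obj; return 1; }

static int my_load_section(void *obj, Dwarf_Unsigned section_index,
    Dwarf_Small **return_data, int *error)
{
    struct my_object *mo = (struct my_object *)obj;
    assert(mo && return_data && error);
    *error = 0;
    if (section_index != 0) {
        return DW_DLV_NO_ENTRY;
    }
    *return_data = mo->bytes;
    return DW_DLV_OK;
}

/* The table is static and const: all pointers are set at compile time. */
static const Dwarf_Obj_Access_Methods_a my_methods = {
    .om_get_section_info   = my_section_info,
    .om_get_byte_order     = my_byte_order,
    .om_get_length_size    = my_length_size,
    .om_get_pointer_size   = my_pointer_size,
    .om_get_filesize       = my_filesize,
    .om_get_section_count  = my_section_count,
    .om_load_section       = my_load_section,
    .om_relocate_a_section = 0 /* assumption: no runtime relocation */
};

/* Pairs the per-object data with the shared method table. */
static Dwarf_Obj_Access_Interface_a my_make_interface(struct my_object *mo)
{
    Dwarf_Obj_Access_Interface_a iface = { mo, &my_methods };
    return iface;
}
```

The interface built this way is what one would then hand to the
@e libdwarf object-init call (dwarf_object_init_b() in recent
releases; check libdwarf.h for the exact prototype).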
@section dwsec_sectiongroup Section Groups: Split Dwarf, COMDAT groups
A typical executable or shared object is unlikely
to have any section groups, and in that case
what follows is irrelevant and unimportant.
@b COMDAT groups are defined by the Elf ABI and
enable compilers and linkers
@see dwarf_sec_group_sizes
@see dwarf_sec_group_map
If an object file has multiple groups
@e libdwarf will not reveal contents of more
than the single requested group with a given
dwarf_init_path() call.
One must pass in another groupnumber
to another dwarf_init_path(), meaning initialize
a new Dwarf_Debug, to get @e libdwarf to
access that group.
When opening a Dwarf_Debug the following applies:
If DW_GROUPNUMBER_ANY is passed in @e libdwarf will
choose either of DW_GROUPNUMBER_BASE (1) or
DW_GROUPNUMBER_DWO (2) depending on the object
content. If both groups one and two are in the
object @e libdwarf will choose DW_GROUPNUMBER_BASE.
If DW_GROUPNUMBER_BASE is passed in @e libdwarf
will choose it if non-split DWARF is in the object, else
the init call will return DW_DLV_NO_ENTRY.
If DW_GROUPNUMBER_DWO is passed in @e libdwarf
will choose it if .dwo sections are in the object, else
the init call will return DW_DLV_NO_ENTRY.
If a groupnumber greater than two is passed in
@e libdwarf accepts it, whether any sections
corresponding to that groupnumber exist or not.
If the groupnumber is not an actual group
the init call will return DW_DLV_NO_ENTRY.
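The selection rules just listed can be written down as a small
model. This is not @e libdwarf code: object_groups and
model_group_choice are hypothetical, COMDAT groups are assumed
to be numbered contiguously from 3, and the result for
DW_GROUPNUMBER_ANY on an object with neither group one nor group
two is an assumption, since the text above does not cover it.

```c
#include <assert.h>

#define DW_GROUPNUMBER_ANY  0
#define DW_GROUPNUMBER_BASE 1
#define DW_GROUPNUMBER_DWO  2
#define DW_DLV_NO_ENTRY    -1

/* Hypothetical summary of what an object file contains. */
struct object_groups {
    int      has_base;   /* non-split DWARF sections present */
    int      has_dwo;    /* .dwo sections present */
    unsigned max_comdat; /* highest COMDAT group number, 0 if none */
};

/* Returns the group a new Dwarf_Debug would expose,
   or DW_DLV_NO_ENTRY when the init call would find nothing. */
static int
model_group_choice(const struct object_groups *og, unsigned requested)
{
    assert(og != 0);
    switch (requested) {
    case DW_GROUPNUMBER_ANY:
        if (og->has_base) {
            return DW_GROUPNUMBER_BASE; /* base wins when both exist */
        }
        if (og->has_dwo) {
            return DW_GROUPNUMBER_DWO;
        }
        return DW_DLV_NO_ENTRY; /* assumption, see lead-in */
    case DW_GROUPNUMBER_BASE:
        return og->has_base ? DW_GROUPNUMBER_BASE : DW_DLV_NO_ENTRY;
    case DW_GROUPNUMBER_DWO:
        return og->has_dwo ? DW_GROUPNUMBER_DWO : DW_DLV_NO_ENTRY;
    default:
        /* Group numbers above two name COMDAT groups. */
        return requested <= og->max_comdat
            ? (int)requested : DW_DLV_NO_ENTRY;
    }
}
```

Reading several groups from one object file means one such
choice, and one Dwarf_Debug, per group.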
For information on groups "dwarfdump -i"
on an object file will show all section group
information @b unless the object file is a simple
standard object with no .dwo sections and no
COMDAT groups (in which case the output will be
silent on groups). Look for <b> Section Groups
data </b> in the dwarfdump output. The groups
information will be appearing very early in the
dwarfdump output.
Sections that are part of an Elf COMDAT GROUP are
assigned a group number > 2. There can be many
such COMDAT groups in an object file (but none
in an executable or shared object). Each such
COMDAT group will have a small set of sections
in it and each section in such a group will be
assigned the same group number by @e libdwarf.
Sections that are in a .dwp or .dwo object file
are assigned to DW_GROUPNUMBER_DWO.
Sections not part of a .dwp package file,
a .dwo section, or a COMDAT group are assigned
DW_GROUPNUMBER_BASE.
At least one compiler relies on relocations to
identify COMDAT groups, but the compiler authors
do not publicly document how this works so we
ignore such (these COMDAT groups will result in
@e libdwarf returning DW_DLV_ERROR).
Popular compilers and tools are using such
sections. There is no detailed documentation that
we can find (so far) on how the COMDAT section
groups are used, so @e libdwarf is based on
observations of what compilers generate.
@section dwsec_separatedebug Details on separate DWARF object access
There are, at present, three distinct approaches
in use to put DWARF information into separate
objects to significantly shrink the size of
the executable. All of them involve identifying
a separate file.
Split Dwarf is one method. It defines the attribute
@b DW_AT_dwo_name (if present) as having
a file-system appropriate
name of the split object with most of the DWARF.
The second is macOS dSYM. It is a convention of placing
the DWARF-containing object (separate from the
object containing code) in a specific subdirectory
tree.
The third involves GNU debuglink and GNU
debug_id. These are two distinct ways (outside
of DWARF) to provide
names of alternative DWARF-containing objects
elsewhere in a file system.
If one initializes a Dwarf_Debug object with
dwarf_init_path() or dwarf_init_path_dl()
appropriately @e libdwarf will automatically
open the alternate dSYM or
debuglink/debug_id object on the object with
most of the DWARF.
@see https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
@e libdwarf provides means to automatically read
the alternate object (in place of the one named
in the init call) or to suppress that and read
the named object file.
@code
int dwarf_init_path(const char * dw_path,
char * dw_true_path_out_buffer,
unsigned int dw_true_path_bufferlen,
unsigned int dw_groupnumber,
Dwarf_Handler dw_errhand,
Dwarf_Ptr dw_errarg,
Dwarf_Debug* dw_dbg,
Dwarf_Error* dw_error);
int dwarf_init_path_dl(const char *dw_path,
section.
An example of use is in doc/checkexamples.c (see examplev).
<b>Changes 0.9.2 to 0.10.1</b>
Released 01 July 2024
(Release 0.10.0 was missing a CMakeLists.txt file
and is withdrawn).
Added API function
dwarf_get_locdesc_entry_e() to allow dwarfdump
to report some data from .debug_loclists more
completely -- it reports a byte length of each
loclist item. This is of little interest to anyone,
surely. dwarf_get_locdesc_entry_d() is still
what you should be using.
dwarf_debug_addr_table() now supports reading
the DWARF4 GNU extension .debug_addr table.
A heuristic sanity check for PE object files was too conservative
in limiting VirtualSize to 200MB. A library user has
an exe with .debug_info size of over 200MB.
Increased the limit to be 2000MB and changed the names of the
errors for the three heuristic checks to include _HEURISTIC_ so it is
easier to know the kind of error/failure it is.
When doing a shared-library build with cmake we were not emitting
the correct .so version names nor setting SONAME with the
correct version name. This long-standing mistake is now fixed.
<b>Changes 0.9.1 to 0.9.2</b>
Version 0.9.2 released 2 April 2024
Vulnerabilities DW202402-001, DW202402-002, DW202402-003,
and DW202403-001 could crash @e libdwarf given
a carefully corrupted (fuzzed) DWARF object file.
Now the library returns an error
for these corruptions.
DW_CFA_high_user (in dwarf.h) was a misspelling.
Added the correct spelling DW_CFA_hi_user and
a comment on the incorrect spelling.
<b>Changes 0.9.0 to 0.9.1</b>
Version 0.9.1 released 27 January 2024
The abbreviation code type returned by
dwarf_die_abbrev_code() changed from <b>int</b>
to <b>Dwarf_Unsigned</b> as abbrev codes are
not constrained by the DWARF Standard.
The section count returned by dwarf_get_section_count()
is now of type <b>Dwarf_Unsigned</b>. The previous type
of <b>int</b> never made sense in @e libdwarf.
Callers will, in practice, see the same value as before.
All type-warnings issued by MSVC have been fixed.
Problems reading Macho (Apple) relocatable
object files have been fixed.
Each of the build systems available now has an option
which eliminates @e libdwarf references to the
object section decompression libraries.
See the respective READMEs.
<b>Changes 0.8.0 to 0.9.0</b>
Version 0.9.0 released 8 December 2023
Adding functions (rarely needed) for callers
with special requirements.
Added dwarf_get_section_info_by_name_a() and
dwarf_get_section_info_by_index_a() which add
dw_section_flags pointer argument to return
the object section file flags (whose meaning
depends entirely on the object file format),
and dw_section_offset pointer argument to return
the object-relevant offset of the section
(here too the meaning depends on the object format).
Also added dwarf_machine_architecture() which returns
a few top level data items about the object
@e libdwarf has opened, including the 'machine' and 'flags'
from object headers (all supported object types).
This adds new library functions
dwarf_next_cu_header_e()
and dwarf_siblingof_c().
Used exactly as documented dwarf_next_cu_header_d()
and dwarf_siblingof_b() work fine and continue to
be supported for the foreseeable future. However
they would be easy to misuse as the requirement that
dwarf_siblingof_b() be called immediately after
a successful call to dwarf_next_cu_header_d()
was never stated and that dependency was impossible
to enforce. The dependency was an API mistake
made in 1992.
So dwarf_next_cu_header_e() now returns the
compilation-unit DIE as well as header
data and dwarf_siblingof_c() is not needed
except to traverse sibling DIEs.
(the compilation-unit DIE by definition has no siblings).
Changes were required to support Mach-O (Apple)
universal binaries,
which were not readable by earlier versions of the library.
We have new library functions
dwarf_init_path_a(),
dwarf_init_path_dl_a(), and
dwarf_get_universalbinary_count().
The first two allow a caller to specify which
(numbering from zero) object file to
report on by adding a new argument dw_universalnumber.
Passing zero as the dw_universalnumber argument
is always safe.
as a default since there is no dw_universalnumber
argument possible.
For improved performance in reading Fde data
when iterating through all usable pc values
we add dwarf_get_fde_info_for_all_regs3_b(), which
returns the next pc value with actual frame data.
We retain dwarf_get_fde_info_for_all_regs3() so
existing code need not change.
<b>Changes 0.7.0 to 0.8.0</b>
v0.8.0 released 2023-09-20
New functions dwarf_get_fde_info_for_reg3_c(),
dwarf_get_fde_info_for_cfa_reg3_c() are defined.
The advantage of the new versions is they correctly
type the dw_offset argument return value
as Dwarf_Signed instead of the earlier and incorrect type
Dwarf_Unsigned.
The original functions dwarf_get_fde_info_for_reg3_b() and
dwarf_get_fde_info_for_cfa_reg3_b()
continue to exist and work for compatibility with
the previous release.
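Why the signed type matters can be seen in a few lines. The
sketch below does not call @e libdwarf. The typedefs are
stand-ins for the @e libdwarf types of the same names, and the
helper functions and the offset of -8 are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Stand-ins for the libdwarf typedefs of the same names. */
typedef long long          Dwarf_Signed;
typedef unsigned long long Dwarf_Unsigned;

/* What a caller receives when a signed offset is handed back
   through an unsigned variable: the sign is lost. */
static Dwarf_Unsigned
offset_seen_as_unsigned(Dwarf_Signed offset)
{
    return (Dwarf_Unsigned)offset;
}

/* A check a caller might reasonably apply to a returned offset. */
static int
offset_is_negative(Dwarf_Signed offset)
{
    return offset < 0;
}

static void
show_offset_typing(void)
{
    Dwarf_Signed offset = -8; /* e.g. a register saved below the CFA */

    assert(offset_is_negative(offset));
    /* Through the unsigned type the same value looks enormous. */
    assert(offset_seen_as_unsigned(offset) == ULLONG_MAX - 7);
}
```

With the _c versions a caller can test or print the offset
directly, without casting it back to a signed type first.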
For all open() calls for which the O_CLOEXEC flag exists
we now add that flag to the open() call.
Vulnerabilities involving reading
corrupt object files (created by fuzzing)
have been fixed:
DW202308-001 (ossfuzz 59576),
DW202307-001 (ossfuzz 60506),
DW202306-011 (ossfuzz 59950),
DW202306-009 (ossfuzz 59755),
DW202306-006 (ossfuzz 59727),
DW202306-005 (ossfuzz 59717),
DW202306-004 (ossfuzz 59695),
DW202306-002 (ossfuzz 59519),
DW202306-001 (ossfuzz 59597),
DW202305-010 (ossfuzz 59478),
DW202305-009 (ossfuzz 56451),
DW202305-008 (ossfuzz 56451),
DW202305-007 (ossfuzz 56474),
DW202305-006 (ossfuzz 56472),
DW202305-005 (ossfuzz 56462),
DW202305-004 (ossfuzz 56446).
<b>Changes 0.6.0 to 0.7.0</b>
v0.7.0 released 2023-05-20
Elf section counts can exceed 16 bits
(on linux see <b>man 5 elf</b>)
so some function prototype members
of struct <b>Dwarf_Obj_Access_Methods_a_s</b>
changed.
Specifically, om_get_section_info()
om_load_section(), and
om_relocate_a_section()
now pass section indexes as Dwarf_Unsigned
instead of Dwarf_Half.
Without this change executables/objects
with more than 64K sections cannot
be read by @e libdwarf. This is unlikely
to affect your code since for most users
@e libdwarf takes care of this and dwarfdump
is aware of this change.
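The effect of the old 16-bit index type can be shown with plain
arithmetic. The sketch below does not call @e libdwarf. Half16
is a hypothetical 16-bit stand-in for Dwarf_Half, and the
section index 70000 is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned long long Dwarf_Unsigned; /* stand-in */
typedef uint16_t           Half16;         /* hypothetical 16-bit stand-in
                                              for Dwarf_Half */

/* Passes a section index through a 16-bit parameter, as the
   old function prototypes did, and returns what arrives. */
static Dwarf_Unsigned
index_after_16bit_round_trip(Dwarf_Unsigned section_index)
{
    Half16 narrow = (Half16)section_index; /* keeps the low 16 bits */
    return narrow;
}

static void
show_index_truncation(void)
{
    /* The largest index that fits is unchanged. */
    assert(index_after_16bit_round_trip(65535) == 65535);
    /* 70000 - 65536 = 4464: a different, wrong section. */
    assert(index_after_16bit_round_trip(70000) == 4464);
}
```

An index that wraps this way silently names a different
section, which is why the three function pointers now take
Dwarf_Unsigned.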
Two functions have been removed from libdwarf.h
and the library: dwarf_dnames_abbrev_by_code()
and dwarf_dnames_abbrev_form_by_index().
dwarf_dnames_abbrev_by_code() is slow and pointless.
Use either dwarf_dnames_name() or
dwarf_dnames_abbrevtable() instead, depending
on what you want to accomplish.
dwarf_dnames_abbrev_form_by_index() is not needed,
was difficult to call due to argument list
requirements, and never worked.
<b>Changes 0.5.0 to 0.6.0</b>
v0.6.0 released 2023-02-20
The dealloc required by dwarf_offset_list()
was wrong. The call could crash @e libdwarf
on systems with 32bit pointers.
The new and proper dealloc (for all
pointer sizes) is
dwarf_dealloc(dbg,offsetlistptr,DW_DLA_UARRAY);
A memory leak from dwarf_load_loclists()
and dwarf_load_rnglists() is fixed and the
libdwarf-regressiontests error that hid the leak
has also been fixed.
A <b>compatibility</b> change affects callers of
dwarf_dietype_offset(), which on success returns
the offset of the target of the DW_AT_type attribute
(if such exists in the Dwarf_Die). Added a pointer
argument through which the function returns
FALSE when the offset refers to the DWARF4
.debug_types section and TRUE when the offset
refers to the .debug_info section.
If anyone was using this function it would fail
badly (while pretending success)
with a DWARF4 DW_FORM_ref_sig8 on a DW_AT_type
attribute from the Dwarf_Die argument. One will likely
encounter DWARF4 content so a single correct function
seemed necessary. New regression tests will ensure
this will continue to work.
A <b>compatibility</b> change affects callers of
dwarf_get_pubtypes(). If an application reads
.debug_pubtypes there is a <b>compatibility
break</b>. Such applications must be recompiled
with latest @e libdwarf, change Dwarf_Type
declarations to use Dwarf_Global, and can only