MFV r323105 (partial): 8300 fix man page issues found by mandoc 1.14.1
illumos/illumos-gate@72d3dbb9ab
https://www.illumos.org/issues/8300
Prior to integrating the mdocml update to 1.14.1, fix issues found by
new version, especially the "new sentence, new line" style rule.
FreeBSD note: this revision merges only the changes to the CTF manual
page. The changes to the ZFS pages cannot be applied directly.
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
Discussed with: avg
MFC after: 2 weeks
parent 6b028c8c22
commit 016faf2a2a
@@ -39,7 +39,8 @@ data contained in each file has information about the layout and
 sizes of C types, including intrinsic types, enumerations, structures,
 typedefs, and unions, that are used by the corresponding
 .Sy ELF
-object. The
+object.
+The
 .Nm
 data may also include information about the types of global objects and
 the return type and arguments of functions in the symbol table.
@@ -53,11 +54,11 @@ file itself, it may also be referred to as a
 .Lp
 On illumos systems,
 .Nm
-data is consumed by multiple programs. It can be used by the modular
-debugger,
+data is consumed by multiple programs.
+It can be used by the modular debugger,
 .Xr mdb 1 ,
 as well as by
-.Xr dtrace 1M .
+.Xr dtrace 1 .
 Programmatic access to
 .Nm
 data can be obtained through
@@ -65,8 +66,8 @@ data can be obtained through
 .Lp
 The
 .Nm
-file format is broken down into seven different sections. The first
-section is the
+file format is broken down into seven different sections.
+The first section is the
 .Sy preamble
 and
 .Sy header ,
@@ -74,18 +75,22 @@ which describes the version of the
 .Nm
 file, links it has to other
 .Nm
-files, and the sizes of the other sections. The next section is the
+files, and the sizes of the other sections.
+The next section is the
 .Sy label
 section,
 which provides a way of identifying similar groups of
 .Nm
-data across multiple files. This is followed by the
+data across multiple files.
+This is followed by the
 .Sy object
 information section, which describes the type of global
-symbols. The subsequent section is the
+symbols.
+The subsequent section is the
 .Sy function
 information section, which describes the return
-types and arguments of functions. The next section is the
+types and arguments of functions.
+The next section is the
 .Sy type
 information section, which describes
 the format and layout of the C types themselves, and finally the last
@@ -106,29 +111,33 @@ A
 file may contain all of the type information that it requires, or it
 may optionally refer to another
 .Nm
-file which holds the remaining types. When a
+file which holds the remaining types.
+When a
 .Nm
 file refers to another file, it is called the
 .Sy child
 and the file it refers to is called the
 .Sy parent .
-A given file may only refer to one parent. This process is called
+A given file may only refer to one parent.
+This process is called
 .Em uniquification
 because it ensures each child only has type information that is
-unique to it. A common example of this is that most kernel modules in
-illumos are uniquified against the kernel module
+unique to it.
+A common example of this is that most kernel modules in illumos are uniquified
+against the kernel module
 .Sy genunix
 and the type information that comes from the
 .Sy IP
-module. This means that a module only has types that are unique to
-itself and the most common types in the kernel are not duplicated.
+module.
+This means that a module only has types that are unique to itself and the most
+common types in the kernel are not duplicated.
 .Sh FILE FORMAT
 This documents version
 .Em two
 of the
 .Nm
-file format. All applications and tools currently produce and operate on
-this version.
+file format.
+All applications and tools currently produce and operate on this version.
 .Lp
 The file format can be summarized with the following image, the
 following sections will cover this in more detail.
@@ -235,26 +244,31 @@ This
 .Sy preamble
 defines the version of the
 .Nm
-file which defines the format of the rest of the header. While the
-header may change in subsequent versions, the preamble will not change
+file which defines the format of the rest of the header.
+While the header may change in subsequent versions, the preamble will not change
 across versions, though the interpretation of its flags may change from
-version to version. The
+version to version.
+The
 .Em ctp_magic
 member defines the magic number for the
 .Nm
-file format. This must always be
+file format.
+This must always be
 .Li 0xcff1 .
 If another value is encountered, then the file should not be treated as
 a
 .Nm
-file. The
+file.
+The
 .Em ctp_version
 member defines the version of the
 .Nm
-file. The current version is
+file.
+The current version is
 .Li 2 .
-It is possible to encounter an unsupported version. In that case,
-software should not try to parse the format, as it may have changed.
+It is possible to encounter an unsupported version.
+In that case, software should not try to parse the format, as it may have
+changed.
 Finally, the
 .Em ctp_flags
 member describes aspects of the file which modify its interpretation.
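The preamble checks described in the hunk above (magic number 0xcff1, version 2, refuse anything else) can be sketched in C roughly as follows. This is only an illustration, not code from the commit: the structure layout (a 16-bit magic followed by two 8-bit members), the `sketch_` names, and the return codes are assumptions, and byte-order handling is left out.

```c
#include <stdint.h>

#define SKETCH_CTF_MAGIC   0xcff1 /* magic number given in the text */
#define SKETCH_CTF_VERSION 2      /* current version given in the text */

/* Assumed layout: a 16-bit magic followed by two 8-bit members. */
typedef struct sketch_ctf_preamble {
	uint16_t ctp_magic;
	uint8_t ctp_version;
	uint8_t ctp_flags;
} sketch_ctf_preamble_t;

/*
 * Returns 0 if the preamble looks usable, -1 if the magic number is wrong
 * (the file should not be treated as CTF), and -2 if the version is one
 * that this sketch does not know how to parse.
 */
int
sketch_ctf_check_preamble(const sketch_ctf_preamble_t *pp)
{
	if (pp->ctp_magic != SKETCH_CTF_MAGIC)
		return (-1);
	if (pp->ctp_version != SKETCH_CTF_VERSION)
		return (-2);
	return (0);
}
```

A consumer would call such a check before looking at any other part of the header, since the text warns that the rest of the format may differ between versions.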
@@ -273,9 +287,10 @@ has been compressed through the
 .Sy zlib
 library and its
 .Sy deflate
-algorithm. If this flag is not present, then the body has not been
-compressed and no special action is needed to interpret it. All offsets
-into the data as described by
+algorithm.
+If this flag is not present, then the body has not been compressed and no
+special action is needed to interpret it.
+All offsets into the data as described by
 .Sy header ,
 always refer to the
 .Sy uncompressed
@@ -289,8 +304,8 @@ denotes whether whether or not this
 .Nm
 file is the child of another
 .Nm
-file and also indicates the size of the remaining sections. The
-structure for the
+file and also indicates the size of the remaining sections.
+The structure for the
 .Sy header ,
 logically contains a copy of the
 .Sy preamble
@@ -315,37 +330,40 @@ the next two members
 .Em cth_parlablel
 and
 .Em cth_parname ,
-are used to identify the parent. The value of both members are offsets
-into the
+are used to identify the parent.
+The value of both members are offsets into the
 .Sy string
-section which point to the start of a null-terminated string. For more
-information on the encoding of strings, see the subsection on
+section which point to the start of a null-terminated string.
+For more information on the encoding of strings, see the subsection on
 .Sx String Identifiers .
 If the value of either is zero, then there is no entry for that
-member. If the member
+member.
+If the member
 .Em cth_parlabel
 is set, then the
 .Em ctf_parname
 member must be set, otherwise it will not be possible to find the
-parent. If
+parent.
+If
 .Em ctf_parname
 is set, it is not necessary to define
 .Em cth_parlabel ,
-as the parent may not have a label. For more information on labels
-and their interpretation, see
+as the parent may not have a label.
+For more information on labels and their interpretation, see
 .Sx The Label Section .
 .Lp
 The remaining members (excepting
 .Em cth_strlen )
-describe the beginning of the corresponding sections. These offsets are
-relative to the end of the
+describe the beginning of the corresponding sections.
+These offsets are relative to the end of the
 .Sy header .
 Therefore, something with an offset of 0 is at an offset of thirty-six
 bytes relative to the start of the
 .Nm
-file. The difference between members
-indicates the size of the section itself. Different offsets have
-different alignment requirements. The start of the
+file.
+The difference between members indicates the size of the section itself.
+Different offsets have different alignment requirements.
+The start of the
 .Em cth_objotoff
 and
 .Em cth_funcoff
@@ -353,13 +371,14 @@ must be two byte aligned, while the sections
 .Em cth_lbloff
 and
 .Em cth_typeoff
-must be four-byte aligned. The section
+must be four-byte aligned.
+The section
 .Em cth_stroff
-has no alignment requirements. To calculate the size of a given section,
-excepting the
+has no alignment requirements.
+To calculate the size of a given section, excepting the
 .Sy string
-section, one should subtract the offset of the section from the following one. For
-example, the size of the
+section, one should subtract the offset of the section from the following one.
+For example, the size of the
 .Sy types
 section can be calculated by subtracting
 .Em cth_stroff
@@ -368,8 +387,8 @@ from
 .Lp
 Finally, the member
 .Em cth_strlen
-describes the length of the string section itself. From it, you can also
-calculate the size of the entire
+describes the length of the string section itself.
+From it, you can also calculate the size of the entire
 .Nm
 file by adding together the size of the
 .Sy ctf_header_t ,
@@ -380,9 +399,11 @@ and the size of the string section in
 .Ss Type Identifiers
 Through the
 .Nm ctf
-data, types are referred to by identifiers. A given
+data, types are referred to by identifiers.
+A given
 .Nm
-file supports up to 32767 (0x7fff) types. The first valid type identifier is 0x1.
+file supports up to 32767 (0x7fff) types.
+The first valid type identifier is 0x1.
 When a given
 .Nm
 file is a child, indicated by a non-zero entry for the
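As a rough illustration of the identifier ranges in the hunk above, the following C sketch splits a 16-bit type identifier into a "belongs to a child" flag and a 15-bit index. It assumes that child identifiers occupy the range starting at 0x8000, which is what a later hunk of this diff states ("0x1, or 0x8000 if a child"). The `sketch_` names are hypothetical and are not taken from the manual page or from any header.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TYPE_INDEX_MASK 0x7fff /* up to 32767 types per file */
#define SKETCH_TYPE_CHILD_BIT  0x8000 /* assumed marker for child types */

/* True if the identifier falls in the range used by a child file. */
bool
sketch_type_is_child(uint16_t id)
{
	return ((id & SKETCH_TYPE_CHILD_BIT) != 0);
}

/* The position of the type within the file that defines it. */
uint16_t
sketch_type_index(uint16_t id)
{
	return ((uint16_t)(id & SKETCH_TYPE_INDEX_MASK));
}
```

With this split, an identifier without the high bit set would be looked up in the parent, and one with it set in the child itself.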
@@ -403,18 +424,20 @@ Other consumers of
 information may use larger or opaque identifiers.
 .Ss String Identifiers
 String identifiers are always encoded as four byte unsigned integers
-which are an offset into a string table. The
+which are an offset into a string table.
+The
 .Nm
 format supports two different string tables which have an identifier of
-zero or one. This identifier is stored in the high-order bit of the
-unsigned four byte offset. Therefore, the maximum supported offset into
-one of these tables is 0x7ffffffff.
+zero or one.
+This identifier is stored in the high-order bit of the unsigned four byte
+offset.
+Therefore, the maximum supported offset into one of these tables is 0x7ffffffff.
 .Lp
 Table identifier zero, always refers to the
 .Sy string
-section in the CTF file itself. String table identifier one refers to an
-external string table which is the ELF string table for the ELF symbol
-table associated with the
+section in the CTF file itself.
+String table identifier one refers to an external string table which is the ELF
+string table for the ELF symbol table associated with the
 .Nm
 container.
 .Ss Type Encoding
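The String Identifiers text in the hunk above packs a table number into the high-order bit of a four byte value, which leaves the low 31 bits for the offset. A minimal C sketch of that split might look like this; the function names are invented for the example.

```c
#include <stdint.h>

/* Table 0 is the CTF string section, table 1 the external ELF string table. */
uint32_t
sketch_strid_table(uint32_t id)
{
	return (id >> 31);
}

/* Offset into the selected string table (the low 31 bits). */
uint32_t
sketch_strid_offset(uint32_t id)
{
	return (id & 0x7fffffffu);
}
```

For instance, the value 0x80000010 would select table one at offset 0x10, while 0x00000010 would select the file's own string section at the same offset.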
@@ -434,8 +457,8 @@ The length of the variable data
 .Lp
 The 16 bits that make up the encoding are broken down such that you have
 five bits for the kind, one bit for indicating whether or not it is a
-root type, and 10 bits for the variable length. This is laid out as
-follows:
+root type, and 10 bits for the variable length.
+This is laid out as follows:
 .Bd -literal -offset indent
 +--------------------+
 | kind | root | vlen |
@@ -443,12 +466,13 @@ follows:
 15 11 10 9 0
 .Ed
 .Lp
-The current version of the file format defines 14 different kinds. The
-interpretation of these different kinds will be discussed in the section
+The current version of the file format defines 14 different kinds.
+The interpretation of these different kinds will be discussed in the section
 .Sx The Type Section .
 If a kind is encountered that is not listed below, then it is not a valid
 .Nm
-file. The kinds are defined as follows:
+file.
+The kinds are defined as follows:
 .Bd -literal -offset indent
 #define CTF_K_UNKNOWN 0
 #define CTF_K_INTEGER 1
@@ -467,14 +491,16 @@ file. The kinds are defined as follows:
 .Ed
 .Lp
 Programs directly reference many types; however, other types are referenced
-indirectly because they are part of some other structure. These types that are
-referenced directly and used are called
+indirectly because they are part of some other structure.
+These types that are referenced directly and used are called
 .Sy root
-types. Other types may be used indirectly, for example, a program may reference
-a structure directly, but not one of its members which has a type. That type is
-not considered a
+types.
+Other types may be used indirectly, for example, a program may reference
+a structure directly, but not one of its members which has a type.
+That type is not considered a
 .Sy root
-type. If a type is a
+type.
+If a type is a
 .Sy root
 type, then it will have bit 10 set.
 .Lp
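The 16-bit encoding described in the preceding hunks (kind in bits 15 to 11, the root flag in bit 10, and the variable length in bits 9 to 0) can be packed and unpacked with a few shifts and masks. The C sketch below is an illustration under that bit layout; the `sketch_` helpers are hypothetical names and are not claimed to match the macros in any system header.

```c
#include <stdint.h>

/* Kind: the five high-order bits, 15 through 11. */
unsigned
sketch_info_kind(uint16_t info)
{
	return ((info >> 11) & 0x1f);
}

/* Root flag: bit 10. */
unsigned
sketch_info_isroot(uint16_t info)
{
	return ((info >> 10) & 0x1);
}

/* Variable length: the ten low-order bits, 9 through 0. */
unsigned
sketch_info_vlen(uint16_t info)
{
	return (info & 0x3ff);
}

/* Pack the three pieces back into one 16-bit encoding. */
uint16_t
sketch_type_info(unsigned kind, unsigned isroot, unsigned vlen)
{
	return ((uint16_t)(((kind & 0x1f) << 11) |
	    ((isroot ? 1u : 0u) << 10) | (vlen & 0x3ff)));
}
```

As a worked case, a root type of kind 1 (CTF_K_INTEGER) with a variable length of zero packs to 0x0c00: 1 shifted left by 11 is 0x0800, and the root bit adds 0x0400.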
@@ -499,16 +525,17 @@ When consuming
 .Nm
 data, it is often useful to know whether two different
 .Nm
-containers come from the same source base and version. For example, when
-building illumos, there are many kernel modules that are built against a
-single collection of source code. A label is encoded into the
+containers come from the same source base and version.
+For example, when building illumos, there are many kernel modules that are built
+against a single collection of source code.
+A label is encoded into the
 .Nm
-files that corresponds with the particular build. This ensures that if
-files on the system were to become mixed up from multiple releases, that
-they are not used together by tools, particularly when a child needs to
-refer to a type in the parent. Because they are linked used the type
-identifiers, if the wrong parent is used then the wrong type will be
-encountered.
+files that corresponds with the particular build.
+This ensures that if files on the system were to become mixed up from multiple
+releases, that they are not used together by tools, particularly when a child
+needs to refer to a type in the parent.
+Because they are linked used the type identifiers, if the wrong parent is used
+then the wrong type will be encountered.
 .Lp
 Each label is encoded in the file format using the following eight byte
 structure:
@@ -530,21 +557,22 @@ section.
 The type identifier encoded in the member
 .Em ctl_typeidx
 refers to the last type identifier that a label refers to in the current
-file. Labels only refer to types in the current file, if the
+file.
+Labels only refer to types in the current file, if the
 .Nm
 file is a child, then it will have the same label as its parent;
 however, its label will only refer to its types, not its parents.
 .Lp
 It is also possible, though rather uncommon, for a
 .Nm
-file to have multiple labels. Labels are placed one after another, every
-eight bytes. When multiple labels are present, types may only belong to
-a single label.
+file to have multiple labels.
+Labels are placed one after another, every eight bytes.
+When multiple labels are present, types may only belong to a single label.
 .Ss The Object Section
 The object section provides a mapping from ELF symbols of type
 .Sy STT_OBJECT
-in the symbol table to a type identifier. Every entry in this section is
-a
+in the symbol table to a type identifier.
+Every entry in this section is a
 .Sy uint16_t
 which contains a type identifier as described in the section
 .Sx Type Identifiers .
@@ -555,9 +583,10 @@ To walk the object section, you need to have a corresponding
 .Sy symbol table
 in the ELF object that contains the
 .Nm
-data. Not every object is included in this section. Specifically, when
-walking the symbol table. An entry is skipped if it matches any of the
-following conditions:
+data.
+Not every object is included in this section.
+Specifically, when walking the symbol table.
+An entry is skipped if it matches any of the following conditions:
 .Lp
 .Bl -bullet -offset indent -compact
 .It
@@ -628,40 +657,45 @@ walk_symbols(uint16_t *objtoff, Elf_Data *symdata, Elf_Data *strdata,
 The function section of the
 .Nm
 file encodes the types of both the function's arguments and the function's
-return type. Similar to
+return type.
+Similar to
 .Sx The Object Section ,
 the function section encodes information for all symbols of type
 .Sy STT_FUNCTION ,
-excepting those that fit specific criteria. Unlike with objects, because
-functions have a variable number of arguments, they start with a type encoding
-as defined in
+excepting those that fit specific criteria.
+Unlike with objects, because functions have a variable number of arguments, they
+start with a type encoding as defined in
 .Sx Type Encoding ,
 which is the size of a
 .Sy uint16_t .
 For functions which have no type information available, they are encoded as
 .Li CTF_TYPE_INFO(CTF_K_UNKNOWN, 0, 0) .
-Functions with arguments are encoded differently. Here, the variable length is
-turned into the number of arguments in the function. If a function is a
+Functions with arguments are encoded differently.
+Here, the variable length is turned into the number of arguments in the
+function.
+If a function is a
 .Sy varargs
-type function, then the number of arguments is increased by one. Functions with
-type information are encoded as:
+type function, then the number of arguments is increased by one.
+Functions with type information are encoded as:
 .Li CTF_TYPE_INFO(CTF_K_FUNCTION, 0, nargs) .
 .Lp
 For functions that have no type information, nothing else is encoded, and the
-next function is encoded. For functions with type information, the next
+next function is encoded.
+For functions with type information, the next
 .Sy uint16_t
-is encoded with the type identifier of the return type of the function. It is
-followed by each of the type identifiers of the arguments, if any exist, in the
-order that they appear in the function. Therefore, argument 0 is the first type
-identifier and so on. When a function has a final varargs argument, that is
-encoded with the type identifier of zero.
+is encoded with the type identifier of the return type of the function.
+It is followed by each of the type identifiers of the arguments, if any exist,
+in the order that they appear in the function.
+Therefore, argument 0 is the first type identifier and so on.
+When a function has a final varargs argument, that is encoded with the type
+identifier of zero.
 .Lp
 Like
 .Sx The Object Section ,
-the function section is encoded in the order of the symbol table. It has
-similar, but slightly different considerations from objects. While iterating the
-symbol table, if any of the following conditions are true, then the entry is
-skipped and no corresponding entry is written:
+the function section is encoded in the order of the symbol table.
+It has similar, but slightly different considerations from objects.
+While iterating the symbol table, if any of the following conditions are true,
+then the entry is skipped and no corresponding entry is written:
 .Lp
 .Bl -bullet -offset indent -compact
 .It
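The function-section layout described in the hunk above is variable length: one 16-bit encoding, and then, when type information is present, a return type identifier followed by one identifier per argument. The C sketch below computes how many 16-bit words a single entry occupies so that a reader can step to the next one. It is a simplified illustration with a hypothetical function name; it does not verify that the kind is CTF_K_FUNCTION, because that constant's value is not shown in this diff.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_K_UNKNOWN 0 /* CTF_K_UNKNOWN as listed in the kinds table */

/*
 * Returns the number of uint16_t words used by the function entry that
 * starts at p[0], or 0 if the entry would run past the nwords available.
 */
size_t
sketch_func_entry_words(const uint16_t *p, size_t nwords)
{
	unsigned kind, nargs;
	size_t need;

	if (nwords < 1)
		return (0);

	kind = (p[0] >> 11) & 0x1f; /* bits 15-11 */
	nargs = p[0] & 0x3ff;       /* bits 9-0, includes a varargs slot */

	/* No type information: only the encoding word itself is present. */
	if (kind == SKETCH_K_UNKNOWN)
		return (1);

	/* Encoding word, return type, then one identifier per argument. */
	need = 2 + (size_t)nargs;
	return (need <= nwords ? need : 0);
}
```

For a function such as `int f(char *, ...)`, the count would be two (one named argument plus the varargs slot), so the entry would span four words, the last of which is the zero identifier that marks the variable argument.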
@@ -683,10 +717,11 @@ ELF.
 .Ss The Type Section
 The type section is the heart of the
 .Nm
-data. It encodes all of the information about the types themselves. The base of
-the type information comes in two forms, a short form and a long form, each of
-which may be followed by a variable number of arguments. The following
-definitions describe the short and long forms:
+data.
+It encodes all of the information about the types themselves.
+The base of the type information comes in two forms, a short form and a long
+form, each of which may be followed by a variable number of arguments.
+The following definitions describe the short and long forms:
 .Bd -literal
 #define CTF_MAX_SIZE 0xfffe /* max size of a type in bytes */
 #define CTF_LSIZE_SENT 0xffff /* sentinel for ctt_size */
@@ -720,14 +755,17 @@ Type sizes are stored in
 .Sy bytes .
 The basic small form uses a
 .Sy ushort_t
-to store the number of bytes. If the number of bytes in a structure would exceed
-0xfffe, then the alternate form, the
+to store the number of bytes.
+If the number of bytes in a structure would exceed 0xfffe, then the alternate
+form, the
 .Sy ctf_type_t ,
-is used instead. To indicate that the larger form is being used, the member
+is used instead.
+To indicate that the larger form is being used, the member
 .Em ctt_size
 is set to value of
 .Sy CTF_LSIZE_SENT
-(0xffff). In general, when going through the type section, consumers use the
+(0xffff).
+In general, when going through the type section, consumers use the
 .Sy ctf_type_t
 structure, but pay attention to the value of the member
 .Em ctt_size
@@ -739,17 +777,21 @@ Not all kinds of types use
 .Sy ctt_size .
 Those which do not, will always use the
 .Sy ctf_stype_t
-structure. The individual sections for each kind have more information.
+structure.
+The individual sections for each kind have more information.
 .Lp
-Types are written out in order. Therefore the first entry encountered has a type
-id of 0x1, or 0x8000 if a child. The member
+Types are written out in order.
+Therefore the first entry encountered has a type id of 0x1, or 0x8000 if a
+child.
+The member
 .Em ctt_name
 is encoded as described in the section
 .Sx String Identifiers .
-The string that it points to is the name of the type. If the identifier points
-to an empty string (one that consists solely of a null terminator) then the type
-does not have a name, this is common with anonymous structures and unions that
-only have a typedef to name them, as well as, pointers and qualifiers.
+The string that it points to is the name of the type.
+If the identifier points to an empty string (one that consists solely of a null
+terminator) then the type does not have a name, this is common with anonymous
+structures and unions that only have a typedef to name them, as well as,
+pointers and qualifiers.
 .Lp
 The next member, the
 .Em ctt_info ,
@@ -757,18 +799,21 @@ is encoded as described in the section
 .Sx Type Encoding .
 The types kind tells us how to interpret the remaining data in the
 .Sy ctf_type_t
-and any variable length data that may exist. The rest of this section will be
-broken down into the interpretation of the various kinds.
+and any variable length data that may exist.
+The rest of this section will be broken down into the interpretation of the
+various kinds.
 .Ss Encoding of Integers
 Integers, which are of type
 .Sy CTF_K_INTEGER ,
-have no variable length arguments. Instead, they are followed by a four byte
+have no variable length arguments.
+Instead, they are followed by a four byte
 .Sy uint_t
-which describes their encoding. All integers must be encoded with a variable
-length of zero. The
+which describes their encoding.
+All integers must be encoded with a variable length of zero.
+The
 .Em ctt_size
-member describes the length of the integer in bytes. In general, integer sizes
-will be rounded up to the closest power of two.
+member describes the length of the integer in bytes.
+In general, integer sizes will be rounded up to the closest power of two.
 .Lp
 The integer encoding contains three different pieces of information:
 .Bl -bullet -offset indent -compact
@@ -804,33 +849,37 @@ The following flags are defined for the encoding at this time:
 .Lp
 By default, an integer is considered to be unsigned, unless it has the
 .Sy CTF_INT_SIGNED
-flag set. If the flag
+flag set.
+If the flag
 .Sy CTF_INT_CHAR
 is set, that indicates that the integer is of a type that stores character
 data, for example the intrinsic C type
 .Sy char
 would have the
 .Sy CTF_INT_CHAR
-flag set. If the flag
+flag set.
+If the flag
 .Sy CTF_INT_BOOL
-is set, that indicates that the integer represents a boolean type. For example,
-the intrinsic C type
+is set, that indicates that the integer represents a boolean type.
+For example, the intrinsic C type
 .Sy _Bool
 would have the
 .Sy CTF_INT_BOOL
-flag set. Finally, the flag
+flag set.
+Finally, the flag
 .Sy CTF_INT_VARARGS
 indicates that the integer is used as part of a variable number of arguments.
 This encoding is rather uncommon.
 .Ss Encoding of Floats
 Floats, which are of type
 .Sy CTF_K_FLOAT ,
-are similar to their integer counterparts. They have no variable length
-arguments and are followed by a four byte encoding which describes the kind of
-float that exists. The
+are similar to their integer counterparts.
+They have no variable length arguments and are followed by a four byte encoding
+which describes the kind of float that exists.
+The
 .Em ctt_size
-member is the size, in bytes, of the float. The float encoding has three
-different pieces of information inside of it:
+member is the size, in bytes, of the float.
+The float encoding has three different pieces of information inside of it:
 .Lp
 .Bl -bullet -offset indent -compact
 .It
@@ -856,10 +905,11 @@ This encoding can be expressed through the following macros:
 .Ed
 .Lp
 Where as the encoding for integers was a series of flags, the encoding for
-floats maps to a specific kind of float. It is not a flag-based value. The kinds of floats
-correspond to both their size, and the encoding. This covers all of the basic C
-intrinsic floating point types. The following are the different kinds of floats
-represented in the encoding:
+floats maps to a specific kind of float.
+It is not a flag-based value.
+The kinds of floats correspond to both their size, and the encoding.
+This covers all of the basic C intrinsic floating point types.
+The following are the different kinds of floats represented in the encoding:
 .Bd -literal -offset indent
 #define CTF_FP_SINGLE 1 /* IEEE 32-bit float encoding */
 #define CTF_FP_DOUBLE 2 /* IEEE 64-bit float encoding */
@@ -877,12 +927,14 @@ represented in the encoding:
 .Ss Encoding of Arrays
 Arrays, which are of type
 .Sy CTF_K_ARRAY ,
-have no variable length arguments. They are followed by a structure which
-describes the number of elements in the array, the type identifier of the
-elements in the array, and the type identifier of the index of the array. With
-arrays, the
+have no variable length arguments.
+They are followed by a structure which describes the number of elements in the
+array, the type identifier of the elements in the array, and the type identifier
+of the index of the array.
+With arrays, the
 .Em ctt_size
-member is set to zero. The structure that follows an array is defined as:
+member is set to zero.
+The structure that follows an array is defined as:
 .Bd -literal
 typedef struct ctf_array {
 ushort_t cta_contents; /* reference to type of array contents */
@@ -901,14 +953,15 @@ are type identifiers which are encoded as per the section
 .Sx Type Identifiers .
 The member
 .Em cta_nelems
-is a simple four byte unsigned count of the number of elements. This count may
-be zero when encountering C99's flexible array members.
+is a simple four byte unsigned count of the number of elements.
+This count may be zero when encountering C99's flexible array members.
 .Ss Encoding of Functions
 Function types, which are of type
 .Sy CTF_K_FUNCTION ,
-use the variable length list to be the number of arguments in the function. When
-the function has a final member which is a varargs, then the argument count is
-incremented by one to account for the variable argument. Here, the
+use the variable length list to be the number of arguments in the function.
+When the function has a final member which is a varargs, then the argument count
+is incremented by one to account for the variable argument.
+Here, the
 .Em ctt_type
 member is encoded with the type identifier of the return type of the function.
 Note that the
@ -916,31 +969,36 @@ Note that the
member is not used here.
.Lp
The variable argument list contains the type identifiers for the arguments of
the function, if any. Each one is represented by a
the function, if any.
Each one is represented by a
.Sy uint16_t
and encoded according to the
.Sx Type Identifiers
section. If the function's last argument is of type varargs, then it is also
written out, but the type identifier is zero. This is included in the count of
the function's arguments.
section.
If the function's last argument is of type varargs, then it is also written out,
but the type identifier is zero.
This is included in the count of the function's arguments.
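The varargs rule described above can be illustrated with a minimal C sketch.
The helper names func_is_variadic and func_named_args are hypothetical and are
not part of any CTF library; the sketch assumes the caller has already located
the list of uint16_t type identifiers and the argument count taken from the
variable length field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether a function's argument list ends in
 * the varargs marker.  'args' holds the 'nargs' type identifiers that
 * follow the function's type entry; 'nargs' already counts the marker.
 */
static bool
func_is_variadic(const uint16_t *args, size_t nargs)
{
	assert(args != NULL || nargs == 0);
	/* A trailing type identifier of zero stands for "...". */
	return (nargs > 0 && args[nargs - 1] == 0);
}

/* Hypothetical helper: number of named (non-varargs) arguments. */
static size_t
func_named_args(const uint16_t *args, size_t nargs)
{
	return (func_is_variadic(args, nargs) ? nargs - 1 : nargs);
}
```

For a function declared like printf, with one named argument followed by
varargs, the list would hold two entries and the second would be zero, so
func_named_args reports one.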
.Ss Encoding of Structures and Unions
Structures and Unions, which are encoded with
.Sy CTF_K_STRUCT
and
.Sy CTF_K_UNION
respectively, are very similar constructs in C. The main difference
between them is the fact that every member of a structure follows one another,
whereas in a union, all members share the same memory. They are also very
similar in terms of their encoding in
respectively, are very similar constructs in C.
The main difference between them is the fact that every member of a structure
follows one another, whereas in a union, all members share the same memory.
They are also very similar in terms of their encoding in
.Nm .
The variable length argument for structures and unions represents the number of
members that they have. The value of the member
members that they have.
The value of the member
.Em ctt_size
is the size of the structure and union. There are two different structures which
are used to encode members in the variable list. When the size of a structure or
union is greater than or equal to the large member threshold, 8192, then a
different structure is used to encode the member, all members are encoded using
the same structure. The structure for members is as follows:
is the size of the structure and union.
There are two different structures which are used to encode members in the
variable list.
When the size of a structure or union is greater than or equal to the large
member threshold, 8192, then a different structure is used to encode the member,
all members are encoded using the same structure.
The structure for members is as follows:
.Bd -literal
typedef struct ctf_member {
	uint_t ctm_name; /* reference to name in string table */
@ -961,44 +1019,52 @@ Both the
.Em ctm_name
and
.Em ctlm_name
refer to the name of the member. The name is encoded as an offset into the
string table as described by the section
refer to the name of the member.
The name is encoded as an offset into the string table as described by the
section
.Sx String Identifiers .
The members
.Sy ctm_type
and
.Sy ctlm_type
both refer to the type of the member. They are encoded as per the section
both refer to the type of the member.
They are encoded as per the section
.Sx Type Identifiers .
.Lp
The last piece of information that is present is the offset which describes the
offset in memory that the member begins at. For unions, this value will always
be zero because the start of unions in memory is always zero. For structures,
this is the offset in
offset in memory that the member begins at.
For unions, this value will always be zero because the start of unions in memory
is always zero.
For structures, this is the offset in
.Sy bits
that the member begins at. Note that a compiler may lay out a type with padding.
that the member begins at.
Note that a compiler may lay out a type with padding.
This means that the difference in offset between two consecutive members may be
larger than the size of the member. When the size of the overall structure is
strictly less than 8192 bytes, the normal structure,
larger than the size of the member.
When the size of the overall structure is strictly less than 8192 bytes, the
normal structure,
.Sy ctf_member_t ,
is used and the offset in bits is stored in the member
.Em ctm_offset .
However, when the size of the structure is greater than or equal to 8192 bytes,
then the number of bits is split into two 32-bit quantities. One member,
then the number of bits is split into two 32-bit quantities.
One member,
.Em ctlm_offsethi ,
represents the upper 32 bits of the offset, while the other member,
.Em ctlm_offsetlo ,
represents the lower 32 bits of the offset. These can be joined together to get
a 64-bit sized offset in bits by shifting the member
represents the lower 32 bits of the offset.
These can be joined together to get a 64-bit sized offset in bits by shifting
the member
.Em ctlm_offsethi
to the left by thirty two and then doing a binary or of
.Em ctlm_offsetlo .
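The threshold test and the shift-and-or described above can be sketched in C as
follows.
The macro LARGE_MEMBER_THRESH and the helpers uses_large_members,
lmember_offset, and lmember_split are hypothetical names chosen for
illustration; only the value 8192 and the two 32-bit halves come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Large member threshold in bytes, as stated in the text (assumed name). */
#define	LARGE_MEMBER_THRESH	8192

/* Hypothetical helper: is the large member structure used at this size? */
static bool
uses_large_members(uint64_t size)
{
	return (size >= LARGE_MEMBER_THRESH);
}

/* Hypothetical helper: join the two 32-bit halves into a 64-bit bit offset. */
static uint64_t
lmember_offset(uint32_t offsethi, uint32_t offsetlo)
{
	/* Widen before shifting so the upper half is not lost. */
	return (((uint64_t)offsethi << 32) | offsetlo);
}

/* Hypothetical helper: the inverse operation, as a producer might do it. */
static void
lmember_split(uint64_t bits, uint32_t *offsethi, uint32_t *offsetlo)
{
	assert(offsethi != NULL && offsetlo != NULL);
	*offsethi = (uint32_t)(bits >> 32);
	*offsetlo = (uint32_t)(bits & 0xffffffffU);
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit value
left by thirty two is undefined in C.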
.Ss Encoding of Enumerations
Enumerations, noted by the type
.Sy CTF_K_ENUM ,
are similar to structures. Enumerations use the variable list to note the number
of values that the enumeration contains, which we'll term enumerators. In C, an
enumeration is always equivalent to the intrinsic type
are similar to structures.
Enumerations use the variable list to note the number of values that the
enumeration contains, which we'll term enumerators.
In C, an enumeration is always equivalent to the intrinsic type
.Sy int ,
thus the value of the member
.Em ctt_size
@ -1032,25 +1098,27 @@ Forward references, types of kind
.Sy CTF_K_FORWARD ,
in a
.Nm
file refer to types which may not have a definition at all, only a name. If
the
file refer to types which may not have a definition at all, only a name.
If the
.Nm
file is a child, then it may be that the forward is resolved to an
actual type in the parent, otherwise the definition may be in another
.Nm
container or may not be known at all. The only member of the
container or may not be known at all.
The only member of the
.Sy ctf_type_t
that matters for a forward declaration is the
.Em ctt_name
which points to the name of the forward reference in the string table as
described earlier. There is no other information recorded for forward
references.
described earlier.
There is no other information recorded for forward references.
.Ss Encoding of Pointers, Typedefs, Volatile, Const, and Restrict
Pointers, typedefs, volatile, const, and restrict are all similar in
.Nm .
They all refer to another type. In the case of typedefs, they provide an
alternate name, while volatile, const, and restrict change how the type is
interpreted in the C programming language. This covers the
They all refer to another type.
In the case of typedefs, they provide an alternate name, while volatile, const,
and restrict change how the type is interpreted in the C programming language.
This covers the
.Nm
kinds
.Sy CTF_K_POINTER ,
@ -1066,43 +1134,49 @@ to refer to the base type that they modify.
.Ss Encoding of Unknown Types
Types with the kind
.Sy CTF_K_UNKNOWN
are used to indicate gaps in the type identifier space. These entries consume an
identifier, but do not define anything. Nothing should refer to these gap
identifiers.
are used to indicate gaps in the type identifier space.
These entries consume an identifier, but do not define anything.
Nothing should refer to these gap identifiers.
.Ss Dependencies Between Types
C types can be imagined as a directed, cyclic, graph. Structures and unions may
refer to each other in a way that creates a cyclic dependency. In cases such as
these, the entire type section must be read in and processed. Consumers must
not assume that every type can be laid out in dependency order; they
cannot.
C types can be imagined as a directed, cyclic, graph.
Structures and unions may refer to each other in a way that creates a cyclic
dependency.
In cases such as these, the entire type section must be read in and processed.
Consumers must not assume that every type can be laid out in dependency order;
they cannot.
.Ss The String Section
The last section of the
.Nm
file is the
.Sy string
section. This section encodes all of the strings that appear throughout
the other sections. It is laid out as a series of characters followed by
a null terminator. Generally, all names are written out in ASCII, as
most C compilers do not allow any characters to appear in identifiers
outside of a subset of ASCII. However, any extended character sets
should be written out as a series of UTF-8 bytes.
section.
This section encodes all of the strings that appear throughout the other
sections.
It is laid out as a series of characters followed by a null terminator.
Generally, all names are written out in ASCII, as most C compilers do not allow
any characters to appear in identifiers outside of a subset of ASCII.
However, any extended character sets should be written out as a series of UTF-8
bytes.
.Lp
The first entry in the section, at offset zero, is a single null
terminator to reference the empty string. Following that, each C string
should be written out, including the null terminator. Offsets that refer
to something in this section should refer to the first byte which begins
a string. Beyond the first byte in the section being the null
terminator, the order of strings is unimportant.
.Ss Data Encoding and ELF Considerations
terminator to reference the empty string.
Following that, each C string should be written out, including the null
terminator.
Offsets that refer to something in this section should refer to the first byte
which begins a string.
Beyond the first byte in the section being the null terminator, the order of
strings is unimportant.
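The layout of the string section described above can be illustrated with a
small C sketch.
The helper strtab_lookup is a hypothetical name, and the sketch treats its
argument as a plain byte offset into the section; any further encoding covered
by the String Identifiers section is assumed to have been handled by the
caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the string that begins at byte 'off' in a
 * string section of 'len' bytes, or NULL if the offset is out of range or
 * the string is not terminated inside the section.
 */
static const char *
strtab_lookup(const char *strtab, size_t len, size_t off)
{
	assert(strtab != NULL);
	if (off >= len)
		return (NULL);
	/* Refuse strings that would run past the end of the section. */
	if (memchr(strtab + off, '\0', len - off) == NULL)
		return (NULL);
	return (strtab + off);
}
```

With a section holding the bytes "\0int\0foo\0", offset zero yields the empty
string, offset one yields "int", and offset five yields "foo".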
.Sh Data Encoding and ELF Considerations
.Nm
data is generally included in ELF objects which specify information to
identify the architecture and endianness of the file. A
identify the architecture and endianness of the file.
A
.Nm
container inside such an object must match the endianness of the ELF
object. Aside from the question of the endian encoding of data, there
should be no other differences between architectures. While many of the
types in this document refer to non-fixed size C integral types, they
are equivalent in the models
container inside such an object must match the endianness of the ELF object.
Aside from the question of the endian encoding of data, there should be no other
differences between architectures.
While many of the types in this document refer to non-fixed size C integral
types, they are equivalent in the models
.Sy ILP32
and
.Sy LP64 .
@ -1118,15 +1192,16 @@ When placing a
container inside of an ELF object, there are certain conventions that are
expected for the purposes of tooling being able to find the
.Nm
data. In particular, a given ELF object should only contain a single
data.
In particular, a given ELF object should only contain a single
.Nm
section. Multiple containers should be merged together into a single
one.
section.
Multiple containers should be merged together into a single one.
.Lp
The
.Nm
file should be included in its own ELF section. The section's name
must be
file should be included in its own ELF section.
The section's name must be
.Ql .SUNW_ctf .
The type of the section must be
.Sy SHT_PROGBITS .