.\" $Id: mandoc.1,v 1.100 2011/12/25 19:35:44 kristaps Exp $
.\"
.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: December 25 2011 $
.Dt MANDOC 1
.Os
.Sh NAME
.Nm mandoc
.Nd format and display UNIX manuals
.Sh SYNOPSIS
.Nm mandoc
.Op Fl V
.Op Fl m Ns Ar format
.Op Fl O Ns Ar option
.Op Fl T Ns Ar output
.Op Fl W Ns Ar level
.Op Ar
.Sh DESCRIPTION
The
.Nm
utility formats
.Ux
manual pages for display.
.Pp
By default,
.Nm
reads
.Xr mdoc 7
or
.Xr man 7
text from stdin, implying
.Fl m Ns Cm andoc ,
and produces
.Fl T Ns Cm ascii
output.
.Pp
The arguments are as follows:
.Bl -tag -width Ds
.It Fl m Ns Ar format
Input format.
See
.Sx Input Formats
for available formats.
Defaults to
.Fl m Ns Cm andoc .
.It Fl O Ns Ar option
Comma-separated output options.
.It Fl T Ns Ar output
Output format.
See
.Sx Output Formats
for available formats.
Defaults to
.Fl T Ns Cm ascii .
.It Fl V
Print version and exit.
.It Fl W Ns Ar level
Specify the minimum message
.Ar level
to be reported on the standard error output and to affect the exit status.
The
.Ar level
can be
.Cm warning ,
.Cm error ,
or
.Cm fatal .
The default is
.Fl W Ns Cm fatal ;
.Fl W Ns Cm all
is an alias for
.Fl W Ns Cm warning .
See
.Sx EXIT STATUS
and
.Sx DIAGNOSTICS
for details.
.Pp
The special option
.Fl W Ns Cm stop
tells
.Nm
to exit after parsing a file that causes warnings or errors of at least
the requested level.
No formatted output will be produced from that file.
If both a
.Ar level
and
.Cm stop
are requested, they can be joined with a comma, for example
.Fl W Ns Cm error , Ns Cm stop .
.It Ar file
Read input from zero or more files.
If unspecified, reads from stdin.
If multiple files are specified,
.Nm
will halt with the first failed parse.
.El
.Ss Input Formats
The
.Nm
utility accepts
.Xr mdoc 7
and
.Xr man 7
input with
.Fl m Ns Cm doc
and
.Fl m Ns Cm an ,
respectively.
The
.Xr mdoc 7
format is
.Em strongly
recommended;
.Xr man 7
should only be used for legacy manuals.
.Pp
A third option,
.Fl m Ns Cm andoc ,
which is also the default, determines the input format on the fly: if the first
non-comment macro is
.Sq \&Dd
or
.Sq \&Dt ,
the
.Xr mdoc 7
parser is used; otherwise, the
.Xr man 7
parser is used.
.Pp
If multiple
files are specified with
.Fl m Ns Cm andoc ,
the format of each file is determined this way.
If multiple files are specified together with
.Fl m Ns Cm doc
or
.Fl m Ns Cm an ,
that format is used for all of them.
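.Pp
As a rough illustration of this detection, consider two hypothetical
input files,
.Pa foo.1
and
.Pa bar.1 ;
the names and contents are assumptions made for this sketch only.
A file whose first non-comment macro is
.Sq \&Dd
is handed to the
.Xr mdoc 7
parser:
.Bd -literal -offset indent
\&.\e" foo.1: hypothetical mdoc(7) source
\&.Dd December 25, 2011
\&.Dt FOO 1
\&.Os
.Ed
.Pp
A file starting with any other macro, here
.Sq \&TH ,
is handed to the
.Xr man 7
parser:
.Bd -literal -offset indent
\&.\e" bar.1: hypothetical man(7) source
\&.TH BAR 1 2011-12-25
.Ed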
.Ss Output Formats
The
.Nm
utility accepts the following
.Fl T
arguments, which correspond to output modes:
.Bl -tag -width "-Tlocale"
.It Fl T Ns Cm ascii
Produce 7-bit ASCII output.
This is the default.
See
.Sx ASCII Output .
.It Fl T Ns Cm html
Produce strict CSS1/HTML-4.01 output.
See
.Sx HTML Output .
.It Fl T Ns Cm lint
Parse only: produce no output.
Implies
.Fl W Ns Cm warning .
.It Fl T Ns Cm locale
Encode output using the current locale.
See
.Sx Locale Output .
.It Fl T Ns Cm man
Produce
.Xr man 7
format output.
See
.Sx Man Output .
.It Fl T Ns Cm pdf
Produce PDF output.
See
.Sx PDF Output .
.It Fl T Ns Cm ps
Produce PostScript output.
See
.Sx PostScript Output .
.It Fl T Ns Cm tree
Produce an indented parse tree.
.It Fl T Ns Cm utf8
Encode output in the UTF\-8 multi-byte format.
See
.Sx UTF\-8 Output .
.It Fl T Ns Cm xhtml
Produce strict CSS1/XHTML-1.0 output.
See
.Sx XHTML Output .
.El
.Pp
If multiple input files are specified, each is processed by the selected
output filter, in the order given.
.Ss ASCII Output
Output produced by
.Fl T Ns Cm ascii ,
which is the default, is rendered in standard 7-bit ASCII documented in
.Xr ascii 7 .
.Pp
Font styles are applied by using back-spaced encoding such that an
underlined character
.Sq c
is rendered as
.Sq _ Ns \e[bs] Ns c ,
where
.Sq \e[bs]
is the back-space character number 8.
Emboldened characters are rendered as
.Sq c Ns \e[bs] Ns c .
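.Pp
For instance, under this scheme the underlined word
.Sq ok
would be emitted as
.Sq _ Ns \e[bs] Ns o Ns _ Ns \e[bs] Ns k .
Pagers such as
.Xr less 1
interpret these sequences.
To obtain plain text instead, one option is to filter the output through
.Xr col 1 ,
assuming that utility is installed; the file name
.Pa foo.1
is hypothetical:
.Bd -literal -offset indent
$ mandoc foo.1 | col \-b > foo.txt
.Ed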
.Pp
The special characters documented in
.Xr mandoc_char 7
are rendered best-effort in an ASCII equivalent.
If no equivalent is found,
.Sq \&?
is used instead.
.Pp
Output width is limited to 78 visible columns unless literal input lines
exceed this limit.
.Pp
The following
.Fl O
arguments are accepted:
.Bl -tag -width Ds
.It Cm indent Ns = Ns Ar indent
The left margin for normal text is set to
.Ar indent
blank characters instead of the default of five for
.Xr mdoc 7
and seven for
.Xr man 7 .
Increasing this is not recommended; it may result in degraded formatting,
for example overfull lines or ugly line breaks.
.It Cm width Ns = Ns Ar width
The output width is set to
.Ar width ,
which will normalise to \(>=60.
.El
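.Pp
As a minimal sketch, the following invocations format a hypothetical
.Pa foo.1
with a wider line, and then with a wider line and a narrower left margin.
Several
.Fl O
options are joined with commas:
.Bd -literal -offset indent
$ mandoc \-Owidth=100 foo.1 | less
$ mandoc \-Oindent=2,width=100 foo.1 | less
.Ed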
.Ss HTML Output
Output produced by
.Fl T Ns Cm html
conforms to HTML-4.01 strict.
.Pp
The
.Pa example.style.css
file documents style-sheet classes available for customising output.
If a style-sheet is not specified with
.Fl O Ns Ar style ,
.Fl T Ns Cm html
defaults to simple output readable in any graphical or text-based web
browser.
.Pp
Special characters are rendered in decimal-encoded UTF\-8.
.Pp
The following
.Fl O
arguments are accepted:
.Bl -tag -width Ds
.It Cm fragment
Omit the
.Aq !DOCTYPE
declaration and the
.Aq html ,
.Aq head ,
and
.Aq body
elements and only emit the subtree below the
.Aq body
element.
The
.Cm style
argument will be ignored.
This is useful when embedding manual content within existing documents.
.It Cm includes Ns = Ns Ar fmt
The string
.Ar fmt ,
for example,
.Ar ../src/%I.html ,
is used as a template for linked header files (usually via the
.Sq \&In
macro).
Instances of
.Sq \&%I
are replaced with the include filename.
The default is not to present a
hyperlink.
.It Cm man Ns = Ns Ar fmt
The string
.Ar fmt ,
for example,
.Ar ../html%S/%N.%S.html ,
is used as a template for linked manuals (usually via the
.Sq \&Xr
macro).
Instances of
.Sq \&%N
and
.Sq %S
are replaced with the linked manual's name and section, respectively.
If no section is included, section 1 is assumed.
The default is not to
present a hyperlink.
.It Cm style Ns = Ns Ar style.css
The file
.Ar style.css
is used for an external style-sheet.
This must be a valid absolute or
relative URI.
.El
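.Pp
As an illustrative sketch, these options can be combined as follows;
the directory layout and the file name
.Pa foo.1
are assumptions chosen for the example:
.Bd -literal -offset indent
$ mandoc \-Thtml \-Ostyle=style.css,man=../html%S/%N.%S.html \e
	foo.1 > foo.1.html
.Ed
.Pp
With this template, a reference written as
.Sq \&Xr mdoc 7
in the source would link to
.Pa ../html7/mdoc.7.html .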
.Ss Locale Output
Locale-dependent output encoding is triggered with
.Fl T Ns Cm locale .
This option is not available on all systems: systems without locale
support, or those whose internal representation is not natively UCS-4,
will fall back to
.Fl T Ns Cm ascii .
See
.Sx ASCII Output
for font style specification and available command-line arguments.
.Ss Man Output
Translate input format into
.Xr man 7
output format.
This is useful for distributing manual sources to legacy systems
lacking
.Xr mdoc 7
formatters.
.Pp
If
.Xr mdoc 7
is passed as input, it is translated into
.Xr man 7 .
If the input format is
.Xr man 7 ,
the input is copied to the output, expanding any
.Xr roff 7
.Sq so
requests.
The parser is also run, and as usual, the
.Fl W
level controls which
.Sx DIAGNOSTICS
are displayed before copying the input to the output.
.Ss PDF Output
PDF-1.1 output may be generated by
.Fl T Ns Cm pdf .
See
.Sx PostScript Output
for
.Fl O
arguments and defaults.
.Ss PostScript Output
PostScript
.Qq Adobe-3.0
Level-2 pages may be generated by
.Fl T Ns Cm ps .
Output pages default to letter sized and are rendered in the Times font
family, 11-point.
Margins are calculated as 1/9 the page length and width.
Line-height is 1.4m.
.Pp
Special characters are rendered as in
.Sx ASCII Output .
.Pp
The following
.Fl O
arguments are accepted:
.Bl -tag -width Ds
.It Cm paper Ns = Ns Ar name
The paper size
.Ar name
may be one of
.Ar a3 ,
.Ar a4 ,
.Ar a5 ,
.Ar legal ,
or
.Ar letter .
You may also manually specify dimensions as
.Ar NNxNN ,
width by height in millimetres.
If an unknown value is encountered,
.Ar letter
is used.
.El
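.Pp
For example, the two invocations below should request the same page size,
since A4 paper measures 210 by 297 millimetres; the input file
.Pa foo.1
is hypothetical:
.Bd -literal -offset indent
$ mandoc \-Tps \-Opaper=a4 foo.1 > foo.ps
$ mandoc \-Tps \-Opaper=210x297 foo.1 > foo.ps
.Ed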
.Ss UTF\-8 Output
Use
.Fl T Ns Cm utf8
to force a UTF\-8 locale.
See
.Sx Locale Output
for details and options.
.Ss XHTML Output
Output produced by
.Fl T Ns Cm xhtml
conforms to XHTML-1.0 strict.
.Pp
See
.Sx HTML Output
for details; beyond generating XHTML tags instead of HTML tags, these
output modes are identical.
.Sh EXIT STATUS
The
.Nm
utility exits with one of the following values, controlled by the message
.Ar level
associated with the
.Fl W
option:
.Pp
.Bl -tag -width Ds -compact
.It 0
No warnings or errors occurred, or those that did were ignored because
they were lower than the requested
.Ar level .
.It 2
At least one warning occurred, but no error, and
.Fl W Ns Cm warning
was specified.
.It 3
At least one parsing error occurred, but no fatal error, and
.Fl W Ns Cm error
or
.Fl W Ns Cm warning
was specified.
.It 4
A fatal parsing error occurred.
.It 5
Invalid command line arguments were specified.
No input files have been read.
.It 6
An operating system error occurred, for example memory exhaustion or an
error accessing input files.
Such errors cause
.Nm
to exit at once, possibly in the middle of parsing or formatting a file.
.El
.Pp
Note that selecting
.Fl T Ns Cm lint
output mode implies
.Fl W Ns Cm warning .
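.Pp
As a sketch of how a shell script might act on these values (the file name
.Pa foo.1
and the messages printed are illustrative only):
.Bd -literal -offset indent
mandoc \-Tlint foo.1
case $? in
0) echo "foo.1: clean" ;;
2) echo "foo.1: warnings only" ;;
3) echo "foo.1: errors found" ;;
*) echo "foo.1: fatal, usage or system error" ;;
esac
.Ed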
.Sh EXAMPLES
To page manuals to the terminal:
.Pp
.Dl $ mandoc \-Wall,stop mandoc.1 2\*(Gt&1 | less
.Dl $ mandoc mandoc.1 mdoc.3 mdoc.7 | less
.Pp
To produce HTML manuals with
.Ar style.css
as the style-sheet:
.Pp
.Dl $ mandoc \-Thtml \-Ostyle=style.css mdoc.7 \*(Gt mdoc.7.html
.Pp
To check over a large set of manuals:
.Pp
.Dl $ mandoc \-Tlint `find /usr/src \-name \e*\e.[1-9]`
.Pp
To produce a series of PostScript manuals for A4 paper:
.Pp
.Dl $ mandoc \-Tps \-Opaper=a4 mdoc.7 man.7 \*(Gt manuals.ps
.Pp
To convert a modern
.Xr mdoc 7
manual to the older
.Xr man 7
format, for use on systems lacking an
.Xr mdoc 7
parser:
.Pp
.Dl $ mandoc \-Tman foo.mdoc \*(Gt foo.man
.Sh DIAGNOSTICS
Standard error messages reporting parsing errors are prefixed by
.Pp
.Sm off
.D1 Ar file : line : column : \ level :
.Sm on
.Pp
where the fields have the following meanings:
.Bl -tag -width "column"
.It Ar file
The name of the input file causing the message.
.It Ar line
The line number in that input file.
Line numbering starts at 1.
.It Ar column
The column number in that input file.
Column numbering starts at 1.
If the issue is caused by a word, the column number usually
points to the first character of the word.
.It Ar level
The message level, printed in capital letters.
.El
.Pp
Message levels have the following meanings:
.Bl -tag -width "warning"
.It Cm fatal
The parser is unable to parse a given input file at all.
No formatted output is produced from that input file.
.It Cm error
An input file contains syntax that cannot be safely interpreted,
either because it is invalid or because
.Nm
does not implement it yet.
By discarding part of the input or inserting missing tokens,
the parser is able to continue, and the error does not prevent
generation of formatted output, but typically, preparing that
output involves information loss, broken document structure
or unintended formatting.
.It Cm warning
An input file uses obsolete, discouraged or non-portable syntax.
All the same, the meaning of the input is unambiguous and a correct
rendering can be produced.
Documents causing warnings may render poorly when using other
formatting tools instead of
.Nm .
.El
.Pp
Messages of the
.Cm warning
and
.Cm error
levels are hidden unless their level, or a lower level, is requested using a
.Fl W
option or
.Fl T Ns Cm lint
output mode.
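.Pp
For illustration, a warning located at line 12, column 3 of a hypothetical
file
.Pa foo.1
would be prefixed by
.Pp
.Dl foo.1:12:3: WARNING:
.Pp
Because the level is printed in capital letters, messages of a single
level can be picked out with standard tools, for example:
.Bd -literal -offset indent
$ mandoc \-Tlint foo.1 2>&1 | grep ': ERROR:'
.Ed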
.Pp
The
.Nm
utility may also print messages related to invalid command line arguments
or operating system errors, for example when memory is exhausted or
input files cannot be read.
Such messages do not carry the prefix described above.
.Sh COMPATIBILITY
This section summarises
.Nm
compatibility with GNU troff.
Each input and output format is separately noted.
.Ss ASCII Compatibility
.Bl -bullet -compact
.It
Unrenderable Unicode code points specified with
.Sq \e[uNNNN]
escapes are printed as
.Sq \&?
in
.Nm .
In GNU troff, these raise an error.
.It
The
.Sq \&Bd \-literal
and
.Sq \&Bd \-unfilled
macros of
.Xr mdoc 7
in
.Fl T Ns Cm ascii
are synonyms, as are \-filled and \-ragged.
.It
In historic GNU troff, the
.Sq \&Pa
.Xr mdoc 7
macro does not underline when scoped under an
.Sq \&It
in the FILES section.
This behaves correctly in
.Nm .
.It
A list or display following the
.Sq \&Ss
.Xr mdoc 7
macro in
.Fl T Ns Cm ascii
does not assert a prior vertical break, just as it doesn't with
.Sq \&Sh .
.It
The
.Sq \&na
.Xr man 7
macro in
.Fl T Ns Cm ascii
has no effect.
.It
Words aren't hyphenated.
.El
.Ss HTML/XHTML Compatibility
.Bl -bullet -compact
.It
The
.Sq \efP
escape will revert the font to the previous
.Sq \ef
escape, not to the last rendered decoration, which is now dictated by
CSS instead of hard-coded.
It also will not span past the current scope,
for the same reason.
Note that in
.Sx ASCII Output
mode, this will work fine.
.It
The
.Xr mdoc 7
.Sq \&Bl \-hang
and
.Sq \&Bl \-tag
list types render similarly (no break following overreached left-hand
side) due to the expressive constraints of HTML.
.It
The
.Xr man 7
.Sq IP
and
.Sq TP
lists render similarly.
.El
.Sh SEE ALSO
.Xr eqn 7 ,
.Xr man 7 ,
.Xr mandoc_char 7 ,
.Xr mdoc 7 ,
.Xr roff 7 ,
.Xr tbl 7
.Sh AUTHORS
The
.Nm
utility was written by
.An Kristaps Dzonsons ,
.Mt kristaps@bsd.lv .
.Sh CAVEATS
In
.Fl T Ns Cm html
and
.Fl T Ns Cm xhtml ,
the maximum size of an element attribute is determined by
.Dv BUFSIZ ,
which is usually 1024 bytes.
Be aware of this when setting long link
formats such as
.Fl O Ns Cm style Ns = Ns Ar really/long/link .
.Pp
Nesting elements within next-line element scopes of
.Fl m Ns Cm an ,
such as
.Sq br
within an empty
.Sq B ,
will confuse
.Fl T Ns Cm html
and
.Fl T Ns Cm xhtml
and cause them to forget the formatting of the prior next-line scope.
.Pp
The
.Sq \(aq
control character is an alias for the standard macro control character
and does not emit a line-break as stipulated in GNU troff.