1997-03-11 13:08:12 +00:00
|
|
|
.\" Copyright (c) 1990, 1993
|
|
|
|
.\" The Regents of the University of California. All rights reserved.
|
|
|
|
.\"
|
|
|
|
.\" This code is derived from software contributed to Berkeley by
|
|
|
|
.\" the Institute of Electrical and Electronics Engineers, Inc.
|
|
|
|
.\"
|
|
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
|
|
.\" modification, are permitted provided that the following conditions
|
|
|
|
.\" are met:
|
|
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
|
|
.\" 3. All advertising materials mentioning features or use of this software
|
|
|
|
.\" must display the following acknowledgement:
|
|
|
|
.\" This product includes software developed by the University of
|
|
|
|
.\" California, Berkeley and its contributors.
|
|
|
|
.\" 4. Neither the name of the University nor the names of its contributors
|
|
|
|
.\" may be used to endorse or promote products derived from this software
|
|
|
|
.\" without specific prior written permission.
|
|
|
|
.\"
|
|
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
.\" SUCH DAMAGE.
|
|
|
|
.\"
|
|
|
|
.\" @(#)join.1 8.3 (Berkeley) 4/28/95
|
1999-08-28 01:08:13 +00:00
|
|
|
.\" $FreeBSD$
|
1997-03-11 13:08:12 +00:00
|
|
|
.\"
|
|
|
|
.Dd April 28, 1995
|
|
|
|
.Dt JOIN 1
|
|
|
|
.Os
|
|
|
|
.Sh NAME
|
|
|
|
.Nm join
|
|
|
|
.Nd relational database operator
|
|
|
|
.Sh SYNOPSIS
|
2000-11-20 19:21:22 +00:00
|
|
|
.Nm
|
1997-03-11 13:08:12 +00:00
|
|
|
.Oo
|
|
|
|
.Fl a Ar file_number | Fl v Ar file_number
|
|
|
|
.Oc
|
|
|
|
.Op Fl e Ar string
|
|
|
|
.Op Fl j Ar file_number field
|
|
|
|
.Op Fl o Ar list
|
|
|
|
.Bk -words
|
|
|
|
.Ek
|
|
|
|
.Op Fl t Ar char
|
|
|
|
.Op Fl \&1 Ar field
|
|
|
|
.Op Fl \&2 Ar field
|
|
|
|
.Ar file1
|
|
|
|
.Ar file2
|
|
|
|
.Sh DESCRIPTION
|
2000-03-26 14:25:51 +00:00
|
|
|
The
|
|
|
|
.Nm
|
2000-03-27 20:33:32 +00:00
|
|
|
utility performs an
|
|
|
|
.Dq equality join
|
|
|
|
on the specified files
|
1997-03-11 13:08:12 +00:00
|
|
|
and writes the result to the standard output.
|
2000-03-27 20:33:32 +00:00
|
|
|
The
|
|
|
|
.Dq join field
|
|
|
|
is the field in each file by which the files are compared.
|
1997-03-11 13:08:12 +00:00
|
|
|
The first field in each line is used by default.
|
|
|
|
There is one line in the output for each pair of lines in
|
|
|
|
.Ar file1
|
|
|
|
and
|
|
|
|
.Ar file2
|
|
|
|
which have identical join fields.
|
|
|
|
Each output line consists of the join field, the remaining fields from
|
|
|
|
.Ar file1
|
|
|
|
and then the remaining fields from
|
|
|
|
.Ar file2 .
|
|
|
|
.Pp
|
|
|
|
The default field separators are tab and space characters.
|
|
|
|
In this case, multiple tabs and spaces count as a single field separator,
|
|
|
|
and leading tabs and spaces are ignored.
|
|
|
|
The default output field separator is a single space character.
|
|
|
|
.Pp
|
|
|
|
Many of the options use file and field numbers.
|
|
|
|
Both file numbers and field numbers are 1 based, i.e. the first file on
|
|
|
|
the command line is file number 1 and the first field is field number 1.
|
|
|
|
The following options are available:
|
2000-03-26 14:25:51 +00:00
|
|
|
.Bl -tag -width indent
|
1997-03-11 13:08:12 +00:00
|
|
|
.It Fl a Ar file_number
|
|
|
|
In addition to the default output, produce a line for each unpairable
|
|
|
|
line in file
|
|
|
|
.Ar file_number .
|
|
|
|
(The argument to
|
|
|
|
.Fl a
|
|
|
|
must not be preceded by a space; see the
|
|
|
|
.Sx COMPATIBILITY
|
|
|
|
section.)
|
|
|
|
.It Fl e Ar string
|
|
|
|
Replace empty output fields with
|
|
|
|
.Ar string .
|
|
|
|
.It Fl o Ar list
|
|
|
|
The
|
|
|
|
.Fl o
|
|
|
|
option specifies the fields that will be output from each file for
|
|
|
|
each line with matching join fields.
|
|
|
|
Each element of
|
|
|
|
.Ar list
|
|
|
|
has the form
|
|
|
|
.Ql file_number.field ,
|
|
|
|
where
|
|
|
|
.Ar file_number
|
|
|
|
is a file number and
|
|
|
|
.Ar field
|
|
|
|
is a field number.
|
2000-03-27 20:33:32 +00:00
|
|
|
The elements of list must be either comma
|
|
|
|
.Pf ( Dq , Ns )
|
|
|
|
or whitespace separated.
|
1997-03-11 13:08:12 +00:00
|
|
|
(The latter requires quoting to protect it from the shell, or, a simpler
|
|
|
|
approach is to use multiple
|
|
|
|
.Fl o
|
|
|
|
options.)
|
|
|
|
.It Fl t Ar char
|
|
|
|
Use character
|
|
|
|
.Ar char
|
|
|
|
as a field delimiter for both input and output.
|
|
|
|
Every occurrence of
|
|
|
|
.Ar char
|
|
|
|
in a line is significant.
|
|
|
|
.It Fl v Ar file_number
|
|
|
|
Do not display the default output, but display a line for each unpairable
|
|
|
|
line in file
|
|
|
|
.Ar file_number .
|
|
|
|
The options
|
|
|
|
.Fl v Ar 1
|
|
|
|
and
|
|
|
|
.Fl v Ar 2
|
|
|
|
may be specified at the same time.
|
|
|
|
.It Fl 1 Ar field
|
|
|
|
Join on the
|
|
|
|
.Ar field Ns 'th
|
|
|
|
field of file 1.
|
|
|
|
.It Fl 2 Ar field
|
|
|
|
Join on the
|
|
|
|
.Ar field Ns 'th
|
|
|
|
field of file 2.
|
|
|
|
.El
|
|
|
|
.Pp
|
|
|
|
When the default field delimiter characters are used, the files to be joined
|
|
|
|
should be ordered in the collating sequence of
|
|
|
|
.Xr sort 1 ,
|
|
|
|
using the
|
|
|
|
.Fl b
|
|
|
|
option, on the fields on which they are to be joined, otherwise
|
2000-03-26 14:25:51 +00:00
|
|
|
.Nm
|
1997-03-11 13:08:12 +00:00
|
|
|
may not report all field matches.
|
|
|
|
When the field delimiter characters are specified by the
|
|
|
|
.Fl t
|
|
|
|
option, the collating sequence should be the same as
|
2000-03-26 14:25:51 +00:00
|
|
|
.Xr sort 1
|
1997-03-11 13:08:12 +00:00
|
|
|
without the
|
|
|
|
.Fl b
|
|
|
|
option.
|
|
|
|
.Pp
|
|
|
|
If one of the arguments
|
|
|
|
.Ar file1
|
|
|
|
or
|
|
|
|
.Ar file2
|
2000-03-27 20:33:32 +00:00
|
|
|
is
|
|
|
|
.Dq - ,
|
|
|
|
the standard input is used.
|
2000-03-26 14:25:51 +00:00
|
|
|
.Sh DIAGNOSTICS
|
1997-03-11 13:08:12 +00:00
|
|
|
The
|
2000-03-26 14:25:51 +00:00
|
|
|
.Nm
|
1997-03-11 13:08:12 +00:00
|
|
|
utility exits 0 on success, and >0 if an error occurs.
|
|
|
|
.Sh COMPATIBILITY
|
|
|
|
For compatibility with historic versions of
|
2000-11-20 19:21:22 +00:00
|
|
|
.Nm ,
|
1997-03-11 13:08:12 +00:00
|
|
|
the following options are available:
|
2000-03-26 14:25:51 +00:00
|
|
|
.Bl -tag -width indent
|
1997-03-11 13:08:12 +00:00
|
|
|
.It Fl a
|
|
|
|
In addition to the default output, produce a line for each unpairable line
|
|
|
|
in both file 1 and file 2.
|
|
|
|
(To distinguish between this and
|
|
|
|
.Fl a Ar file_number ,
|
2000-03-26 14:25:51 +00:00
|
|
|
.Nm
|
1997-03-11 13:08:12 +00:00
|
|
|
currently requires that the latter not include any white space.)
|
|
|
|
.It Fl j1 Ar field
|
|
|
|
Join on the
|
|
|
|
.Ar field Ns 'th
|
|
|
|
field of file 1.
|
|
|
|
.It Fl j2 Ar field
|
|
|
|
Join on the
|
|
|
|
.Ar field Ns 'th
|
|
|
|
field of file 2.
|
|
|
|
.It Fl j Ar field
|
|
|
|
Join on the
|
|
|
|
.Ar field Ns 'th
|
|
|
|
field of both file 1 and file 2.
|
|
|
|
.It Fl o Ar list ...
|
|
|
|
Historical implementations of
|
2000-03-26 14:25:51 +00:00
|
|
|
.Nm
|
1997-03-11 13:08:12 +00:00
|
|
|
permitted multiple arguments to the
|
|
|
|
.Fl o
|
|
|
|
option.
|
2000-03-26 14:25:51 +00:00
|
|
|
These arguments were of the form
|
|
|
|
.Ql file_number.field_number
|
|
|
|
as described
|
1997-03-11 13:08:12 +00:00
|
|
|
for the current
|
|
|
|
.Fl o
|
|
|
|
option.
|
2000-03-26 14:25:51 +00:00
|
|
|
This has obvious difficulties in the presence of files named
|
|
|
|
.Ql 1.2 .
|
1997-03-11 13:08:12 +00:00
|
|
|
.El
|
|
|
|
.Pp
|
|
|
|
These options are available only so historic shellscripts don't require
|
|
|
|
modification and should not be used.
|
|
|
|
.Sh STANDARDS
|
|
|
|
The
|
2000-03-26 14:25:51 +00:00
|
|
|
.Nm
|
1997-03-11 13:08:12 +00:00
|
|
|
command is expected to be
|
|
|
|
.St -p1003.2
|
|
|
|
compatible.
|
|
|
|
.Sh SEE ALSO
|
|
|
|
.Xr awk 1 ,
|
|
|
|
.Xr comm 1 ,
|
|
|
|
.Xr paste 1 ,
|
|
|
|
.Xr sort 1 ,
|
|
|
|
.Xr uniq 1
|