.\" Copyright (c) 1990, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 4. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)join.1 8.3 (Berkeley) 4/28/95
.\" $FreeBSD$
.\"
.Dd July 5, 2004
.Dt JOIN 1
.Os
.Sh NAME
.Nm join
.Nd relational database operator
.Sh SYNOPSIS
.Nm
.Oo
.Fl a Ar file_number | Fl v Ar file_number
.Oc
.Op Fl e Ar string
.Op Fl o Ar list
.Op Fl t Ar char
.Op Fl 1 Ar field
.Op Fl 2 Ar field
.Ar file1
.Ar file2
.Sh DESCRIPTION
The
.Nm
utility performs an
.Dq equality join
on the specified files
and writes the result to the standard output.
The
.Dq join field
is the field in each file by which the files are compared.
The first field in each line is used by default.
There is one line in the output for each pair of lines in
.Ar file1
and
.Ar file2
which have identical join fields.
Each output line consists of the join field, the remaining fields from
.Ar file1
and then the remaining fields from
.Ar file2 .
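.Pp
As a minimal sketch of the default behavior, assume two small files
(the file names and contents below are hypothetical sample data)
that are already sorted on their first field:
.Bd -literal -offset indent
```shell
# Hypothetical sample data; both files are sorted on the first field.
dir=$(mktemp -d)
cat > "$dir/fruit" <<'EOF'
1 apple
2 banana
3 cherry
EOF
cat > "$dir/price" <<'EOF'
1 0.50
3 2.00
4 1.25
EOF
# One output line per pair of lines whose first fields are identical:
# the join field, then the rest of file1, then the rest of file2.
joined=$(join "$dir/fruit" "$dir/price")
echo "$joined"
```
.Ed
Only the keys 1 and 3 occur in both files, so only those two lines are paired.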
.Pp
The default field separators are tab and space characters.
In this case, multiple tabs and spaces count as a single field separator,
and leading tabs and spaces are ignored.
The default output field separator is a single space character.
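.Pp
The following sketch, again using hypothetical file names and contents,
illustrates how runs of spaces in the input are treated as one separator
while the output uses a single space:
.Bd -literal -offset indent
```shell
# Hypothetical sample data with several spaces between fields.
dir=$(mktemp -d)
cat > "$dir/left" <<'EOF'
a    x
b x2
EOF
cat > "$dir/right" <<'EOF'
a  y    z
b      y2
EOF
# Each run of spaces counts as one input separator; output fields
# are separated by exactly one space.
collapsed=$(join "$dir/left" "$dir/right")
echo "$collapsed"
```
.Ed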
.Pp
Many of the options use file and field numbers.
Both file numbers and field numbers are 1-based, i.e., the first file on
the command line is file number 1 and the first field is field number 1.
The following options are available:
.Bl -tag -width indent
.It Fl a Ar file_number
In addition to the default output, produce a line for each unpairable
line in file
.Ar file_number .
.It Fl e Ar string
Replace empty output fields with
.Ar string .
.It Fl o Ar list
The
.Fl o
option specifies the fields that will be output from each file for
each line with matching join fields.
Each element of
.Ar list
has either the form
.Ar file_number . Ns Ar field ,
where
.Ar file_number
is a file number and
.Ar field
is a field number, or the form
.Ql 0
.Pq zero ,
representing the join field.
The elements of
.Ar list
must be either comma
.Pq Ql \&,
or whitespace separated.
(The latter requires quoting to protect it from the shell; a simpler
approach is to use multiple
.Fl o
options.)
.It Fl t Ar char
Use character
.Ar char
as a field delimiter for both input and output.
Every occurrence of
.Ar char
in a line is significant.
.It Fl v Ar file_number
Do not display the default output, but display a line for each unpairable
line in file
.Ar file_number .
The options
.Fl v Cm 1
and
.Fl v Cm 2
may be specified at the same time.
.It Fl 1 Ar field
Join on the
.Ar field Ns 'th
field of
.Ar file1 .
.It Fl 2 Ar field
Join on the
.Ar field Ns 'th
field of
.Ar file2 .
.El
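.Pp
The options above can be sketched as follows.
The file names, their contents and the replacement string NONE are
hypothetical choices made only for illustration:
.Bd -literal -offset indent
```shell
# Hypothetical sample data, sorted on the first field.
dir=$(mktemp -d)
cat > "$dir/fruit" <<'EOF'
1 apple
2 banana
3 cherry
EOF
cat > "$dir/price" <<'EOF'
1 0.50
3 2.00
4 1.25
EOF
# -a 1: default output plus the unpairable lines of file 1.
with_unpaired=$(join -a 1 "$dir/fruit" "$dir/price")
# -v 2: only the unpairable lines of file 2.
only_unpaired=$(join -v 2 "$dir/fruit" "$dir/price")
# -o selects the output fields (0 is the join field);
# -e supplies a string for output fields that would be empty.
filled=$(join -a 1 -e NONE -o 0,1.2,2.2 "$dir/fruit" "$dir/price")
echo "$with_unpaired"
echo "$only_unpaired"
echo "$filled"
```
.Ed
In the last command the line for key 2 has no partner in the second file,
so its 2.2 field is empty and is printed as NONE.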
.Pp
When the default field delimiter characters are used, the files to be joined
should be ordered in the collating sequence of
.Xr sort 1 ,
using the
.Fl b
option, on the fields on which they are to be joined; otherwise
.Nm
may not report all field matches.
When the field delimiter characters are specified by the
.Fl t
option, the collating sequence should be the same as
.Xr sort 1
without the
.Fl b
option.
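.Pp
A sketch of preparing unsorted input when a delimiter is given with
.Fl t
might look like this.
The file names and contents are hypothetical, and setting LC_ALL to C is
an assumption made here so that both utilities use the same collating sequence:
.Bd -literal -offset indent
```shell
# Assumption: force one collating sequence for both sort and join.
LC_ALL=C
export LC_ALL
# Hypothetical, unsorted, colon-separated sample data.
dir=$(mktemp -d)
cat > "$dir/users" <<'EOF'
carol:1003
alice:1001
bob:1002
EOF
cat > "$dir/shells" <<'EOF'
bob:/bin/sh
alice:/bin/csh
EOF
# Sort each file on the join field, without -b because -t is in use.
sort -t : -k 1,1 "$dir/users" > "$dir/users.sorted"
sort -t : -k 1,1 "$dir/shells" > "$dir/shells.sorted"
# The same delimiter is used for input and output.
sorted_join=$(join -t : "$dir/users.sorted" "$dir/shells.sorted")
echo "$sorted_join"
```
.Ed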
.Pp
If one of the arguments
.Ar file1
or
.Ar file2
is
.Sq Fl ,
the standard input is used.
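.Pp
For instance, the first file can be supplied through a pipe.
The file name and data in this sketch are hypothetical:
.Bd -literal -offset indent
```shell
# Hypothetical sample data for the second file.
dir=$(mktemp -d)
cat > "$dir/price" <<'EOF'
1 0.50
3 2.00
EOF
# The first operand is read from the standard input.
from_stdin=$(echo "3 cherry" | join - "$dir/price")
echo "$from_stdin"
```
.Ed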
.Sh EXIT STATUS
.Ex -std
.Sh COMPATIBILITY
For compatibility with historic versions of
.Nm ,
the following options are available:
.Bl -tag -width indent
.It Fl a
In addition to the default output, produce a line for each unpairable line
in both
.Ar file1
and
.Ar file2 .
.It Fl j1 Ar field
Join on the
.Ar field Ns 'th
field of
.Ar file1 .
.It Fl j2 Ar field
Join on the
.Ar field Ns 'th
field of
.Ar file2 .
.It Fl j Ar field
Join on the
.Ar field Ns 'th
field of both
.Ar file1
and
.Ar file2 .
.It Fl o Ar list ...
Historical implementations of
.Nm
permitted multiple arguments to the
.Fl o
option.
These arguments were of the form
.Ar file_number . Ns Ar field_number
as described
for the current
.Fl o
option.
This has obvious difficulties in the presence of files named
.Pa 1.2 .
.El
.Pp
These options are available only so that historic shell scripts do not require
modification; they should not be used in new scripts.
.Sh SEE ALSO
.Xr awk 1 ,
.Xr comm 1 ,
.Xr paste 1 ,
.Xr sort 1 ,
.Xr uniq 1
.Sh STANDARDS
The
.Nm
command conforms to
.St -p1003.1-2001 .