1994-05-27 12:33:43 +00:00
|
|
|
.\" Copyright (c) 1991, 1993
|
|
|
|
.\" The Regents of the University of California. All rights reserved.
|
|
|
|
.\"
|
|
|
|
.\" This code is derived from software contributed to Berkeley by
|
|
|
|
.\" the Institute of Electrical and Electronics Engineers, Inc.
|
|
|
|
.\"
|
|
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
|
|
.\" modification, are permitted provided that the following conditions
|
|
|
|
.\" are met:
|
|
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
|
|
.\" documentation and/or other materials provided with the distribution.
|
2017-02-28 23:42:47 +00:00
|
|
|
.\" 3. Neither the name of the University nor the names of its contributors
|
1994-05-27 12:33:43 +00:00
|
|
|
.\" may be used to endorse or promote products derived from this software
|
|
|
|
.\" without specific prior written permission.
|
|
|
|
.\"
|
|
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
.\" SUCH DAMAGE.
|
|
|
|
.\"
|
1997-09-07 15:09:22 +00:00
|
|
|
.\" From: @(#)uniq.1 8.1 (Berkeley) 6/6/93
|
1999-08-28 01:08:13 +00:00
|
|
|
.\" $FreeBSD$
|
1994-05-27 12:33:43 +00:00
|
|
|
.\"
|
2020-06-07 13:21:47 +00:00
|
|
|
.Dd June 7, 2020
|
1994-05-27 12:33:43 +00:00
|
|
|
.Dt UNIQ 1
|
|
|
|
.Os
|
|
|
|
.Sh NAME
|
|
|
|
.Nm uniq
|
|
|
|
.Nd report or filter out repeated lines in a file
|
|
|
|
.Sh SYNOPSIS
|
1997-08-21 06:51:10 +00:00
|
|
|
.Nm
|
2019-12-15 18:05:18 +00:00
|
|
|
.Op Fl c | Fl d | Fl D | Fl u
|
1997-09-07 15:09:22 +00:00
|
|
|
.Op Fl i
|
2002-05-24 19:12:02 +00:00
|
|
|
.Op Fl f Ar num
|
1994-05-27 12:33:43 +00:00
|
|
|
.Op Fl s Ar chars
|
|
|
|
.Oo
|
|
|
|
.Ar input_file
|
|
|
|
.Op Ar output_file
|
|
|
|
.Oc
|
|
|
|
.Sh DESCRIPTION
|
|
|
|
The
|
1997-08-21 06:51:10 +00:00
|
|
|
.Nm
|
2002-05-30 00:07:14 +00:00
|
|
|
utility reads the specified
|
|
|
|
.Ar input_file
|
|
|
|
comparing adjacent lines, and writes a copy of each unique input line to
|
|
|
|
the
|
|
|
|
.Ar output_file .
|
|
|
|
If
|
|
|
|
.Ar input_file
|
|
|
|
is a single dash
|
2002-11-26 17:33:37 +00:00
|
|
|
.Pq Sq Fl
|
2002-05-30 00:07:14 +00:00
|
|
|
or absent, the standard input is read.
|
|
|
|
If
|
|
|
|
.Ar output_file
|
|
|
|
is absent, standard output is used for output.
|
1994-05-27 12:33:43 +00:00
|
|
|
The second and succeeding copies of identical adjacent input lines are
|
|
|
|
not written.
|
|
|
|
Repeated lines in the input will not be detected if they are not adjacent,
|
|
|
|
so it may be necessary to sort the files first.
|
|
|
|
.Pp
|
|
|
|
The following options are available:
|
|
|
|
.Bl -tag -width Ds
|
2018-05-02 01:17:08 +00:00
|
|
|
.It Fl c , Fl -count
|
1994-05-27 12:33:43 +00:00
|
|
|
Precede each output line with the count of the number of times the line
|
|
|
|
occurred in the input, followed by a single space.
|
2018-05-02 01:17:08 +00:00
|
|
|
.It Fl d , Fl -repeated
|
2019-12-15 18:05:18 +00:00
|
|
|
Output a single copy of each line that is repeated in the input.
|
|
|
|
.It Fl D , Fl -all-repeated Op Ar septype
|
|
|
|
Output all lines that are repeated (like
|
|
|
|
.Fl d ,
|
|
|
|
but each copy of the repeated line is written).
|
|
|
|
The optional
|
|
|
|
.Ar septype
|
|
|
|
argument controls how to separate groups of repeated lines in the output;
|
|
|
|
it must be one of the following values:
|
|
|
|
.Pp
|
|
|
|
.Bl -tag -compact -width separate
|
|
|
|
.It none
|
|
|
|
Do not separate groups of lines (this is the default).
|
|
|
|
.It prepend
|
|
|
|
Output an empty line before each group of lines.
|
|
|
|
.It separate
|
|
|
|
Output an empty line after each group of lines.
|
|
|
|
.El
|
2018-05-02 01:17:08 +00:00
|
|
|
.It Fl f Ar num , Fl -skip-fields Ar num
|
1994-05-27 12:33:43 +00:00
|
|
|
Ignore the first
|
2002-05-21 16:54:58 +00:00
|
|
|
.Ar num
|
2002-05-24 19:12:02 +00:00
|
|
|
fields in each input line when doing comparisons.
|
1994-05-27 12:33:43 +00:00
|
|
|
A field is a string of non-blank characters separated from adjacent fields
|
|
|
|
by blanks.
|
2004-07-02 22:22:35 +00:00
|
|
|
Field numbers are one based, i.e., the first field is field one.
|
2018-05-02 01:17:08 +00:00
|
|
|
.It Fl i , Fl -ignore-case
|
|
|
|
Case insensitive comparison of lines.
|
|
|
|
.It Fl s Ar chars , Fl -skip-chars Ar chars
|
1994-05-27 12:33:43 +00:00
|
|
|
Ignore the first
|
|
|
|
.Ar chars
|
|
|
|
characters in each input line when doing comparisons.
|
|
|
|
If specified in conjunction with the
|
2018-05-02 01:17:08 +00:00
|
|
|
.Fl f , Fl -unique
|
1994-05-27 12:33:43 +00:00
|
|
|
option, the first
|
|
|
|
.Ar chars
|
|
|
|
characters after the first
|
2002-05-21 16:54:58 +00:00
|
|
|
.Ar num
|
1994-05-27 12:33:43 +00:00
|
|
|
fields will be ignored.
|
2004-07-02 22:22:35 +00:00
|
|
|
Character numbers are one based, i.e., the first character is character one.
|
2018-05-02 01:17:08 +00:00
|
|
|
.It Fl u , Fl -unique
|
1999-03-15 02:57:29 +00:00
|
|
|
Only output lines that are not repeated in the input.
|
1994-05-27 12:33:43 +00:00
|
|
|
.\".It Fl Ns Ar n
|
|
|
|
.\"(Deprecated; replaced by
|
|
|
|
.\".Fl f ) .
|
|
|
|
.\"Ignore the first n
|
|
|
|
.\"fields on each input line when doing comparisons,
|
|
|
|
.\"where n is a number.
|
|
|
|
.\"A field is a string of non-blank
|
|
|
|
.\"characters separated from adjacent fields
|
|
|
|
.\"by blanks.
|
|
|
|
.\".It Cm \&\(pl Ns Ar n
|
|
|
|
.\"(Deprecated; replaced by
|
|
|
|
.\".Fl s ) .
|
|
|
|
.\"Ignore the first
|
|
|
|
.\".Ar m
|
|
|
|
.\"characters when doing comparisons, where
|
|
|
|
.\".Ar m
|
|
|
|
.\"is a
|
|
|
|
.\"number.
|
|
|
|
.El
|
2003-04-12 04:17:14 +00:00
|
|
|
.Sh ENVIRONMENT
|
|
|
|
The
|
|
|
|
.Ev LANG ,
|
|
|
|
.Ev LC_ALL ,
|
|
|
|
.Ev LC_COLLATE
|
|
|
|
and
|
|
|
|
.Ev LC_CTYPE
|
|
|
|
environment variables affect the execution of
|
|
|
|
.Nm
|
|
|
|
as described in
|
|
|
|
.Xr environ 7 .
|
2005-01-17 07:44:44 +00:00
|
|
|
.Sh EXIT STATUS
|
2001-08-15 09:09:47 +00:00
|
|
|
.Ex -std
|
2020-06-07 13:21:47 +00:00
|
|
|
.Sh EXAMPLES
|
|
|
|
Assuming a file named cities.txt with the following content:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
Madrid
|
|
|
|
Lisbon
|
|
|
|
Madrid
|
|
|
|
.Ed
|
|
|
|
.Pp
|
|
|
|
The following command reports three different lines since identical elements
|
|
|
|
are not adjacent:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
$ uniq -u cities.txt
|
|
|
|
Madrid
|
|
|
|
Lisbon
|
|
|
|
Madrid
|
|
|
|
.Ed
|
|
|
|
.Pp
|
|
|
|
Sort the file and count the number of identical lines:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
$ sort cities.txt | uniq -c
|
|
|
|
1 Lisbon
|
|
|
|
2 Madrid
|
|
|
|
.Ed
|
|
|
|
.Pp
|
|
|
|
Assuming the following content for the file cities.txt:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
madrid
|
|
|
|
Madrid
|
|
|
|
Lisbon
|
|
|
|
.Ed
|
|
|
|
.Pp
|
|
|
|
Show repeated lines ignoring case sensitiveness:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
$ uniq -d -i cities.txt
|
|
|
|
madrid
|
|
|
|
.Ed
|
|
|
|
.Pp
|
|
|
|
Same as above but showing the whole group of repeated lines:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
$ uniq -D -i cities.txt
|
|
|
|
madrid
|
|
|
|
Madrid
|
|
|
|
.Ed
|
|
|
|
.Pp
|
|
|
|
Report the number of identical lines ignoring the first character of every line:
|
|
|
|
.Bd -literal -offset indent
|
|
|
|
$ uniq -s 1 -c cities.txt
|
|
|
|
2 madrid
|
|
|
|
1 Lisbon
|
|
|
|
.Ed
|
1994-05-27 12:33:43 +00:00
|
|
|
.Sh COMPATIBILITY
|
|
|
|
The historic
|
|
|
|
.Cm \&\(pl Ns Ar number
|
|
|
|
and
|
|
|
|
.Fl Ns Ar number
|
|
|
|
options have been deprecated but are still supported in this implementation.
|
|
|
|
.Sh SEE ALSO
|
|
|
|
.Xr sort 1
|
|
|
|
.Sh STANDARDS
|
|
|
|
The
|
1997-08-21 06:51:10 +00:00
|
|
|
.Nm
|
2003-04-12 04:17:14 +00:00
|
|
|
utility conforms to
|
|
|
|
.St -p1003.1-2001
|
2004-07-02 22:22:35 +00:00
|
|
|
as amended by Cor.\& 1-2002.
|
2002-07-05 09:44:47 +00:00
|
|
|
.Sh HISTORY
|
|
|
|
A
|
|
|
|
.Nm
|
|
|
|
command appeared in
|
|
|
|
.At v3 .
|