uniq

remove duplicate lines from a sorted file

uniq [-c] [-d |-u] [-i] [-f n] [-s n] [input [output]]

Write only one line from each run of successive identical lines in input (or standard input) to output (or standard output).
Only adjacent lines are compared; depending on the options, the unique or the duplicate lines are written.

Beware of trailing blanks on SOME lines.
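
A small sketch of the trailing-blank problem, using made-up sample lines. The sed cleanup step is one possible approach, not something uniq does itself:

```shell
# The second line ends in a space, so uniq treats "apple" and "apple " as different.
raw=$(printf 'apple\napple \npear\n' | uniq | wc -l)

# Strip trailing spaces and tabs first, then let uniq compare the cleaned lines.
clean=$(printf 'apple\napple \npear\n' | sed 's/[[:space:]]*$//' | uniq)

echo "lines kept without cleanup: $raw"
echo "$clean"
```

Without the cleanup all three lines are kept; with it, only apple and pear remain.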

-c
--count
count of occurrences is prepended to output lines.
-d
--repeated
duplicate lines only, one for each group
-D
--all-repeated[=none|prepend|separate]
all duplicate lines are output.
With prepend or separate, groups are delimited with blank lines; none (the default) adds no delimiter.
-f f
--skip-fields=f
skip the first f fields (counted from 1) of each line before comparing.
Fields are delimited by whitespace.
(a leading field might contain a line number or a time/date stamp)
Fields are skipped before chars.
-s c
--skip-chars=c
skip the first c characters of each line before comparing.
With -f, the first c characters after the first f fields are skipped.
-w n
--check-chars=n
compare no more than n characters of each line (after any skipped fields and chars).
Not on Mac.
-i
--ignore-case
ignore differences in case when comparing lines.
--help
--version
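
As an illustration of -f, here is a minimal sketch with a hypothetical log whose first field is a time stamp. The sample data and variable names are invented for this example:

```shell
# Hypothetical log lines: field 1 is a time stamp, the rest is the message.
log='09:00 disk full
09:05 disk full
09:07 login ok
09:09 disk full'

# Skip the first field so that only the message text is compared.
# The first line of each run of matching lines is the one that is kept.
dedup=$(printf '%s\n' "$log" | uniq -f 1)
echo "$dedup"
# 09:00 disk full
# 09:07 login ok
# 09:09 disk full
```

The last line survives because it is not adjacent to the earlier "disk full" lines.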

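It seems that the field delimiter MUST be whitespace; there is no option to choose another one. A possible workaround for comma-separated data (first comma only): sed "s/,/+ +/" | uniq -f 1 | sed "s/+ +/,/"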
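
One way to sketch the delimiter workaround is with tr instead of sed. This assumes the hypothetical data below contains no spaces of its own, since every space is turned back into a comma at the end:

```shell
# Hypothetical comma-separated records: id,name (no spaces in the data).
csv='1,apple
2,apple
3,pear'

# Turn commas into spaces so -f can skip the id field, then restore the commas.
out=$(printf '%s\n' "$csv" | tr ',' ' ' | uniq -f 1 | tr ' ' ',')
echo "$out"
# 1,apple
# 3,pear
```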

> cat 00
11
11
2
3
4
44
44
5 
> uniq 00
11
2
3
4
44
5 
> uniq -c 00
      2 11
      1 2
      1 3
      1 4
      2 44
      1 5 
> uniq -u 00
2
3
4
5 
> uniq -D 00
11
11
44
44 

tersified by Dennis German


If input is a single dash (-) or absent, standard input is read.
Only adjacent lines are compared, so it may be necessary to sort the file(s) first.
Beware of trailing blanks on SOME lines.
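
The effect of adjacency can be seen in a short sketch with made-up input:

```shell
# The two "b" lines are not adjacent, so a plain uniq keeps both.
unsorted=$(printf 'b\na\nb\n' | uniq)

# Sorting first brings the duplicates together, so uniq can collapse them.
sorted=$(printf 'b\na\nb\n' | sort | uniq)

echo "$unsorted"
# b
# a
# b
echo "$sorted"
# a
# b
```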

When characters or fields are skipped, or only duplicates are output, the first line of each group of matching lines is the one that is output.
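
A minimal sketch of this behavior with -s and -d. The three records are hypothetical, each starting with a 3-character tag:

```shell
# Hypothetical records: a 3-character tag followed by a value.
data='A1:red
B2:red
C3:blue'

# Skip the first 3 characters and print one line per group of repeats.
# Only the first line of the group (A1:red) appears; B2:red is dropped.
dups=$(printf '%s\n' "$data" | uniq -s 3 -d)
echo "$dups"
# A1:red
```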

Where -w is not available (e.g. on Mac), there seems to be no way to compare only the first n characters; see sort.
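
On a system without -w, one possible stand-in is a short awk filter. This is an assumption-laden sketch, not a uniq feature; the prefix length 3 and the sample lines are invented:

```shell
# Keep the first line of each run of adjacent lines whose first 3 characters match.
# awk stands in for uniq here: key is the prefix, prev is the previous line's prefix.
out=$(printf 'abc1\nabc2\nabd3\nabc4\n' |
  awk '{ key = substr($0, 1, 3) } NR == 1 || key != prev { print } { prev = key }')
echo "$out"
# abc1
# abd3
# abc4
```

As with uniq itself, only adjacent lines are compared, so abc4 is kept.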

Examples

> cat -n ffff

1  11
2  11
3  2
4  3
5  4
6  44
7  44
8  5 
> uniq ffff

11
2
3
4
44
5 
 
> uniq -d ffff
11
44

> uniq -u ffff

2
3
4
5 
 
> uniq -c ffff
   2 11
   1 2
   1 3
   1 4
   2 44
   1 5 

Environment

$LANG, $LC_ALL, $LC_COLLATE and $LC_CTYPE affect the execution of uniq as described in environ(7).