Using awk for that. Test files:
$ cat a.txt
one
two
three
four
four
$ cat b.txt
three
two
one
Awk:
$ awk '
NR==FNR {                   # process b.txt or the first file
    seen[$0]                # hash words to hash seen
    next                    # next word in b.txt
}                           # process a.txt or all files after the first
!($0 in seen)' b.txt a.txt  # if word is not hashed to seen, output it
Duplicates are output:
four
four
To avoid duplicates, add each newly encountered word in a.txt to the seen hash:
$ awk '
NR==FNR {
    seen[$0]
    next
}
!($0 in seen) {             # if word is not hashed to seen
    seen[$0]                # hash unseen a.txt words to seen to avoid duplicates
    print                   # and output it
}' b.txt a.txt
Output:
four
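The same logic can be compressed with the common `!seen[$0]++` idiom. The sketch below is not from the answer above; it recreates the test files in a scratch directory and assumes counters instead of bare keys, so that words from b.txt already test as true when a.txt is read:

```shell
# Recreate the test files from above in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' one two three four four > a.txt
printf '%s\n' three two one > b.txt

# First file (b.txt): count every word and skip to the next line.
# Remaining files (a.txt): !seen[$0]++ is true only the first time
# a word with a zero count (never seen anywhere before) shows up.
result=$(awk 'NR==FNR { seen[$0]++; next } !seen[$0]++' b.txt a.txt)
printf '%s\n' "$result"
```

Note that the first block has to increment (`seen[$0]++`) here. A bare `seen[$0]` only creates the key with an empty value, and `!seen[$0]++` would then still print the b.txt words.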
If the word lists are comma-separated, like:
$ cat a.txt
four,four,three,three,two,one
five,six
$ cat b.txt
one,two,three
you have to do a couple of extra laps (for loops):
$ awk -F, '                 # comma-separated input
NR==FNR {
    for(i=1;i<=NF;i++)      # loop all comma-separated fields
        seen[$i]
    next
}
{
    for(i=1;i<=NF;i++)
        if(!($i in seen)) {
            seen[$i]        # this time we buffer output (below):
            buffer=buffer (buffer==""?"":",") $i
        }
    if(buffer!="") {        # output non-empty buffers after each record in a.txt
        print buffer
        buffer=""
    }
}' b.txt a.txt
Output this time:
four
five,six
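If the per-record grouping of the output does not matter, one alternative (not from the answer above) is to turn the commas into newlines first and then reuse a line-based filter. This is only a rough sketch; the `b.words` file name is made up for the example, and the result is one word per line rather than `five,six`:

```shell
# Recreate the comma-separated test files in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'four,four,three,three,two,one' 'five,six' > a.txt
printf '%s\n' 'one,two,three' > b.txt

# Normalize b.txt to one word per line (b.words is a hypothetical name).
tr ',' '\n' < b.txt > b.words

# Normalize a.txt on the fly; "-" makes awk read it from standard input
# as the second file, after b.words.
result=$(tr ',' '\n' < a.txt |
    awk 'NR==FNR { seen[$0]++; next } !seen[$0]++' b.words -)
printf '%s\n' "$result"
```

This trades the buffering logic for two `tr` calls, at the cost of losing which a.txt record each word came from.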
Isn't diff a.txt b.txt enough?
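Plain diff compares the files line by line in order, so its output depends on line order and carries change markers rather than just the missing words. If a standard tool is preferred over awk, a rough sketch with sort and comm might look like this (the `a.sorted` and `b.sorted` names are made up for the example, and the words come out sorted rather than in a.txt order):

```shell
# Recreate the line-based test files in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' one two three four four > a.txt
printf '%s\n' three two one > b.txt

# comm needs sorted input; use the same collation for sort and comm.
export LC_ALL=C
sort -u a.txt > a.sorted    # -u also drops the duplicate "four"
sort -u b.txt > b.sorted

# -23 suppresses lines unique to b.sorted and lines common to both,
# leaving only the words that appear in a.txt but not in b.txt.
result=$(comm -23 a.sorted b.sorted)
printf '%s\n' "$result"
```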