Jawaban:
Gunakan set
jika Anda tidak peduli tentang pesanan barang atau pengulangan. Gunakan pemahaman daftar jika Anda melakukannya:
>>> def diff(first, second):
second = set(second)
return [item for item in first if item not in second]
>>> diff(A, B)
[1, 3, 4]
>>> diff(B, A)
[5]
>>>
set
ke B tidak berbahaya, tetapi menerapkannya ke A
dan menggunakan hasilnya bukan yang asli A
tidak.
Jika pesanan tidak menjadi masalah, Anda cukup menghitung perbedaan yang ditetapkan:
>>> set([1,2,3,4]) - set([2,5])
set([1, 4, 3])
>>> set([2,5]) - set([1,2,3,4])
set([5])
Anda bisa melakukan
list(set(A)-set(B))
dan
list(set(B)-set(A))
Satu liner:
diff = lambda l1,l2: [x for x in l1 if x not in l2]
diff(A,B)
diff(B,A)
Atau:
diff = lambda l1,l2: filter(lambda x: x not in l2, l1)
diff(A,B)
diff(B,A)
Contoh-contoh di atas meremehkan masalah penghitungan perbedaan. Dengan asumsi penyortiran atau de-duplikasi pasti membuatnya lebih mudah untuk menghitung perbedaan, tetapi jika perbandingan Anda tidak mampu asumsi itu maka Anda akan memerlukan implementasi algoritma diff yang tidak sepele. Lihat difflib di pustaka standar python.
from difflib import SequenceMatcher
squeeze=SequenceMatcher( None, A, B )
print "A - B = [%s]"%( reduce( lambda p,q: p+q,
map( lambda t: squeeze.a[t[1]:t[2]],
filter(lambda x:x[0]!='equal',
squeeze.get_opcodes() ) ) ) )
A - B = [[1, 3, 4]]
print
telah berubah dari perintah ke fungsi, dan reduce
, filter
dan map
telah dinyatakan unpythonic. (Dan saya pikir Guido mungkin benar - saya juga tidak mengerti apa yang reduce
dilakukan.)
Python 2.7.3 (default, 27 Feb 2014, 19:58:35) - IPython 1.1.0 - timeit: (github gist )
def diff(a, b):
b = set(b)
return [aa for aa in a if aa not in b]
def set_diff(a, b):
return list(set(a) - set(b))
diff_lamb_hension = lambda l1,l2: [x for x in l1 if x not in l2]
diff_lamb_filter = lambda l1,l2: filter(lambda x: x not in l2, l1)
from difflib import SequenceMatcher
def squeezer(a, b):
squeeze = SequenceMatcher(None, a, b)
return reduce(lambda p,q: p+q, map(
lambda t: squeeze.a[t[1]:t[2]],
filter(lambda x:x[0]!='equal',
squeeze.get_opcodes())))
Hasil:
# Small
a = range(10)
b = range(10/2)
timeit[diff(a, b)]
100000 loops, best of 3: 1.97 µs per loop
timeit[set_diff(a, b)]
100000 loops, best of 3: 2.71 µs per loop
timeit[diff_lamb_hension(a, b)]
100000 loops, best of 3: 2.1 µs per loop
timeit[diff_lamb_filter(a, b)]
100000 loops, best of 3: 3.58 µs per loop
timeit[squeezer(a, b)]
10000 loops, best of 3: 36 µs per loop
# Medium
a = range(10**4)
b = range(10**4/2)
timeit[diff(a, b)]
1000 loops, best of 3: 1.17 ms per loop
timeit[set_diff(a, b)]
1000 loops, best of 3: 1.27 ms per loop
timeit[diff_lamb_hension(a, b)]
1 loops, best of 3: 736 ms per loop
timeit[diff_lamb_filter(a, b)]
1 loops, best of 3: 732 ms per loop
timeit[squeezer(a, b)]
100 loops, best of 3: 12.8 ms per loop
# Big
a = xrange(10**7)
b = xrange(10**7/2)
timeit[diff(a, b)]
1 loops, best of 3: 1.74 s per loop
timeit[set_diff(a, b)]
1 loops, best of 3: 2.57 s per loop
timeit[diff_lamb_filter(a, b)]
# too long to wait for
timeit[diff_lamb_filter(a, b)]
# too long to wait for
timeit[diff_lamb_filter(a, b)]
# TypeError: sequence index must be integer, not 'slice'
@ roman-bodnarchuk daftar fungsi dif def def (a, b) tampaknya lebih cepat.
A = [1,2,3,4]
B = [2,5]
#A - B
x = list(set(A) - set(B))
#B - A
y = list(set(B) - set(A))
print x
print y
Anda ingin menggunakan set
bukan list
.
Jika Anda ingin perbedaannya secara mendalam masuk ke item daftar Anda, saya telah menulis paket untuk python: https://github.com/erasmose/deepdiff
Instal dari PyPi:
pip install deepdiff
Jika Anda Python3, Anda juga harus menginstal:
pip install future six
>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function
Objek yang sama kembali kosong
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> ddiff = DeepDiff(t1, t2)
>>> print (ddiff.changes)
{}
Jenis item telah berubah
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> ddiff = DeepDiff(t1, t2)
>>> print (ddiff.changes)
{'type_changes': ["root[2]: 2=<type 'int'> vs. 2=<type 'str'>"]}
Nilai suatu barang telah berubah
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> ddiff = DeepDiff(t1, t2)
>>> print (ddiff.changes)
{'values_changed': ['root[2]: 2 ====>> 4']}
Item ditambahkan dan / atau dihapus
>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes)
{'dic_item_added': ['root[5, 6]'],
'dic_item_removed': ['root[4]'],
'values_changed': ['root[2]: 2 ====>> 4']}
Perbedaan string
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes, indent = 2)
{ 'values_changed': [ 'root[2]: 2 ====>> 4',
"root[4]['b']:\n--- \n+++ \n@@ -1 +1 @@\n-world\n+world!"]}
>>>
>>> print (ddiff.changes['values_changed'][1])
root[4]['b']:
---
+++
@@ -1 +1 @@
-world
+world!
Perbedaan string 2
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes, indent = 2)
{ 'values_changed': [ "root[4]['b']:\n--- \n+++ \n@@ -1,5 +1,4 @@\n-world!\n-Goodbye!\n+world\n 1\n 2\n End"]}
>>>
>>> print (ddiff.changes['values_changed'][0])
root[4]['b']:
---
+++
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
1
2
End
Ketik perubahan
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes, indent = 2)
{ 'type_changes': [ "root[4]['b']: [1, 2, 3]=<type 'list'> vs. world\n\n\nEnd=<type 'str'>"]}
Daftar perbedaan
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes, indent = 2)
{ 'list_removed': ["root[4]['b']: [3]"]}
Daftar perbedaan 2: Perhatikan bahwa JANGAN memperhitungkan pesanan
>>> # Note that it DOES NOT take order into account
... t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes, indent = 2)
{ }
Daftar yang berisi kamus:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff.changes, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
'values_changed': ["root[4]['b'][2][1]: 1 ====>> 3"]}
cara paling sederhana,
gunakan set (). perbedaan (set ())
list_a = [1,2,3]
list_b = [2,3]
print set(list_a).difference(set(list_b))
jawabannya adalah set([1])
Dalam kasus daftar kamus , solusi pemahaman daftar lengkap berfungsi saat set
solusi meningkat
TypeError: unhashable type: 'dict'
def diff(a, b):
return [aa for aa in a if aa not in b]
d1 = {"a":1, "b":1}
d2 = {"a":2, "b":2}
d3 = {"a":3, "b":3}
>>> diff([d1, d2, d3], [d2, d3])
[{'a': 1, 'b': 1}]
>>> diff([d1, d2, d3], [d1])
[{'a': 2, 'b': 2}, {'a': 3, 'b': 3}]
Ketika melihat TimeComplexity of In-operator, dalam kasus terburuk berfungsi dengan O (n). Bahkan untuk Set.
Jadi ketika membandingkan dua array kita akan memiliki TimeComplexity dari O (n) dalam kasus terbaik dan O (n ^ 2) dalam kasus terburuk.
Solusi alternatif (tapi sayangnya lebih kompleks), yang bekerja dengan O (n) dalam kasus terbaik dan terburuk adalah yang ini:
# Compares the difference of list a and b
# uses a callback function to compare items
def diff(a, b, callback):
a_missing_in_b = []
ai = 0
bi = 0
a = sorted(a, callback)
b = sorted(b, callback)
while (ai < len(a)) and (bi < len(b)):
cmp = callback(a[ai], b[bi])
if cmp < 0:
a_missing_in_b.append(a[ai])
ai += 1
elif cmp > 0:
# Item b is missing in a
bi += 1
else:
# a and b intersecting on this item
ai += 1
bi += 1
# if a and b are not of same length, we need to add the remaining items
for ai in xrange(ai, len(a)):
a_missing_in_b.append(a[ai])
return a_missing_in_b
misalnya
>>> a=[1,2,3]
>>> b=[2,4,6]
>>> diff(a, b, cmp)
[1, 3]
set(b)
untuk menggunakan untuk memastikan algoritmenya adalah O (nlogn) dan bukan Theta (n ^ 2)