Merge and sum of two dictionaries
I have the dictionary below, and I want to add it to another dictionary whose keys are not necessarily distinct, merging the results. Is there a built-in function for this, or will I need to write my own?
{
'6d6e7bf221ae24e07ab90bba4452267b05db7824cd3fd1ea94b2c9a8': 6,
'7c4a462a6ed4a3070b6d78d97c90ac230330603d24a58cafa79caf42': 7,
'9c37bdc9f4750dd7ee2b558d6c06400c921f4d74aabd02ed5b4ddb38': 9,
'd3abb28d5776aef6b728920b5d7ff86fa3a71521a06538d2ad59375a': 15,
'2ca9e1f9cbcd76a5ce1772f9b59995fd32cbcffa8a3b01b5c9c8afc2': 11
}
The number of elements in the dictionary is also unknown.
Where the merge considers two identical keys, the values of these keys should be summed instead of overwritten.
python dictionary
11
Please get your terminology straight; that's a dict, not a list. Also, what kind of result do you expect, and what have you tried?
– Fred Foo
May 5 '12 at 11:47
1
You might want to edit your question and provide better (and correct) information, or this question will likely be closed.
– Rik Poggi
May 5 '12 at 12:05
edited Jun 5 '18 at 11:57 by Clemens Tolboom
asked May 5 '12 at 11:45 by badc0re
9 Answers
You didn't say how exactly you want to merge, so take your pick:
x = {'both1': 1, 'both2': 2, 'only_x': 100}
y = {'both1': 10, 'both2': 20, 'only_y': 200}
# keys of x only
print({k: x.get(k, 0) + y.get(k, 0) for k in set(x)})
# keys present in both x and y (intersection)
print({k: x.get(k, 0) + y.get(k, 0) for k in set(x) & set(y)})
# keys present in either x or y (union)
print({k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y)})
Results:
{'both2': 22, 'only_x': 100, 'both1': 11}
{'both2': 22, 'both1': 11}
{'only_y': 200, 'both2': 22, 'both1': 11, 'only_x': 100}
how do we implement this if we have n number of dictionaries ?
– Tony Mathew
Sep 23 '18 at 18:57
I liked this approach. However, in my case, for the same dictionary values above, I am trying to take the difference, i.e. x - y: diff = { k: x.get(k, 0) - y.get(k, 0) for k in set(x) | set(y) } followed by print(diff). This gives me: {'only_y': -200, 'both2': -18, 'only_x': 100, 'both1': -9}. I am concerned about the only_y value above, as it changed to negative 200 instead of retaining 200. Even though you already answered the actual question, could you please suggest a better way of catching the negative values for the keys that are unique?
– Panchu
Sep 29 '18 at 22:29
@Panchu: how about sub = lambda a, b: a if b is None else b if a is None else a - b and then {k: sub(x.get(k), y.get(k)) for ... etc
– georg
Sep 30 '18 at 0:51
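To make georg's suggestion in the comment above concrete, here is a minimal sketch. The comment elides the end of the comprehension, so iterating over set(x) | set(y) is my assumption (it matches the union case from the answer), and I've written sub as a named function rather than a lambda purely for readability.

```python
x = {'both1': 1, 'both2': 2, 'only_x': 100}
y = {'both1': 10, 'both2': 20, 'only_y': 200}

def sub(a, b):
    """Subtract b from a, but pass a value through unchanged if the other is missing."""
    if b is None:
        return a
    if a is None:
        return b
    return a - b

# Assumption: iterate over the union of keys, as in the answer's third example.
diff = {k: sub(x.get(k), y.get(k)) for k in set(x) | set(y)}
print(diff)
```

With this, keys that exist in only one dictionary keep their original value, while shared keys are subtracted. Note that it relies on None never being a legitimate value in either dictionary.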
You can perform +, -, &, and | (intersection and union) on collections.Counter().
So we can do the following (Note: only positive count values will remain in the dictionary):
from collections import Counter
x = {'both1':1, 'both2':2, 'only_x': 100 }
y = {'both1':10, 'both2': 20, 'only_y':200 }
z = dict(Counter(x)+Counter(y))
print(z) # {'both2': 22, 'only_x': 100, 'both1': 11, 'only_y': 200}
To handle cases where the result may be zero or negative, use Counter.update() for addition and Counter.subtract() for subtraction; both work in place and keep zero and negative values:
x = {'both1':0, 'both2':2, 'only_x': 100 }
y = {'both1':0, 'both2': -20, 'only_y':200 }
xx = Counter(x)
yy = Counter(y)
xx.update(yy)
dict(xx) # {'both2': -18, 'only_x': 100, 'both1': 0, 'only_y': 200}
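The answer mentions Counter.subtract() without showing it, so here is a small sketch of the subtraction case using the same example data. The variable names diff, result and dropped are mine. It also shows, for comparison, what the - operator does with the same inputs.

```python
from collections import Counter

x = {'both1': 0, 'both2': 2, 'only_x': 100}
y = {'both1': 0, 'both2': -20, 'only_y': 200}

# subtract() works in place and keeps zero and negative results.
diff = Counter(x)
diff.subtract(Counter(y))
result = dict(diff)
print(result)

# The - operator, by contrast, drops every entry whose result is not positive.
dropped = dict(Counter(x) - Counter(y))
print(dropped)
```

So if zero or negative totals matter to you, stick to update() and subtract() rather than the + and - operators.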
1
What if 'both1': 0 in x and y and I want to have 'both1': 0 in z? With this solution there would be no 'both1' key in z.
– sergej
Jan 5 '17 at 9:16
@sergej That's interesting. Looking at the collections.Counter() link it appears that '+' only keeps positive value counts (> 0). However x.update(y) (where x,y are of type Counter) adds both objects to include 0 and negative value counts. I'll add this to the answer.
– Scott
Jan 5 '17 at 17:48
This is the most pythonic answer.
– BenP
Oct 16 '18 at 6:59
You could use defaultdict for this:
from collections import defaultdict
def dsum(*dicts):
    ret = defaultdict(int)
    for d in dicts:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)
x = {'both1':1, 'both2':2, 'only_x': 100 }
y = {'both1':10, 'both2': 20, 'only_y':200 }
print(dsum(x, y))
This produces
{'both1': 11, 'both2': 22, 'only_x': 100, 'only_y': 200}
Additional notes based on the answers of georg, NPE and Scott.
I was trying to perform this action on collections of 2 or more dictionaries and was interested in seeing the time it took for each. Because I wanted to do this on any number of dictionaries, I had to change some of the answers a bit. If anyone has better suggestions for them, feel free to edit.
Here's my test method. I've updated it recently to include tests with MUCH larger dictionaries:
Firstly I used the following data:
import random
x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}
y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}
z = {'xyz': 300, 'only_z': 300}
small_tests = [x, y, z]
# 200,000 random 8 letter keys
keys = [''.join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8)) for _ in range(200000)]
a, b, c = {}, {}, {}
# 50/50 chance of a value being assigned to each dictionary, some keys will be missed but meh
for key in keys:
    if random.getrandbits(1):
        a[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        b[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        c[key] = random.randint(0, 1000)
large_tests = [a, b, c]
print("a:", len(a), "b:", len(b), "c:", len(c))
#: a: 100069 b: 100385 c: 99989
Now each of the methods:
from collections import defaultdict, Counter
def georg_method(tests):
    return {k: sum(t.get(k, 0) for t in tests) for k in set.union(*[set(t) for t in tests])}

def georg_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return {k: tests[0].get(k, 0) + tests[1].get(k, 0) + tests[2].get(k, 0) for k in set.union(*[set(t) for t in tests])}

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

# Note: There is a bug with scott's method. See below for details.
def scott_method(tests):
    return dict(sum((Counter(t) for t in tests), Counter()))

def scott_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return dict(Counter(tests[0]) + Counter(tests[1]) + Counter(tests[2]))

methods = {"georg_method": georg_method, "georg_method_nosum": georg_method_nosum,
           "npe_method": npe_method,
           "scott_method": scott_method, "scott_method_nosum": scott_method_nosum}
I also wrote a quick function to find whatever differences there were between the results. Unfortunately, that's when I found the problem in Scott's method: if the values for a key total 0 (or less) across the dictionaries, that key won't be included in the result at all, because of how Counter() behaves when adding.
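The comparison function itself isn't shown in this answer, so the following is only a sketch of what such a check might look like, not the original. The names defaultdict_sum, counter_sum and diff_keys are hypothetical, and the tiny test data is made up to trigger the zero-total case.

```python
from collections import Counter, defaultdict

def defaultdict_sum(dicts):
    """Sum values per key with a defaultdict (keeps zero totals)."""
    total = defaultdict(int)
    for d in dicts:
        for k, v in d.items():
            total[k] += v
    return dict(total)

def counter_sum(dicts):
    """Sum values per key with Counter's + operator (drops non-positive totals)."""
    return dict(sum((Counter(d) for d in dicts), Counter()))

def diff_keys(expected, actual):
    """Return the keys that are missing from, or differ between, two results.

    Assumes None is never a legitimate value in either dict.
    """
    return {k for k in set(expected) | set(actual)
            if expected.get(k) != actual.get(k)}

# Hypothetical data: 'zero' sums to 0 across the two dictionaries.
tests = [{'a': 1, 'zero': 5}, {'a': 2, 'zero': -5}]
missing = diff_keys(defaultdict_sum(tests), counter_sum(tests))
print(missing)
```

Here the only key reported is the one whose total is 0, which is exactly the case the Counter-based version silently drops.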
Finally, the results:
Results: Small Tests
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10000 method(small_tests)
#: Method: npe_method
#: 10000 loops, best of 3: 5.16 µs per loop
#: Method: georg_method_nosum
#: 10000 loops, best of 3: 8.11 µs per loop
#: Method: georg_method
#: 10000 loops, best of 3: 11.8 µs per loop
#: Method: scott_method_nosum
#: 10000 loops, best of 3: 42.4 µs per loop
#: Method: scott_method
#: 10000 loops, best of 3: 65.3 µs per loop
Results: Large Tests
Naturally, couldn't run anywhere near as many loops
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10 method(large_tests)
#: Method: npe_method
#: 10 loops, best of 3: 227 ms per loop
#: Method: georg_method_nosum
#: 10 loops, best of 3: 327 ms per loop
#: Method: georg_method
#: 10 loops, best of 3: 455 ms per loop
#: Method: scott_method_nosum
#: 10 loops, best of 3: 510 ms per loop
#: Method: scott_method
#: 10 loops, best of 3: 600 ms per loop
Conclusion
╔═══════════════════════════╦═══════╦═════════════════════════════╗
║ ║ ║ Best of 3 Time Per Loop ║
║ Algorithm ║ By ╠══════════════╦══════════════╣
║ ║ ║ small_tests ║ large_tests ║
╠═══════════════════════════╬═══════╬══════════════╬══════════════╣
║ defaultdict sum ║ NPE ║ 5.16 µs ║ 227,000 µs ║
║ set unions without sum() ║ georg ║ 8.11 µs ║ 327,000 µs ║
║ set unions with sum() ║ ║ 11.8 µs ║ 455,000 µs ║
║ Counter() without sum() ║ Scott ║ 42.4 µs ║ 510,000 µs ║
║ Counter() with sum() ║ ║ 65.3 µs ║ 600,000 µs ║
╚═══════════════════════════╩═══════╩══════════════╩══════════════╝
Important. YMMV.
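Note that %timeit is an IPython/Jupyter magic and won't run in a plain Python script. If you want to reproduce this kind of measurement outside IPython, a rough equivalent with the standard timeit module might look like the sketch below. It only times the defaultdict method on the small test data, and the loop counts are my choice; your numbers will differ from the table above.

```python
import timeit
from collections import defaultdict

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}
y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}
z = {'xyz': 300, 'only_z': 300}
small_tests = [x, y, z]

loops = 10000
# repeat() returns one total time (in seconds) per repetition; take the best.
timings = timeit.repeat(lambda: npe_method(small_tests), number=loops, repeat=3)
best = min(timings) / loops
print("npe_method: best of 3: {:.2f} µs per loop".format(best * 1e6))
```

The lambda adds a small call overhead on every loop, so treat the absolute figures as approximate and compare methods only against each other.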
Another option is to use reduce. This lets you sum-merge an arbitrary collection of dictionaries:
from functools import reduce
collection = [
    {'a': 1, 'b': 1},
    {'a': 2, 'b': 2},
    {'a': 3, 'b': 3},
    {'a': 4, 'b': 4, 'c': 1},
    {'a': 5, 'b': 5, 'c': 1},
    {'a': 6, 'b': 6, 'c': 1},
    {'a': 7, 'b': 7},
    {'a': 8, 'b': 8},
    {'a': 9, 'b': 9},
]

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator
total = reduce(reducer, collection, {})
assert total['a'] == sum(d.get('a', 0) for d in collection)
assert total['b'] == sum(d.get('b', 0) for d in collection)
assert total['c'] == sum(d.get('c', 0) for d in collection)
print(total)
Execution:
{'a': 45, 'b': 45, 'c': 3}
Advantages:
- Simple, clear, Pythonic.
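- Schema-less, as long as all values are summable.
- O(n) time in the total number of key-value pairs; the only extra memory is the accumulator, which grows with the number of distinct keys rather than with the number of dictionaries.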
I suspect you're looking for dict's update method:
>>> d1 = {1:2,3:4}
>>> d2 = {5:6,7:8}
>>> d1.update(d2)
>>> d1
{1: 2, 3: 4, 5: 6, 7: 8}
I don't see how you can suspect that when the question does not say anything about merge behavior. update on a dictionary will overwrite values when keys are identical; maybe he's summing unique occurrences of a hash in which case using update is destructive.
– JosefAssad
May 5 '12 at 11:55
1
Well, I have already tried that, but the results aren't summed.
– badc0re
May 5 '12 at 11:57
@JosefAssad You are right.
– badc0re
May 5 '12 at 12:02
I took "merge" in the question to mean the same as update. "sum", which I assume means one ends up with duplicate keys, is something you can't do with a dict. A list of tuples e.g. [(1,2),(3,4)] would be a start for this. @DameJovanoski: you need to edit your question to explain what you really want to accomplish. My bad for guessing.
– zigg
May 5 '12 at 12:03
I am sorry for the mess up, i had a bad night yesterday :D
– badc0re
May 5 '12 at 12:13
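from functools import reduce  # reduce is not a builtin in Python 3

d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}
# Fold each (key, value) pair of d2 into a copy of d1, summing values on shared keys.
merged = reduce(
    lambda d, i: (
        d.update(((i[0], d.get(i[0], 0) + i[1]),)) or d
    ),
    d2.items(),  # iteritems() in Python 2
    d1.copy(),
)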
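There is also a pretty simple replacement for dict.update() that returns a new dict. Note that it overwrites rather than sums the values of shared keys, and it only works when the keys of d2 are strings: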
merged = dict(d1, **d2)
I liked this tip: merged = dict(d1, **d2)
– arannasousa
Jan 13 '17 at 23:34
class dict_merge(dict):
    def __add__(self, other):
        result = dict_merge({})
        for key in self.keys():
            if key in other.keys():
                result[key] = self[key] + other[key]
            else:
                result[key] = self[key]
        for key in other.keys():
            if key in self.keys():
                pass
            else:
                result[key] = other[key]
        return result
a = dict_merge({"a":2, "b":3, "d":4})
b = dict_merge({"a":1, "b":2})
c = dict_merge({"a":5, "b":6, "c":5})
d = dict_merge({"a":8, "b":6, "e":5})
print((a + b + c +d))
>>> {'a': 16, 'b': 17, 'd': 4, 'c': 5, 'e': 5}
That is operator overloading. Using __add__, we have defined how the + operator works for our dict_merge, which inherits from the built-in Python dict. You can make it more flexible by defining other operators in the same class in a similar way, e.g. * with __mul__ for multiplying, / with __truediv__ (__div__ in Python 2) for dividing, or even % with __mod__ for modulo, replacing the + in self[key] + other[key] with the corresponding operator, if you ever find yourself needing such merging.
I have only tested this as it is without other operators but I don't foresee a problem with other operators. Just learn by trying.
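As a rough illustration of the "other operators" idea above, here is one way it could be sketched without repeating the merge loop for each operator. This is not the code from the answer: the class name DictMerge and the helper _combine are hypothetical, and I'm assuming that a key present in only one operand should simply keep its value.

```python
import operator

class DictMerge(dict):
    """dict subclass whose + and * combine values key by key."""

    def _combine(self, other, op):
        # Start from a copy of self, then fold in the other dict's items.
        result = DictMerge(self)
        for key, value in other.items():
            # Assumption: keys found in only one operand keep their value.
            result[key] = op(self[key], value) if key in self else value
        return result

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

a = DictMerge({"a": 2, "b": 3})
b = DictMerge({"a": 1, "c": 5})
added = a + b
multiplied = a * b
print(added)
print(multiplied)
```

Passing functions from the operator module keeps each dunder method to a single line, so adding __sub__ or __mod__ is just one more small method.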
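If you want to create a new dict as a union of the two (like | on sets, with values from the second dict overwriting those of the first rather than being summed), use: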
>>> dict({'a': 1,'c': 2}, **{'c': 1})
{'a': 1, 'c': 1}
answered May 5 '12 at 12:38 by georg (the set-comprehension answer)
answered Jun 20 '15 at 4:08 by Scott, edited Jan 5 '17 at 18:01 (the collections.Counter answer)
answered May 5 '12 at 12:43 by NPE (the defaultdict answer)
add a comment |
add a comment |
Additional notes based on the answers of georg, NPE and Scott.
I was trying to perform this action on collections of 2 or more dictionaries and was interested in seeing the time it took for each. Because I wanted to do this on any number of dictionaries, I had to change some of the answers a bit. If anyone has better suggestions for them, feel free to edit.
Here's my test method. I've updated it recently to include tests with MUCH larger dictionaries:
Firstly I used the following data:
import random
x = {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100}
y = {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200}
z = {'xyz': 300, 'only_z': 300}
small_tests = [x, y, z]
# 200,000 random 8 letter keys
keys = [''.join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8)) for _ in range(200000)]
a, b, c = {}, {}, {}
# 50/50 chance of a value being assigned to each dictionary, some keys will be missed but meh
for key in keys:
    if random.getrandbits(1):
        a[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        b[key] = random.randint(0, 1000)
    if random.getrandbits(1):
        c[key] = random.randint(0, 1000)
large_tests = [a, b, c]
print("a:", len(a), "b:", len(b), "c:", len(c))
#: a: 100069 b: 100385 c: 99989
Now each of the methods:
from collections import defaultdict, Counter
def georg_method(tests):
    return {k: sum(t.get(k, 0) for t in tests) for k in set.union(*[set(t) for t in tests])}

def georg_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return {k: tests[0].get(k, 0) + tests[1].get(k, 0) + tests[2].get(k, 0) for k in set.union(*[set(t) for t in tests])}

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

# Note: There is a bug with scott's method. See below for details.
def scott_method(tests):
    return dict(sum((Counter(t) for t in tests), Counter()))

def scott_method_nosum(tests):
    # If you know you will have exactly 3 dicts
    return dict(Counter(tests[0]) + Counter(tests[1]) + Counter(tests[2]))

methods = {"georg_method": georg_method, "georg_method_nosum": georg_method_nosum,
           "npe_method": npe_method,
           "scott_method": scott_method, "scott_method_nosum": scott_method_nosum}
I also wrote a quick function to find whatever differences there were between the results. Unfortunately, that's when I found the problem in Scott's method: if the values for a key total 0 (or less), that key won't be included at all, because adding Counter objects keeps only positive counts.
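The Counter behaviour described here can be reproduced with a small sketch. The sample dictionaries are made up for illustration; the second half shows Counter.update() as one possible workaround, since it adds counts in place without discarding zero or negative totals.

```python
from collections import Counter

x = {'a': 5, 'b': 1}
y = {'a': -5, 'b': 2}

# Counter addition keeps only keys whose total is positive, so 'a' vanishes.
added = dict(Counter(x) + Counter(y))
print(added)  # {'b': 3}

# Counter.update() adds counts in place and keeps zero or negative totals.
merged = Counter()
for d in (x, y):
    merged.update(d)
print(dict(merged))  # {'a': 0, 'b': 3}
```

Whether dropping such keys is a bug or a feature depends on the data: for true counts it is often harmless, for signed values it silently loses keys.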
Finally, the results:
Results: Small Tests
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10000 method(small_tests)
#: Method: npe_method
#: 10000 loops, best of 3: 5.16 µs per loop
#: Method: georg_method_nosum
#: 10000 loops, best of 3: 8.11 µs per loop
#: Method: georg_method
#: 10000 loops, best of 3: 11.8 µs per loop
#: Method: scott_method_nosum
#: 10000 loops, best of 3: 42.4 µs per loop
#: Method: scott_method
#: 10000 loops, best of 3: 65.3 µs per loop
Results: Large Tests
Naturally, couldn't run anywhere near as many loops
for name, method in methods.items():
    print("Method:", name)
    %timeit -n10 method(large_tests)
#: Method: npe_method
#: 10 loops, best of 3: 227 ms per loop
#: Method: georg_method_nosum
#: 10 loops, best of 3: 327 ms per loop
#: Method: georg_method
#: 10 loops, best of 3: 455 ms per loop
#: Method: scott_method_nosum
#: 10 loops, best of 3: 510 ms per loop
#: Method: scott_method
#: 10 loops, best of 3: 600 ms per loop
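%timeit is an IPython magic and is not available in a plain Python script. As a rough sketch, a comparable "best of 3" measurement can be taken with the standard timeit module. Only npe_method and the small test data are repeated here so the block runs on its own, and the numbers it prints will differ from the ones above.

```python
import timeit
from collections import defaultdict

def npe_method(tests):
    ret = defaultdict(int)
    for d in tests:
        for k, v in d.items():
            ret[k] += v
    return dict(ret)

small_tests = [
    {'xy1': 1, 'xy2': 2, 'xyz': 3, 'only_x': 100},
    {'xy1': 10, 'xy2': 20, 'xyz': 30, 'only_y': 200},
    {'xyz': 300, 'only_z': 300},
]

# repeat=3 with number=10000 mirrors "%timeit -n10000 ..., best of 3"
timings = timeit.repeat(lambda: npe_method(small_tests), repeat=3, number=10000)
best_per_loop = min(timings) / 10000
print("npe_method: {:.2f} µs per loop".format(best_per_loop * 1e6))
```

The lambda adds a small call overhead to every loop, so this is best used for comparing methods against each other rather than for absolute figures.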
Conclusion
╔═══════════════════════════╦═══════╦═════════════════════════════╗
║                           ║       ║   Best of 3 Time Per Loop   ║
║         Algorithm         ║  By   ╠══════════════╦══════════════╣
║                           ║       ║  small_tests ║  large_tests ║
╠═══════════════════════════╬═══════╬══════════════╬══════════════╣
║ defaultdict sum           ║ NPE   ║      5.16 µs ║   227,000 µs ║
║ set unions without sum()  ║ georg ║      8.11 µs ║   327,000 µs ║
║ set unions with sum()     ║       ║      11.8 µs ║   455,000 µs ║
║ Counter() without sum()   ║ Scott ║      42.4 µs ║   510,000 µs ║
║ Counter() with sum()      ║       ║      65.3 µs ║   600,000 µs ║
╚═══════════════════════════╩═══════╩══════════════╩══════════════╝
Important. YMMV.
edited May 23 '17 at 12:18
Community♦
answered Feb 28 '16 at 23:47
SCB
Another option is to use reduce. This lets you sum-merge an arbitrary collection of dictionaries:
from functools import reduce
collection = [
    {'a': 1, 'b': 1},
    {'a': 2, 'b': 2},
    {'a': 3, 'b': 3},
    {'a': 4, 'b': 4, 'c': 1},
    {'a': 5, 'b': 5, 'c': 1},
    {'a': 6, 'b': 6, 'c': 1},
    {'a': 7, 'b': 7},
    {'a': 8, 'b': 8},
    {'a': 9, 'b': 9},
]

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator
total = reduce(reducer, collection, {})
assert total['a'] == sum(d.get('a', 0) for d in collection)
assert total['b'] == sum(d.get('b', 0) for d in collection)
assert total['c'] == sum(d.get('c', 0) for d in collection)
print(total)
Execution:
{'a': 45, 'b': 45, 'c': 3}
Advantages:
- Simple, clear, Pythonic.
- Schema-less, as long as all values are summable.
- O(n) time complexity in the total number of key-value pairs, and O(1) extra memory beyond the result dictionary.
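One detail worth spelling out: reducer mutates and returns its accumulator, so the {} initializer passed to reduce matters. The following is a small sketch, with illustrative variable names, of what happens with and without it.

```python
from functools import reduce

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator

first = {'a': 1}
collection = [first, {'a': 2, 'b': 5}]

# With a fresh {} as initializer, the input dictionaries are left untouched.
total = reduce(reducer, collection, {})
print(total)   # {'a': 3, 'b': 5}
print(first)   # {'a': 1}

# Without an initializer, the first dictionary itself becomes the accumulator
# and is modified in place.
aliased = reduce(reducer, collection)
print(aliased is first)  # True
print(first)             # {'a': 3, 'b': 5}
```

Passing the empty dictionary also means an empty collection returns {} instead of raising a TypeError.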
answered Sep 9 '17 at 7:59
Havok
I suspect you're looking for dict's update method:
>>> d1 = {1:2,3:4}
>>> d2 = {5:6,7:8}
>>> d1.update(d2)
>>> d1
{1: 2, 3: 4, 5: 6, 7: 8}
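Note that the example above uses disjoint keys. As a quick sketch with made-up overlapping keys, update replaces the value for a shared key instead of adding to it, so on its own it does not give a summed merge.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 10, 'c': 20}

overwritten = dict(d1)   # copy, so d1 itself is not modified
overwritten.update(d2)
print(overwritten)  # {'a': 1, 'b': 10, 'c': 20}

# A summed merge, for comparison
summed = dict(d1)
for key, value in d2.items():
    summed[key] = summed.get(key, 0) + value
print(summed)  # {'a': 1, 'b': 12, 'c': 20}
```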
I don't see how you can suspect that when the question does not say anything about merge behavior. update on a dictionary will overwrite values when keys are identical; maybe he's summing unique occurrences of a hash in which case using update is destructive.
– JosefAssad
May 5 '12 at 11:55
1
Well, I have already tried it like that, but the results don't sum.
– badc0re
May 5 '12 at 11:57
@JosefAssad You are right.
– badc0re
May 5 '12 at 12:02
I took "merge" in the question to mean the same as update. "sum" (which I assume means one ends up with duplicate keys) is something you can't do with a dict. A list of tuples, e.g. [(1,2),(3,4)], would be a start for this. @DameJovanoski: you need to edit your question to explain what you really want to accomplish. My bad for guessing.
– zigg
May 5 '12 at 12:03
I am sorry for the mix-up, I had a bad night yesterday :D
– badc0re
May 5 '12 at 12:13
answered May 5 '12 at 11:50
zigg
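from functools import reduce  # reduce is not a builtin in Python 3

d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}
# Fold each (key, value) pair of d2 into a copy of d1, summing values on shared keys.
merged = reduce(
    lambda d, i: (
        d.update(((i[0], d.get(i[0], 0) + i[1]),)) or d
    ),
    d2.items(),  # items() instead of the Python 2-only iteritems()
    d1.copy(),
)
There is also a pretty simple replacement for dict.update() (note that it overwrites values on shared keys rather than summing them):
merged = dict(d1, **d2)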
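Two caveats about dict(d1, **d2) are easy to miss, so here is a short sketch of them. The try/except and the flag name non_string_keys_ok are only there for illustration, and the TypeError is what CPython 3 raises.

```python
d1 = {'apples': 2, 'banana': 1}
d2 = {'apples': 3, 'banana': 2}

# Values from d2 replace those from d1; nothing is summed.
print(dict(d1, **d2))  # {'apples': 3, 'banana': 2}

# ** unpacking into dict() needs string keys; other key types raise TypeError.
try:
    dict({1: 'a'}, **{2: 'b'})
    non_string_keys_ok = True
except TypeError:
    non_string_keys_ok = False
print(non_string_keys_ok)  # False on Python 3

# The {**a, **b} display (Python 3.5+) accepts any hashable keys.
print({**{1: 'a'}, **{2: 'b'}})  # {1: 'a', 2: 'b'}
```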
I liked this tip: merged = dict(d1, **d2)
– arannasousa
Jan 13 '17 at 23:34
answered Dec 2 '13 at 19:37
renskiy
class dict_merge(dict):
    def __add__(self, other):
        result = dict_merge({})
        for key in self.keys():
            if key in other.keys():
                result[key] = self[key] + other[key]
            else:
                result[key] = self[key]
        for key in other.keys():
            if key in self.keys():
                pass
            else:
                result[key] = other[key]
        return result

a = dict_merge({"a": 2, "b": 3, "d": 4})
b = dict_merge({"a": 1, "b": 2})
c = dict_merge({"a": 5, "b": 6, "c": 5})
d = dict_merge({"a": 8, "b": 6, "e": 5})
print(a + b + c + d)
>>> {'a': 16, 'b': 17, 'd': 4, 'c': 5, 'e': 5}
That is operator overloading. Using __add__, we have defined how the + operator behaves for our dict_merge class, which inherits from the built-in Python dict. You can make it more flexible by defining other operators in the same class in a similar way, e.g. * with __mul__ for multiplying, / with __truediv__ (__div__ in Python 2) for dividing, or even % with __mod__ for modulo, replacing the + in self[key] + other[key] with the corresponding operator, if you ever find yourself needing such merging.
I have only tested this as it is, without other operators, but I don't foresee a problem with them. Just learn by trying.
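As a sketch of the generalisation described above, the per-key logic can be moved into one helper that takes the operator as an argument. The class name DictMerge and the helper _combine are hypothetical, and the choice to keep a value unchanged when its key exists in only one operand is an assumption carried over from the + version.

```python
import operator

class DictMerge(dict):
    """dict subclass whose operators combine values key by key (illustrative)."""

    def _combine(self, other, op):
        result = DictMerge(self)           # start from a copy of self
        for key, value in other.items():
            if key in result:
                result[key] = op(result[key], value)
            else:
                result[key] = value        # key only in other: keep as is
        return result

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    def __truediv__(self, other):          # Python 3 name for the / operator
        return self._combine(other, operator.truediv)

a = DictMerge({"a": 2, "b": 3, "d": 4})
b = DictMerge({"a": 1, "b": 2})
print(a + b)  # {'a': 3, 'b': 5, 'd': 4}
print(a * b)  # {'a': 2, 'b': 6, 'd': 4}
print(a / b)  # {'a': 2.0, 'b': 1.5, 'd': 4}
```

For operators that are not commutative, such as / and %, the order of the operands matters, so a / b and b / a give different results for shared keys.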
answered Apr 25 '17 at 3:01
John Mutuma
If you want to create a new dict, union-style (like the | operator), use:
>>> dict({'a': 1,'c': 2}, **{'c': 1})
{'a': 1, 'c': 1}
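The dict(d1, **d2) form keeps the right-hand value for a shared key, and the ** unpacking only accepts string keys. If the values should be added instead, as the question asks, a minimal sketch could look like this; merge_sum is a hypothetical helper name and the sketch assumes numeric values.

```python
def merge_sum(d1, d2):
    """Return a new dict; values of keys present in both inputs are added."""
    merged = dict(d1)                      # copy so the inputs stay untouched
    for key, value in d2.items():
        merged[key] = merged.get(key, 0) + value
    return merged


d1 = {'a': 1, 'c': 2}
d2 = {'c': 1}
print(dict(d1, **d2))     # {'a': 1, 'c': 1}  (right-hand value wins)
print(merge_sum(d1, d2))  # {'a': 1, 'c': 3}  (values summed)
```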
answered Jan 22 '16 at 20:33
Bartosz Foder