I have a DataFrame like this:
timestamp asks
2022-01-01 00:00:00 [{'price':'0.99', 'size':'12311'},{'price':'0.991', 'size':'20013'}]
2022-01-01 01:00:00 [{'price':'0.99', 'size':'3122'},{'price':'0.991', 'size':'43221'}]
...
What I want to do is sum up the values of size for each timestamp to get the following DataFrame:
timestamp asks
2022-01-01 00:00:00 32324
2022-01-01 01:00:00 46343
...
i.e. 12311 + 20013 = 32324.
How can this be done (using pandas ideally)?
CodePudding user response:
df["asks"] = df["asks"].explode().str["size"].astype(int).groupby(level=0).sum()
- get each dictionary on a separate row
- get the "size" key's value for each of them
  - due to duck typing, .str[...] works on anything that supports __getitem__, which a dictionary does
- convert to integers and "unexplode" by summing over the original row index with groupby(level=0)
to get
>>> df
timestamp asks
0 2022-01-01 00:00:00 32324
1 2022-01-01 01:00:00 46343
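
For reference, here is the same chain split into its steps as a self-contained sketch. The sample frame is rebuilt from the two rows shown in the question, and the intermediate names (exploded, sizes, totals) are only illustrative.

```python
import pandas as pd

# Sample data mirroring the two rows in the question; sizes are strings there.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2022-01-01 00:00:00", "2022-01-01 01:00:00"]),
    "asks": [
        [{"price": "0.99", "size": "12311"}, {"price": "0.991", "size": "20013"}],
        [{"price": "0.99", "size": "3122"}, {"price": "0.991", "size": "43221"}],
    ],
})

# Step 1: one dictionary per row; the original index label is repeated.
exploded = df["asks"].explode()

# Step 2: look up "size" in each dictionary and cast the strings to integers.
sizes = exploded.str["size"].astype(int)

# Step 3: sum within each original index label to get back to one row per timestamp.
totals = sizes.groupby(level=0).sum()

# Assignment aligns on the index, so each total lands on its own row.
df["asks"] = totals
print(df)
```

Note that a row whose list is empty explodes to NaN, so astype(int) would fail on it; this sketch assumes every list has at least one entry.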