I have data in one column of a pandas DataFrame:
1-2 3-4 8-9
4-5 6-2
3-1 4-2 1-4
I need to sum all the numbers in each row of that column.
I tried the logic below, but it does not work on a list of lists.
lst=[]
str='5-7 6-1 6-3'
str2 = str.split(' ')
for ele in str2:
    lst.append(ele.split('-'))
print(lst)
sum(lst)
Can anyone please tell me the simplest way to do this?
My expected result should be:
27
17
15
CodePudding user response:
I think we can split each row on spaces and hyphens, then sum the integers:
df.col.str.split(' |-').map(lambda x : sum(int(y) for y in x))
Out[149]:
0 27
1 17
2 15
Name: col, dtype: int64
Or
pd.DataFrame(df.col.str.split(' |-').tolist()).astype(float).sum(axis=1)
Out[156]:
0 27.0
1 17.0
2 15.0
dtype: float64
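As a side note on the loop in the question: sum(lst) raises a TypeError there because sum() starts from the integer 0 and cannot add a list to it, and the inner items are still strings. Below is a minimal plain-Python sketch of the same split idea, without pandas. The helper name sum_tokens is hypothetical and is not part of any library.

```python
import re

def sum_tokens(cell: str) -> int:
    """Sum every integer in a cell such as '1-2 3-4 8-9'."""
    # Split on either a space or a hyphen, mirroring str.split(' |-').
    tokens = re.split(r"[ -]", cell)
    # Skip empty tokens (e.g. from doubled spaces) before converting to int.
    return sum(int(tok) for tok in tokens if tok)

rows = ["1-2 3-4 8-9", "4-5 6-2", "3-1 4-2 1-4"]
print([sum_tokens(r) for r in rows])  # [27, 17, 15]
```

If the data lives in a DataFrame, the same function could presumably be applied with df.col.map(sum_tokens).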
CodePudding user response:
Using pd.Series.str.extractall:
df = pd.DataFrame({"col":['1-2 3-4 8-9', '4-5 6-2', '3-1 4-2 1-4']})
print(df["col"].str.extractall(r"(\d+)")[0].astype(int).groupby(level=0).sum())
0 27
1 17
2 15
Name: 0, dtype: int32
CodePudding user response:
Use .str.extractall and sum on the first index level (sum(level=0) was removed in pandas 2.0, so group on the level instead):
df['data'].str.extractall(r'(\d+)').astype(int).groupby(level=0).sum()
Output:
0
0 27
1 17
2 15
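The extractall answers depend on the pattern \d+ matching whole runs of digits. The following plain-Python sketch uses re.findall to show the difference from a bare \d once a value has more than one digit. The function names and the sample cell '10-2 3-4' are made up for illustration and do not come from the question.

```python
import re

def sum_numbers(cell: str) -> int:
    # \d+ captures each full number, so '10' stays one value.
    return sum(int(m) for m in re.findall(r"\d+", cell))

def sum_single_digits(cell: str) -> int:
    # \d captures one digit at a time, so '10' becomes 1 and 0.
    return sum(int(m) for m in re.findall(r"\d", cell))

cell = "10-2 3-4"
print(sum_numbers(cell))        # 19
print(sum_single_digits(cell))  # 10
```

For the single-digit data in the question both patterns give the same totals, so the choice only matters when larger numbers can appear.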