I have a pandas dataframe in the below format:
               0      1          2          3
A.pkl  [121,122]  [123]  [124,125]  [126,127]
There may be more columns than shown here. I would like to merge the values from all the columns into a single column.
Result dataframe:
       values
A.pkl  [121,122,123,124,125,126,127]
I use the below code to generate the first part:
df = pd.DataFrame({
    g: pd.read_pickle(f'{g}')['values'].tolist()
    for g in groups
}).T
I tried using itertools.chain but it doesn't seem to do the trick.
Any suggestions would be appreciated.
Input dataframe:
df = pd.DataFrame({'name': ['aa.pkl'],
                   '0': [["001A000001", "003A0025"]],
                   '1': [["003B000001", "003C000001"]],
                   '2': [["003D000001", "003E000001"]],
                   '3': [["003F000001", "003G000001"]]})
The above dataframe is generated by reading the pickle file.
CodePudding user response:
Actually, itertools.chain is one way to go, but you have to do it properly:
from itertools import chain
df.apply(lambda x: list(chain(*x)), axis=1)
output:
A.pkl [121, 122, 123, 124, 125, 126, 127]
dtype: object
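To get a dataframe with a single values column rather than a Series, the result can be wrapped with to_frame. The sketch below assumes the string sample from the question, where the file name sits in an ordinary 'name' column; it moves that column into the index first, since chain would otherwise also iterate over the characters of the file name. The variable names lists_only and result are only illustrative.

```python
from itertools import chain

import pandas as pd

# Sample frame from the question; 'name' is an ordinary column here.
df = pd.DataFrame({'name': ['aa.pkl'],
                   '0': [["001A000001", "003A0025"]],
                   '1': [["003B000001", "003C000001"]],
                   '2': [["003D000001", "003E000001"]],
                   '3': [["003F000001", "003G000001"]]})

# Move the file name into the index so each row holds only list-valued cells.
lists_only = df.set_index('name')

# Flatten the lists of each row into one list, then name the resulting column.
result = (
    lists_only
    .apply(lambda row: list(chain.from_iterable(row)), axis=1)
    .to_frame('values')
)

print(result)
```

If the frame was built as in the question (a dict of lists followed by .T), the file names are already the index and the set_index step can be skipped.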
As @QuangHoang suggested, you can also use the df.sum(axis=1) trick, but be careful: this only works with lists. If for some reason you have numpy arrays, this will perform the sum per position ([494, 497]).
Input:
df = pd.DataFrame({'0': [[121, 122]],
                   '1': [[123]],
                   '2': [[124, 125]],
                   '3': [[126, 127]]})
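The caveat about numpy arrays comes down to what + means for each type. The sketch below shows this outside pandas, on the assumption that a row-wise sum on object cells boils down to repeated +; it illustrates the difference and is not a description of pandas internals. The names as_lists, as_arrays, from_lists, from_arrays and flattened are made up for the example.

```python
from functools import reduce
from operator import add

import numpy as np

# The four cells of the example row, once as lists and once as numpy arrays.
as_lists = [[121, 122], [123], [124, 125], [126, 127]]
as_arrays = [np.array(cell) for cell in as_lists]

# '+' on lists concatenates them.
from_lists = reduce(add, as_lists)

# '+' on arrays adds element-wise, broadcasting the length-1 array.
from_arrays = reduce(add, as_arrays)

# Concatenating explicitly flattens arrays without relying on '+'.
flattened = np.concatenate(as_arrays).tolist()

print(from_lists)   # [121, 122, 123, 124, 125, 126, 127]
print(from_arrays)  # [494 497]
print(flattened)    # [121, 122, 123, 124, 125, 126, 127]
```

The itertools.chain approach iterates over the elements instead of adding the containers, so it flattens lists and one-dimensional arrays alike.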