This may be a weird question, but can anyone think of a good way to print each row (or build a list of the rows) together with the corresponding column headers, keeping only the cells of the DataFrame that are not NaN?
Imagine a dataframe like this:
   col1  col2  col3  col4
1     1   NaN     2   NaN
2   NaN   NaN     1     2
3     2   NaN   NaN     1
Result should look something like this:
1 [col1: 1, col3: 2]
2 [col3: 1, col4: 2]
3 [col1: 2, col4: 1]
Thanks in advance!
CodePudding user response:
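You can transpose the DataFrame so that each original row becomes a column, then for each of those drop the NaNs and convert what is left to a dict (astype(int) assumes the remaining values are whole numbers):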
out = df.T.apply(lambda x: dict(x.dropna().astype(int)))
Output:
>>> out
1 {'col1': 1, 'col3': 2}
2 {'col3': 1, 'col4': 2}
3 {'col1': 2, 'col4': 1}
dtype: object
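For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It rebuilds the example frame from the question and uses apply with axis=1 in place of the transpose. The helper name non_nan_cells is made up for illustration, and the int() cast assumes the non-NaN values are whole numbers, as they are in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (row labels 1..3, NaN for empty cells).
df = pd.DataFrame(
    {
        "col1": [1, np.nan, 2],
        "col2": [np.nan, np.nan, np.nan],
        "col3": [2, 1, np.nan],
        "col4": [np.nan, 2, 1],
    },
    index=[1, 2, 3],
)


def non_nan_cells(row):
    """Return {column: value} for the non-NaN cells of one row (hypothetical helper)."""
    # int() assumes whole-number data; drop the cast to keep the floats.
    return {col: int(val) for col, val in row.dropna().items()}


# axis=1 passes each row to the function, so no transpose is needed.
out = df.apply(non_nan_cells, axis=1)
print(out)
```

A row that is entirely NaN would come back as an empty dict here rather than being dropped.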
CodePudding user response:
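Let us try stack, which reshapes the frame to one (row, column) entry per cell and has historically dropped the NaN cells by default (if your pandas version keeps them, chain .dropna() right after stack()):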
df.stack().reset_index(level=0).groupby('level_0')[0].agg(dict)
Output:
level_0
1 {'col1': 1.0, 'col3': 2.0}
2 {'col3': 1.0, 'col4': 2.0}
3 {'col1': 2.0, 'col4': 1.0}
Name: 0, dtype: object
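A small sketch of the same long-format idea, with two changes that are my own and not part of the answer above: the NaNs are dropped explicitly, so the result does not depend on what stack does by default, and a plain loop collects the cells instead of groupby. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from the question.
nan = np.nan
df = pd.DataFrame(
    [[1, nan, 2, nan], [nan, nan, 1, 2], [2, nan, nan, 1]],
    index=[1, 2, 3],
    columns=["col1", "col2", "col3", "col4"],
)

# stack() gives a Series indexed by (row label, column name);
# the explicit dropna() removes any NaN cells that are still present.
stacked = df.stack().dropna()

# Collect the surviving cells into one dict per row label.
out = {}
for (row, col), val in stacked.items():
    out.setdefault(row, {})[col] = val

print(out)
```

Note that a row containing only NaN leaves no entries after the dropna, so it would simply be missing from out.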
CodePudding user response:
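Combine agg(dict) with a list comprehension. The test v == v is False only for NaN (NaN never equals itself), so it filters out the missing cells:
d = [{k: v for k, v in x.items() if v == v} for x in df.agg(dict, 1)]
Output: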
[{'col1': 1.0, 'col3': 2.0},
{'col3': 1.0, 'col4': 2.0},
{'col1': 2.0, 'col4': 1.0}]
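If the goal is the exact printed layout from the question, something along these lines might do. It is only a sketch: it swaps the v == v trick for the more explicit pd.notna, goes through to_dict(orient="index") rather than agg, and uses the :g format to print 1.0 as 1, which assumes the values are numeric.

```python
import numpy as np
import pandas as pd

# Example frame from the question.
nan = np.nan
df = pd.DataFrame(
    [[1, nan, 2, nan], [nan, nan, 1, 2], [2, nan, nan, 1]],
    index=[1, 2, 3],
    columns=["col1", "col2", "col3", "col4"],
)

# to_dict(orient="index") gives {row label: {column: value}}.
records = df.to_dict(orient="index")

lines = []
for label, cells in records.items():
    # pd.notna is the explicit spelling of the "v == v" NaN check.
    kept = [f"{col}: {val:g}" for col, val in cells.items() if pd.notna(val)]
    lines.append(f"{label} [{', '.join(kept)}]")

print("\n".join(lines))
```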