I have a dataframe. I want it to filter it and reduce certain values to a string. The dataframe looks like this
EXPECTED OUTPUT
42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,0C)
45.0(1A,1B,0C)
Code:
data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'], ['41.0', 'B'], ['42.0', 'B'],['45.0', 'B'],['45.0', 'A']]
df = pd.DataFrame(data, columns=['Number', 'Level'])
df = pd.crosstab(df['Number'], df['Level']).astype(str).reindex(df['Number'].unique())
s = df.add(df.columns.to_series()).agg(','.join, axis=1)
out = '\n'.join(f'{k}({v})' for k, v in s.items())
print (out)
MY OUTPUT
42.0(1A,1B)
41.0(1A,1B)
43.0(0A,1B)
45.0(1A,1B)
The solution should assume that there are three categories in the level column. The categories are A,B and C. But in the data not all the three categories maybe present. In that case it should fill in as 0 as in the below output for C category though the data didnt have C category.
I know that's a tough thing kindly help me out if possible.
Note: I tried my best to clearly explain this question. Please don't downvote me instead ask incase if u have any doubts
CodePudding user response:
Doing the reindex
with columns
df = pd.crosstab(df['Number'], df['Level']).astype(str).reindex(columns = list('ABC'),fill_value=0)
s = df.astype(str).add(df.columns.to_series()).agg(','.join, axis=1)
out = '\n'.join(f'{k}({v})' for k, v in s.items())
print (out)
41.0(1A,1B,0C)
42.0(1A,1B,0C)
43.0(0A,1B,0C)
45.0(1A,1B,0C)