I have a dataframe that I want to filter, reducing certain values to a string. The dataframe looks like this:
Code:
import pandas as pd
data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'], ['43.0', 'C'], ['41.0', 'B'], ['42.0', 'B']]
df = pd.DataFrame(data, columns=['Number', 'Level'])
I tried this
df.groupby(['Number', 'Level']).size()
Got this output:
But I want to convert that output to a string like this:
42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,1C)
CodePudding user response:
Use crosstab with DataFrame.reindex to keep the original order of Number, then append the column names to the counts and join them per row. Finally, build the output string with a generator expression:
df = pd.crosstab(df['Number'], df['Level']).astype(str).reindex(df['Number'].unique())
s = df.add(df.columns.to_series()).agg(','.join, axis=1)
print (s)
Number
42.0 1A,1B,0C
41.0 1A,1B,0C
43.0 0A,1B,1C
dtype: object
out = '\n'.join(f'{k}({v})' for k, v in s.items())
print (out)
42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,1C)
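The steps above can also be written out as a self-contained script. This is only a sketch: the helper name summarize_levels is hypothetical (it is not part of pandas), and it formats each row with a plain loop instead of the string addition used above, assuming the same sample data as in the question.

```python
import pandas as pd

data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'],
        ['43.0', 'C'], ['41.0', 'B'], ['42.0', 'B']]
df = pd.DataFrame(data, columns=['Number', 'Level'])


def summarize_levels(frame):
    """Hypothetical helper: one 'Number(countLevel,...)' line per Number."""
    # Count occurrences of every Level for every Number (missing pairs become 0).
    counts = pd.crosstab(frame['Number'], frame['Level'])
    # crosstab sorts the index, so restore the order of first appearance.
    counts = counts.reindex(frame['Number'].unique())
    lines = []
    for number, row in counts.iterrows():
        # row maps each Level to its count for this Number.
        parts = ','.join(f'{count}{level}' for level, count in row.items())
        lines.append(f'{number}({parts})')
    return '\n'.join(lines)


out = summarize_levels(df)
print(out)
```

Because the counts stay numeric until the final formatting step, the intermediate table can still be filtered or summed before it is turned into text.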
A similar solution with a nested generator for the final output:
df = pd.crosstab(df['Number'], df['Level']).astype(str).reindex(df['Number'].unique())
cols = df.columns
out = '\n'.join(f"{i}({','.join(map(''.join, zip(x, cols)))})"
                for i, x in zip(df.index, df.to_numpy()))
print (out)
42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,1C)
CodePudding user response:
You can concatenate the required columns first.
df['res'] = df.index.astype(str) + df['Level']
print( df.groupby(['Number', 'res']).size() )
###Number res
###41.0 1A 1
### 4B 1
###42.0 0A 1
### 5B 1
###43.0 2B 1
### 3C 1
###dtype: int64
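Note that the output above pairs each row's index label with its Level, not a per-level count. As a hedged sketch of one way to get from a groupby to the format asked for in the question, the grouped sizes can be unstacked into a table with zeros for missing combinations. This assumes the sample data from the question; the variable names counts and out are only illustrative.

```python
import pandas as pd

data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'],
        ['43.0', 'C'], ['41.0', 'B'], ['42.0', 'B']]
df = pd.DataFrame(data, columns=['Number', 'Level'])

# Count rows per (Number, Level) pair, then pivot Level into columns;
# fill_value=0 supplies the zero counts that groupby alone leaves out.
counts = df.groupby(['Number', 'Level']).size().unstack(fill_value=0)

# groupby sorts the keys, so restore the order of first appearance.
counts = counts.reindex(df['Number'].unique())

# Format every row as Number(countLevel,countLevel,...).
out = '\n'.join(
    f"{number}({','.join(f'{n}{level}' for level, n in row.items())})"
    for number, row in counts.iterrows()
)
print(out)
```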