Complex dataframe filtering in Python pandas

Time:11-10

I have a dataframe that I want to filter, reducing the counts for each value to a string. The dataframe looks like this:

Code:

import pandas as pd

data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'], ['43.0', 'C'], ['41.0', 'B'], ['42.0', 'B']]
df = pd.DataFrame(data, columns=['Number', 'Level'])

I tried this:

df.groupby(['Number', 'Level']).size()

Got this output:

Number  Level
41.0    A        1
        B        1
42.0    A        1
        B        1
43.0    B        1
        C        1
dtype: int64

But I want to convert that output into strings like this, with a zero count for levels that do not occur:

42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,1C)

CodePudding user response:

Use crosstab with DataFrame.reindex to restore the original order, then append the column names to the counts and join each row with commas. Finally, build the output string with a generator expression:

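# Count each (Number, Level) pair, cast the counts to strings, restore first-appearance order
counts = pd.crosstab(df['Number'], df['Level']).astype(str).reindex(df['Number'].unique())
# Append each column name to its counts ('1' + 'A' -> '1A'), then join each row with commas
s = counts.add(counts.columns.to_series()).agg(','.join, axis=1)
print(s)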
Number
42.0    1A,1B,0C
41.0    1A,1B,0C
43.0    0A,1B,1C
dtype: object

out = '\n'.join(f'{k}({v})' for k, v in s.items())
print(out)
42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,1C)
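
To see why the reindex step is there, it may help to look at the intermediate table. This is a minimal sketch, assuming the sample data from the question; the names `counts` and `ordered` are only illustrative:

```python
import pandas as pd

data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'],
        ['43.0', 'C'], ['41.0', 'B'], ['42.0', 'B']]
df = pd.DataFrame(data, columns=['Number', 'Level'])

# crosstab counts every (Number, Level) pair and puts 0 where a pair never occurs;
# the resulting index is sorted, not in order of first appearance
counts = pd.crosstab(df['Number'], df['Level'])
print(counts.index.tolist())   # ['41.0', '42.0', '43.0']

# reindex with the unique Numbers restores the order they first appear in df
ordered = counts.reindex(df['Number'].unique())
print(ordered.index.tolist())  # ['42.0', '41.0', '43.0']
```

Without the reindex, the final strings would come out in sorted order (41.0 first) rather than in the order requested in the question.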

A similar solution that uses a nested generator to build the final output:

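counts = pd.crosstab(df['Number'], df['Level']).astype(str).reindex(df['Number'].unique())

cols = counts.columns
# Glue each count to its column name ('1' + 'A'), join the pairs with commas, wrap as Number(...)
out = '\n'.join(f"{i}({','.join(map(''.join, zip(x, cols)))})"
                for i, x in zip(counts.index, counts.to_numpy()))
print(out)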

42.0(1A,1B,0C)
41.0(1A,1B,0C)
43.0(0A,1B,1C)
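
If you would rather start from the groupby attempt in the question, the same table can likely be built with unstack instead of crosstab. This is only a sketch under the assumption that the data matches the sample above; `table` and `format_row` are hypothetical names introduced here:

```python
import pandas as pd

data = [['42.0', 'A'], ['41.0', 'A'], ['43.0', 'B'],
        ['43.0', 'C'], ['41.0', 'B'], ['42.0', 'B']]
df = pd.DataFrame(data, columns=['Number', 'Level'])

# size() counts each (Number, Level) pair; unstack pivots Level into columns,
# and fill_value=0 supplies the pairs that never occur (e.g. 42.0 with C)
table = (df.groupby(['Number', 'Level']).size()
           .unstack(fill_value=0)
           .reindex(df['Number'].unique()))

def format_row(number, row):
    # row is a Series indexed by Level; build e.g. "42.0(1A,1B,0C)"
    parts = ','.join(f'{count}{level}' for level, count in row.items())
    return f'{number}({parts})'

out = '\n'.join(format_row(number, row) for number, row in table.iterrows())
print(out)
```

The counts stay as integers here and are only turned into text inside the f-string, so the table remains usable for further numeric work.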

CodePudding user response:

You can concatenate the required columns first.

# Prefix each Level with its row index, e.g. row 0 with Level 'A' becomes '0A'
df['res'] = df.index.astype(str) + df['Level']
print(df.groupby(['Number', 'res']).size())

###Number  res
###41.0    1A     1
###        4B     1
###42.0    0A     1
###        5B     1
###43.0    2B     1
###        3C     1
###dtype: int64