Home > database >  Pandas: Select columns based on row values
Pandas: Select columns based on row values

Time:11-11

I have a very large pandas.Dataframe and want to create a new Dataframe by selecting all columns where one row has a specific value.

        A     B     C     D    E 
Region  Nord  Süd   West  Nord Nord
value   2.3   1.2   4.2   0.5  1.3
value2  20    400   30    123  200

Now i want to create a new DataFrame with all columns where the row "Region" has the value "Nord".

How can it be done? The result should look like this:

        A     D    E
Region  Nord  Nord Nord
value   2.3   0.5  1.3
value2  20    123  200

Thanks in advance

CodePudding user response:

Use first DataFrame.loc for select all rows (:) by mask compred selected row Region by another loc:

df = df.loc[:, df.loc['Region'] == 'Nord']
print (df)
           A     D     E
Region  Nord  Nord  Nord
value    2.3   0.5   1.3
value2    20   123   200

Better is crated MultiIndex by first row with original columns, then is possible select by DataFrame.xs:

df.columns = [df.columns, df.iloc[0]]
df = df.iloc[1:].rename_axis((None, None), axis=1)
print (df)
          A    B    C    D    E
       Nord  Süd West Nord Nord
value   2.3  1.2  4.2  0.5  1.3
value2   20  400   30  123  200

print (df.xs('Nord', axis=1, level=1))
          A    D    E
value   2.3  0.5  1.3
value2   20  123  200


print (df.xs('Nord', axis=1, level=1, drop_level=False))
          A    D    E
       Nord Nord Nord
value   2.3  0.5  1.3
value2   20  123  200

CodePudding user response:

You can also use df.T with Series.eq:

In [312]: df.T[df.T.Region.eq('Nord')].T
Out[312]: 
           A     D     E
Region  Nord  Nord  Nord
value    2.3   0.5   1.3
value2    20   123   200
  • Related