I have a very large pandas.Dataframe
and want to create a new Dataframe by selecting all columns where one row has a specific value.
A B C D E
Region Nord Süd West Nord Nord
value 2.3 1.2 4.2 0.5 1.3
value2 20 400 30 123 200
Now i want to create a new DataFrame with all columns where the row "Region" has the value "Nord".
How can it be done? The result should look like this:
A D E
Region Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200
Thanks in advance
CodePudding user response:
Use first DataFrame.loc
for select all rows (:
) by mask compred selected row Region
by another loc
:
df = df.loc[:, df.loc['Region'] == 'Nord']
print (df)
A D E
Region Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200
Better is crated MultiIndex
by first row with original columns, then is possible select by DataFrame.xs
:
df.columns = [df.columns, df.iloc[0]]
df = df.iloc[1:].rename_axis((None, None), axis=1)
print (df)
A B C D E
Nord Süd West Nord Nord
value 2.3 1.2 4.2 0.5 1.3
value2 20 400 30 123 200
print (df.xs('Nord', axis=1, level=1))
A D E
value 2.3 0.5 1.3
value2 20 123 200
print (df.xs('Nord', axis=1, level=1, drop_level=False))
A D E
Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200
CodePudding user response:
You can also use df.T
with Series.eq
:
In [312]: df.T[df.T.Region.eq('Nord')].T
Out[312]:
A D E
Region Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200