I am trying to create a condition: if the column headers in my dataframe are equal to Unnamed: 0 VALUE VALUE.1 VALUE.2, then I want to drop the first two rows and rename the headers.
Unnamed: 0 VALUE VALUE.1 VALUE.2
Name Hobbies Dislikes Favorite Color
Ben NaN NaN NaN
Alex NaN Running Red
Mike NaN Cartoons Blue
Mark NaN Pizza Yellow
I know I can do
df = df.drop([0,1])
but I need it to be conditional.
I tried doing
if df.columns = {"Unnamed: 0", "VALUE", "VALUE.1", "VALUE.2"}:
df = df.drop([0,1])
df = df.rename(columns={"Unnamed: 0": "Name", "VALUE": "Hobbies", "VALUE.1": "Dislikes", "VALUE.2": "Favorite Color"})
but I'm running into a syntax error on the line where I try to create a condition with my column names. Any clue how to fix this?
CodePudding user response:
Try this:
import pandas as pd
cols = pd.Index(['Unnamed: 0', 'VALUE', 'VALUE.1', 'VALUE.2'])
if df.columns.equals(cols):
    # Use row 0 as the header, then keep only the rows after it.
    df = df.set_axis(df.iloc[0], axis=1).iloc[1:]
print(df)
>>>
Name Hobbies Dislikes Favorite Color
1 Ben NaN NaN NaN None
2 Alex NaN Running Red None
3 Mike NaN Cartoons Blue None
4 Mark NaN Pizza Yellow None
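For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt by hand from the question (only the first rows, so the values are illustrative), and the reset_index step is my own addition so the remaining rows are numbered from 0 again.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; the values are illustrative.
df = pd.DataFrame(
    {
        "Unnamed: 0": ["Name", "Ben", "Alex"],
        "VALUE": ["Hobbies", np.nan, np.nan],
        "VALUE.1": ["Dislikes", np.nan, "Running"],
        "VALUE.2": ["Favorite Color", np.nan, "Red"],
    }
)

expected = pd.Index(["Unnamed: 0", "VALUE", "VALUE.1", "VALUE.2"])

if df.columns.equals(expected):
    # Take the labels from row 0, then drop that row and renumber from 0.
    new_labels = df.iloc[0].tolist()
    df = df.iloc[1:].reset_index(drop=True)
    df.columns = new_labels

print(df)
```

Reading the labels from row 0 avoids hard-coding the new names, at the cost of trusting whatever that row contains.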
CodePudding user response:
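First, you only need to drop row 0, because the column header is not a row.
Second, the if statement needs == rather than =. Comparing df.columns with a list is element-wise and returns a boolean array, so add .all() to reduce it to a single True or False.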
import pandas as pd
df = pd.DataFrame(columns=["Unnamed: 0", "VALUE", "VALUE.1", "VALUE.2"])
df.loc[0] = ['Name', 'Hobbies', 'Dislikes', 'Favorite Color']
df.loc[1] = ['Ben', None, None, None]
print(df)
if (df.columns == ["Unnamed: 0", "VALUE", "VALUE.1", "VALUE.2"]).all():
    df = df.drop([0])
    df.columns = ['Name', 'Hobbies', 'Dislikes', 'Favorite Color']
print()
print(df)
output:
Unnamed: 0 VALUE VALUE.1 VALUE.2
0 Name Hobbies Dislikes Favorite Color
1 Ben None None None
Name Hobbies Dislikes Favorite Color
1 Ben None None None
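One caveat with the element-wise check: if the dataframe has a different number of columns than the list, pandas can raise a ValueError about mismatched lengths instead of returning False. A possible workaround, sketched below, is to compare plain Python lists. The helper name fix_headers and the constants are hypothetical, and the sketch assumes the default 0-based row index.

```python
import pandas as pd

# Hypothetical constants for illustration, taken from the question's headers.
EXPECTED = ["Unnamed: 0", "VALUE", "VALUE.1", "VALUE.2"]
NEW_NAMES = ["Name", "Hobbies", "Dislikes", "Favorite Color"]


def fix_headers(df: pd.DataFrame) -> pd.DataFrame:
    """Return df with the placeholder header replaced, or df unchanged."""
    # Plain list equality gives a single bool and is simply False
    # when the number of columns differs.
    if list(df.columns) != EXPECTED:
        return df
    out = df.drop(index=0)  # row 0 holds the real header text
    out.columns = NEW_NAMES
    return out


raw = pd.DataFrame(
    [NEW_NAMES, ["Ben", None, None, None]],
    columns=EXPECTED,
)
other = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

print(fix_headers(raw))
print(fix_headers(other))  # left as-is: headers do not match
```

Wrapping the check in a function also makes it easy to apply the same rule to several files in a loop.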
CodePudding user response:
Check the ':' at the end of the if line and the indentation of the lines under it.
Note that a single = inside the condition is also a syntax error; a comparison needs ==.
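To see which form of the if line Python accepts, you can parse both variants without running them. This is only a sketch using the standard ast module; the shortened column set and the helper name parses are made up for the example.

```python
import ast

# The question's condition (single =) next to the corrected one (==).
broken = 'if cols = {"Unnamed: 0", "VALUE"}:\n    pass\n'
fixed = 'if cols == {"Unnamed: 0", "VALUE"}:\n    pass\n'


def parses(source: str) -> bool:
    """Return True if source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(parses(broken))
print(parses(fixed))
```

If column order should not matter, comparing set(df.columns) with the set literal is one option; an ordered check needs a list or an Index instead.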