I have a dataframe like this:
IsCool IsTall IsHappy Target
0 1 0 1
1 1 0 0
0 1 0 0
1 0 1 1
I want to anonymize all of the column names except for Target. How can I do this?
Expected output:
col1 col2 col3 Target
0 1 0 1
1 1 0 0
0 1 0 0
1 0 1 1
CodePudding user response:
What about:
# map each column to "colN" (1-based), leaving "Target" unchanged
cols = {
    col: f"col{i + 1}" if col != "Target" else col
    for i, col in enumerate(df.columns)
}
out = df.rename(columns=cols)
col1 col2 col3 Target
0 0 1 0 1
1 1 1 0 0
2 0 1 0 0
3 1 0 1 1
You can also do it in place:
cols = [
    f"col{i + 1}" if col != "Target" else col
    for i, col in enumerate(df.columns)
]
df.columns = cols
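Both snippets above number the columns by their position in df.columns, so the numbering would skip a value if Target were not the last column. A minimal sketch of one way around that, assuming a hypothetical frame df_mid where Target sits in the middle (df_mid, to_rename and mapping are illustrative names, not part of the question):

```python
import pandas as pd

# Hypothetical frame with "Target" in the middle rather than at the end.
df_mid = pd.DataFrame({"IsCool": [0, 1], "Target": [1, 0], "IsTall": [1, 1]})

# Enumerate only the columns that get renamed, so numbering stays contiguous.
to_rename = [col for col in df_mid.columns if col != "Target"]
mapping = {col: f"col{i}" for i, col in enumerate(to_rename, start=1)}

out_mid = df_mid.rename(columns=mapping)
print(list(out_mid.columns))  # ['col1', 'Target', 'col2']
```

Columns missing from the mapping are left alone by rename, so Target keeps its name and position.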
CodePudding user response:
You can use:
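# get all columns except the excluded ones (here "Target");
# sort=False keeps the original column order (the default sorts the labels)
cols = df.columns.difference(['Target'], sort=False)
# build the new names: col1, col2, ...
names = 'col' + pd.Series(range(1, len(cols) + 1), index=cols).astype(str)
out = df.rename(columns=names)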
Output:
col1 col2 col3 Target
0 0 1 0 1
1 1 1 0 0
2 0 1 0 0
3 1 0 1 1
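One detail worth knowing about Index.difference: by default it returns the labels sorted, while sort=False keeps them in their original order. That affects which column receives which number. A small sketch, using a hypothetical empty frame df_demo that only carries the question's column names:

```python
import pandas as pd

# Hypothetical empty frame; only the column labels matter here.
df_demo = pd.DataFrame(columns=["IsCool", "IsTall", "IsHappy", "Target"])

# Default behaviour: the remaining labels come back sorted alphabetically.
sorted_cols = df_demo.columns.difference(["Target"])
# sort=False: the remaining labels keep their original order.
ordered_cols = df_demo.columns.difference(["Target"], sort=False)

print(list(sorted_cols))   # ['IsCool', 'IsHappy', 'IsTall']
print(list(ordered_cols))  # ['IsCool', 'IsTall', 'IsHappy']
```

With the sorted variant, IsHappy would be numbered before IsTall even though it comes later in the frame.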
CodePudding user response:
Proposed code:
You can pass a dict to the pandas rename() function through its columns parameter, like this:
columns={'IsCool': 'col0', 'IsTall': 'col1', 'IsHappy': 'col2'}
This dict is built with zip: dict(zip(keys, values)). Note that the numbering here starts at 0; use range(1, len(df.columns)) instead if you want it to start at col1.
import pandas as pd

df = pd.DataFrame({"IsCool": [0, 1, 0, 1],
                   "IsTall": [1, 1, 1, 0],
                   "IsHappy": [0, 0, 0, 1],
                   "Target": [1, 0, 0, 1]})

# pair every column except "Target" with a generated name col0, col1, ...
df = df.rename(columns=dict(zip(df.columns.drop('Target'),
                                ["col%s" % i for i in range(len(df.columns) - 1)])))
print(df)
Result :
col0 col1 col2 Target
0 0 1 0 1
1 1 1 0 0
2 0 1 0 0
3 1 0 1 1
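The approaches in this thread could also be wrapped in a small reusable helper. The following is only a sketch: anonymize_columns and its keep, prefix and start parameters are hypothetical names chosen for illustration, not a pandas API. It returns the mapping as well, so the original names can be restored later.

```python
import pandas as pd


def anonymize_columns(df, keep=("Target",), prefix="col", start=1):
    """Return a renamed copy of df plus the old-to-new name mapping."""
    # Only columns outside `keep` are renamed and numbered.
    to_rename = [col for col in df.columns if col not in keep]
    mapping = {col: f"{prefix}{i}" for i, col in enumerate(to_rename, start=start)}
    return df.rename(columns=mapping), mapping


df = pd.DataFrame({"IsCool": [0, 1, 0, 1],
                   "IsTall": [1, 1, 1, 0],
                   "IsHappy": [0, 0, 0, 1],
                   "Target": [1, 0, 0, 1]})

anon, mapping = anonymize_columns(df)
print(list(anon.columns))  # ['col1', 'col2', 'col3', 'Target']

# Invert the mapping to get the original names back.
restored = anon.rename(columns={new: old for old, new in mapping.items()})
```

Passing start=0 would give col0, col1, col2 instead.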