I have a DataFrame and I want to rename its columns periodically. For example, the following df has 15 columns named v0 to v14. I want to rename the first three to v0-v2, then the next three to v0-v2 again, and so on. Since it seems that duplicate column names are not allowed, I changed the second group to v10-v12, the third group to v20-v22, and so on.
import pandas as pd
df = pd.DataFrame()
df['id'] = [1]
df['v0'] = [2]
df['v1'] = [1]
df['v2'] = [2]
df['v3'] = [1]
df['v4'] = [2]
df['v5'] = [1]
df['v6'] = [2]
df['v7'] = [1]
df['v8'] = [2]
df['v9'] = [1]
df['v10'] = [2]
df['v11'] = [1]
df['v12'] = [2]
df['v13'] = [1]
df['v14'] = [2]
df
Here is the output I want. Thank you in advance.
   id  v00  v01  v02  v10  v11  v12  v20  v21  v22  v30  v31  v32  v40  v41  v42
0   1    2    1    2    1    2    1    2    1    2    1    2    1    2    1    2
CodePudding user response:
You can actually have duplicated names:
import numpy as np
df.columns = np.hstack(['id', np.tile(['v0', 'v1', 'v2'], (len(df.columns)-1)//3)])
print(df)
Output:
   id  v0  v1  v2  v0  v1  v2  v0  v1  v2  v0  v1  v2  v0  v1  v2
0   1   2   1   2   1   2   1   2   1   2   1   2   1   2   1   2
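One thing to keep in mind with duplicated names: selecting by label then returns every matching column. The following is a minimal sketch of that behaviour, rebuilding the example frame from the question in a compact way (the one-line construction is my own shorthand, not the asker's code):

```python
import numpy as np
import pandas as pd

# Rebuild the example frame: id=1, then v0..v14 alternating 2, 1, 2, ...
df = pd.DataFrame([[1] + [2 - i % 2 for i in range(15)]],
                  columns=['id'] + [f'v{i}' for i in range(15)])

# Same renaming as above: 'id' followed by v0, v1, v2 repeated
df.columns = np.hstack(['id', np.tile(['v0', 'v1', 'v2'], (len(df.columns) - 1) // 3)])

# With duplicated labels, df['v0'] is a DataFrame holding all five 'v0' columns
sub = df['v0']
print(sub.shape)
print(df.columns.is_unique)
```

If a single column is needed later, positional access such as df.iloc[:, 1] avoids the ambiguity.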
For your alternative:
a = np.arange(df.shape[1]-1)
new = np.char.add('v', (a%3 + a//3*10).astype(str))
df.columns = np.hstack(['id', new])
print(df)
Output:
   id  v0  v1  v2  v10  v11  v12  v20  v21  v22  v30  v31  v32  v40  v41  v42
0   1   2   1   2    1    2    1    2    1    2    1    2    1    2    1    2
Alternative with str.replace (the walrus operator needs Python 3.8+):
df.columns = df.columns.str.replace(r'(\d+)',
                                    lambda m: str((x:=int(m.group()))%3 + (x//3*10)),
                                    regex=True)
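To see what the replacement callable does to each label, the same arithmetic can be tried on plain strings with re.sub. This is only a sketch: renumber and its period parameter are hypothetical names introduced here for illustration, not part of pandas.

```python
import re

def renumber(label, period=3):
    """Hypothetical helper: map the number n in a label to n % period + (n // period) * 10."""
    def repl(m):
        n = int(m.group())
        return str(n % period + n // period * 10)
    return re.sub(r'\d+', repl, label)

labels = ['id'] + [f'v{i}' for i in range(15)]
new_labels = [renumber(s) for s in labels]
print(new_labels)
# ['id', 'v0', 'v1', 'v2', 'v10', 'v11', 'v12', 'v20', 'v21', 'v22',
#  'v30', 'v31', 'v32', 'v40', 'v41', 'v42']
```

Labels without digits, such as 'id', pass through unchanged because the pattern never matches them.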
CodePudding user response:
You can use:
df.rename(columns={f"v{i}": f"v{i//3}{i%3}" for i in range(len(df.columns) - 1)}, inplace=True)
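If the group size may change, the same naming scheme can be wrapped in a small function and assigned to df.columns directly. This is a rough sketch, assuming 'id' is always the first column; periodic_names, period and prefix are hypothetical names chosen for illustration.

```python
import pandas as pd

def periodic_names(n, period=3, prefix="v"):
    """Hypothetical helper: return n labels like v00, v01, v02, v10, ..."""
    return [f"{prefix}{i // period}{i % period}" for i in range(n)]

# Rebuild the example frame: id=1, then v0..v14 alternating 2, 1, 2, ...
df = pd.DataFrame([[1] + [2 - i % 2 for i in range(15)]],
                  columns=["id"] + [f"v{i}" for i in range(15)])

# Keep 'id' and rename every remaining column by position
df.columns = ["id"] + periodic_names(len(df.columns) - 1)
print(list(df.columns))
```

Renaming by position this way does not depend on the old labels being exactly v0, v1, ..., which the dictionary passed to rename does rely on.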