Rename columns based on relatation between other columns-CodePudding

I'm struggling on finding a solution for this situation.

I have two pandas dataframes one have columns named with their ids and the other one has the correlation between the ids, real name and sector:

df1 =

id1	id2	id3	randomcol	id4	id5	id6	id7	...
1	2	3	a	4	5	6	7	...
7	6	5	b	4	3	2	1	...
1	2	3	c	4	5	6	7	...

And the correlation between ids, real names and sectors:

df2 =

real names	id	sector
air	1	5
earth	2	10
fire	3	15
water	4	20
wind	5	25

I want to rename the df1 columns with their real names from df2.

I can do df1.rename(columns={"id1": "air", "id2": "earth", "id3": "fire",...}, errors="raise")

But it have around 20 different sectors so it would be a bit frustating.

Is there a better way of doing this?

Thanks in advance.

CodePudding user response：

You can create a mapping dict (Series) from df2 and rename columns of df1. The trick here is to convert numeric ids (1, 2, 3, ...) to their equivalent string (id1, id2, id3, ...):

MAPPING = df2.assign(id='id' df2['id'].astype(str)).set_index('id')['real names']
df1 = df1.rename(columns=MAPPING)

Output:

>>> df1
   air  earth  fire randomcol  water  wind  id6  id7
0    1      2     3         a      4     5    6    7
1    7      6     5         b      4     3    2    1
2    1      2     3         c      4     5    6    7

>>> MAPPING
id
id1      air
id2    earth
id3     fire
id4    water
id5     wind
Name: real names, dtype: object

CodePudding user response：

You can make a mapping given df2["id"] and df2["real names"] and then replace df.columns:

map = {f"id{i}": name for i, name in zip(df2["id"], df2["real names"])}
df1.columns = df1.columns.to_series().replace(map)

Output:

   air  earth  fire  randomcol  water  wind  id6  id7
0  1    2      3     a          4      5     6    7
1  7    6      5     b          4      3     2    1
2  1    2      3     c          4      5     6    7