I have a dataframe, df, in which I would like to rename two duplicate columns with consecutive numbers:
Data
DD Nice Nice Hello
0 1 1 2
Desired
DD Nice1 Nice2 Hello
0 1 1 2
Doing
df.rename(columns={"Nice": "Nice1", "Nice": "Nice2"})
I am running the rename function; however, because both column names are identical, the result is not what I want.
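To see why the attempt fails: a Python dict literal keeps only the last value for a repeated key, and rename matches by label, so every column carrying that label receives the same new name. A minimal sketch, assuming the frame from the question is rebuilt by hand with the values shown:

```python
import pandas as pd

# Frame from the question, rebuilt by hand (values assumed from the post).
df = pd.DataFrame([[0, 1, 1, 2]], columns=["DD", "Nice", "Nice", "Hello"])

# A dict literal with a repeated key keeps only the last value.
mapping = {"Nice": "Nice1", "Nice": "Nice2"}
print(mapping)

# rename() maps labels, not positions, so both "Nice" columns get the same name.
renamed = df.rename(columns=mapping)
print(list(renamed.columns))
```

So the duplicates have to be told apart by position or by a running count rather than by label alone.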
CodePudding user response:
Here's an approach with groupby:
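import numpy as np

# Group the column labels by their own value so duplicates share a group.
s = df.columns.to_series().groupby(df.columns)
# Append a 1-based running count only where a label occurs more than once.
df.columns = np.where(s.transform('size') > 1,
                      df.columns + s.cumcount().add(1).astype(str),
                      df.columns)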
Output:
DD Nice1 Nice2 Hello
0 0 1 1 2
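A variant of the same idea that stays within pandas, offered as a sketch rather than as the answer's own code: it swaps np.where for Series.where and holds the labels in a Series with a default integer index, so nothing has to align on duplicate labels. The frame is assumed to match the question.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 1, 2]], columns=["DD", "Nice", "Nice", "Hello"])

# Labels as a Series with a plain 0..n-1 index.
cols = pd.Series(df.columns)

# 1-based running count of each label, as strings.
counts = cols.groupby(cols).cumcount().add(1).astype(str)

# True for every label that occurs more than once.
is_dup = cols.duplicated(keep=False)

# Keep unique labels as they are; suffix duplicated ones with their count.
df.columns = cols.where(~is_dup, cols + counts)
print(list(df.columns))
```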
CodePudding user response:
This is how you do it, e.g.:
df.rename(columns={df.columns[1]: "Name1"}, inplace=True)
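One caveat, offered as a hedged note: df.columns[1] evaluates to the label "Nice", and rename matches labels, so both duplicate columns would receive the new name. A minimal sketch of a positional alternative, assuming the frame from the question; new_names is a hypothetical name introduced only for illustration:

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 1, 2]], columns=["DD", "Nice", "Nice", "Hello"])

# Hypothetical mapping from column position to its new label.
new_names = {1: "Nice1", 2: "Nice2"}

# Rebuild the labels by position so the duplicates can be told apart.
df.columns = [new_names.get(i, col) for i, col in enumerate(df.columns)]
print(list(df.columns))
```

This works when the positions of the duplicates are known in advance; the counting approaches in the other answers avoid hard-coding them.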
CodePudding user response:
You can use:
cols = pd.Series(df.columns)
dup_count = cols.value_counts()
for dup in cols[cols.duplicated()].unique():
    cols[cols[cols == dup].index.values.tolist()] = [dup + str(i) for i in range(1, dup_count[dup] + 1)]
df.columns = cols
Input:
col_1 Nice Nice Nice Hello Hello Hello
col_2 1 2 3 4 5 6
Output:
col_1 Nice1 Nice2 Nice3 Hello1 Hello2 Hello3
col_2 1 2 3 4 5 6
Setup to generate duplicate cols:
df = pd.DataFrame(data={'col_1':['Nice', 'Nice', 'Nice', 'Hello', 'Hello', 'Hello'], 'col_2':[1,2,3,4, 5, 6]})
df = df.set_index('col_1').T
CodePudding user response:
You could use an itertools.count() counter and a list comprehension to create new column headers, then assign them to the data frame.
For example:
>>> import itertools
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3]], columns=["Nice", "Nice", "Hello"])
>>> df
Nice Nice Hello
0 1 2 3
>>> count = itertools.count(1)
>>> new_cols = [f"Nice{next(count)}" if col == "Nice" else col for col in df.columns]
>>> df.columns = new_cols
>>> df
Nice1 Nice2 Hello
0 1 2 3
(Python 3.6 or later is required for the f-strings.)
EDIT: Alternatively, per the comment below, the list comprehension can replace any label that contains "Nice", in case there are unexpected spaces or other characters:
new_cols = [f"Nice{next(count)}" if "Nice" in col else col for col in df.columns]
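The same counter idea can be extended to any duplicated label instead of hard-coding "Nice". The following is a sketch under assumptions, not part of the original answer: it keeps one itertools.count per label in a defaultdict and uses collections.Counter to leave unique labels untouched. The sample frame and the names totals and counters are illustrative.

```python
import itertools
from collections import Counter, defaultdict

import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3]], columns=["DD", "Nice", "Nice", "Hello"])

# How often each label occurs, so unique labels can be left as they are.
totals = Counter(df.columns)
# One independent counter per label, each starting at 1.
counters = defaultdict(lambda: itertools.count(1))

df.columns = [
    f"{col}{next(counters[col])}" if totals[col] > 1 else col
    for col in df.columns
]
print(list(df.columns))
```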