I want to transform duplicated index values by appending a counter such as -01, -02, ..., -0n to every duplicated item, leaving non-duplicated index values unchanged.
That is, I want to transform the index values from these:
one
one
two
three
three
four
to these:
one-01
one-02
two
three-01
three-02
four
What approach can I use?
CodePudding user response:
Use GroupBy.cumcount for the counter and, with Series.where, keep the suffix only for the duplicated values flagged by Index.duplicated:
import pandas as pd

df = pd.DataFrame({'a': range(6)},
                  index=['one', 'one', 'two', 'three', 'three', 'four'])
# Build '-01', '-02', ... per label, blank it for unique labels, then append it
df.index += (df.groupby(level=0).cumcount().add(1).astype(str).str.zfill(2).radd('-')
               .where(df.index.duplicated(keep=False), ''))
print(df)
          a
one-01    0
one-02    1
two       2
three-01  3
three-02  4
four      5
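To make the chained expression easier to follow, here is a minimal sketch that splits it into named steps on the same sample data. The intermediate names (counter, suffix, is_dup, new_index) are only illustrative and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'a': range(6)},
                  index=['one', 'one', 'two', 'three', 'three', 'four'])

# 1-based position of each row within its group of equal index labels
counter = df.groupby(level=0).cumcount().add(1)

# Zero-padded text suffix: 1 becomes '-01', 2 becomes '-02'
suffix = '-' + counter.astype(str).str.zfill(2)

# True for every label that occurs more than once (keep=False marks all copies)
is_dup = df.index.duplicated(keep=False)

# Keep the suffix on duplicated rows only; unique labels get an empty string
suffix = suffix.where(is_dup, '')

# Glue each original label to its (possibly empty) suffix
new_index = [label + s for label, s in zip(df.index, suffix)]
print(new_index)
```

Assigning new_index to df.index should give the same frame as the one-liner.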
Or use numpy.where:
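import numpy as np

# Starting again from the original df: append the counter only where the label
# is duplicated and keep the other labels as they are
df.index = np.where(df.index.duplicated(keep=False),
                    (df.index + df.groupby(level=0).cumcount().add(1)
                                  .astype(str).str.zfill(2).radd('-')),
                    df.index)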
print(df)
          a
one-01    0
one-02    1
two       2
three-01  3
three-02  4
four      5
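If this is needed in more than one place, the logic can be wrapped in a small function. The sketch below is one possible way to do that; the name suffix_duplicates and its width and sep parameters are hypothetical, not a pandas API. It converts labels to strings first, which is an assumption that suits string-like indices.

```python
import pandas as pd

def suffix_duplicates(index, width=2, sep='-'):
    """Hypothetical helper: append a zero-padded counter to duplicated labels."""
    # Work on a plain Series of strings with a default RangeIndex
    labels = pd.Series([str(x) for x in index])
    # 1-based counter within each group of equal labels
    counter = labels.groupby(labels).cumcount().add(1)
    suffix = sep + counter.astype(str).str.zfill(width)
    # keep=False flags every copy of a repeated label, not just the later ones
    is_dup = labels.duplicated(keep=False)
    # Unique labels stay as they are; duplicated ones get label + suffix
    return pd.Index(labels.where(~is_dup, labels + suffix), name=index.name)

idx = pd.Index(['one', 'one', 'two', 'three', 'three', 'four'])
print(suffix_duplicates(idx).tolist())
```

Note that zfill only pads and never truncates, so a label repeated more than 99 times would get a three-digit counter; pass a larger width if the suffixes should stay the same length.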
CodePudding user response:
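If the values are in a column rather than the index, the same idea applies:
import pandas as pd
df = pd.DataFrame({'A': ['one', 'one', 'two', 'three', 'three', 'four']})
# Categories must be unique, so rename_categories cannot tell duplicates apart;
# build the counter with cumcount and append it to the duplicated values only
suffix = df.groupby('A').cumcount().add(1).astype(str).str.zfill(2).radd('-')
df['A'] = df['A'] + suffix.where(df['A'].duplicated(keep=False), '')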