Reindex Pandas Series case insensitive (Combining matches)

Time:05-14

I have a pandas Series with string index labels and integer values (my actual Series has more than 1000 entries):

Count
apple 1
bear 2
cat 3
Apple 10
pig 20
Cat 30
ApPlE 100

I would like to re-index to all title case, summing the values of labels that match case-insensitively:

Count
Bear 2
Apple 111
Pig 20
Cat 33

The order doesn't matter.

I worked out how to do this with a dictionary and a loop.

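import pandas as pd

miXedCaseSeRies = pd.Series({"apple": 1, "bear": 2, "cat": 3, "Apple": 10, "pig": 20, "Cat": 30, "ApPle": 100})
tcDict = {}
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for ind, val in miXedCaseSeRies.items():
    ind = ind.title()  # normalise the label to title case
    if ind in tcDict:
        tcDict[ind] = tcDict[ind] + val  # label already seen: add to its total
    else:
        tcDict[ind] = val

Titlecaseseries = pd.Series(tcDict)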

...But answers like this one: How to iterate over rows in a DataFrame in Pandas (Spoiler: "DON'T") make me feel guilty for not finding a better way.
Suggestions?

CodePudding user response:

You can rename the index labels with str.capitalize(), which uppercases only the first character and lowercases the rest, then group by the index and sum the Count column:

df = df.rename(index=lambda x: x.capitalize())
df = df.groupby(df.index).agg({'Count': 'sum'})
# or
df = df.groupby(df.index)['Count'].agg('sum').to_frame()
print(df)

       Count
Apple    111
Bear       2
Cat       33
Pig       20
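
One caveat: str.capitalize() and str.title() give the same result for single-word labels like the ones here, but they differ once a label has more than one word. A small sketch with hypothetical labels (they are not from the question) shows the difference:

```python
# Hypothetical multi-word labels, used only to illustrate the difference.
labels = ["black bear", "BLACK BEAR", "apple"]

# capitalize(): first character of the string upper, everything else lower
capitalized = [s.capitalize() for s in labels]

# title(): first character of every word upper, everything else lower
titled = [s.title() for s in labels]

print(capitalized)  # ['Black bear', 'Black bear', 'Apple']
print(titled)       # ['Black Bear', 'Black Bear', 'Apple']
```

If true title case is what you need, swapping capitalize for title in the rename call should be enough.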

If you start from your Series instead:

miXedCaseSeRies = pd.Series({"apple": 1, "bear": 2  ,"cat": 3,"Apple" : 10, "pig": 20, "Cat":30 , "ApPle":100 })

miXedCaseSeRies = miXedCaseSeRies.rename(index=lambda x: x.capitalize())
df = miXedCaseSeRies.to_frame('Count').groupby(miXedCaseSeRies.index).agg({'Count': 'sum'})
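
It should also be possible to skip the DataFrame entirely and group the Series on a title-cased copy of its own index. This is a minimal sketch, assuming every index label is a string; the names s and combined are illustrative, not from the question:

```python
import pandas as pd

s = pd.Series({"apple": 1, "bear": 2, "cat": 3, "Apple": 10,
               "pig": 20, "Cat": 30, "ApPle": 100})

# Group on the title-cased index labels and sum the values in each group.
# groupby sorts the group keys by default, so the result comes back alphabetical.
combined = s.groupby(s.index.str.title()).sum()
print(combined)
```

The result is still a Series, which avoids the to_frame round trip, and the string handling is done by the vectorised .str accessor rather than a Python loop.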