I have a pandas Series with a string index and integer values (my actual Series has more than 1,000 entries):
| | Count |
|---|---|
| apple | 1 |
| bear | 2 |
| cat | 3 |
| Apple | 10 |
| pig | 20 |
| Cat | 30 |
| ApPlE | 100 |
I would like to re-index it so that every label is title case, summing the values of labels that collide:
| | Count |
|---|---|
| Bear | 2 |
| Apple | 111 |
| Pig | 20 |
| Cat | 33 |
The order doesn't matter.
I worked out how to do this with a dictionary and a loop.
import pandas as pd

miXedCaseSeRies = pd.Series({"apple": 1, "bear": 2, "cat": 3, "Apple": 10, "pig": 20, "Cat": 30, "ApPle": 100})
tcDict = {}
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for ind, val in miXedCaseSeRies.items():
    ind = ind.title()  # normalize the label to title case
    if ind in tcDict:
        tcDict[ind] = tcDict[ind] + val
    else:
        tcDict[ind] = val
Titlecaseseries = pd.Series(tcDict)
...But answers like this one: How to iterate over rows in a DataFrame in Pandas (Spoiler: "DON'T") make me feel guilty for not finding a better way.
Suggestions?
CodePudding user response:
You can rename the index so that only the first character of each label is uppercase with str.capitalize(), then group by the index and aggregate with sum:
df = df.rename(index=lambda x: x.capitalize())
df = df.groupby(df.index).agg({'Count': 'sum'})
# or
df = df.groupby(df.index)['Count'].agg('sum').to_frame()
print(df)
       Count
Apple    111
Bear       2
Cat       33
Pig       20
If you are starting from your Series rather than a DataFrame:
miXedCaseSeRies = pd.Series({"apple": 1, "bear": 2 ,"cat": 3,"Apple" : 10, "pig": 20, "Cat":30 , "ApPle":100 })
miXedCaseSeRies = miXedCaseSeRies.rename(index=lambda x: x.capitalize())
df = miXedCaseSeRies.to_frame('Count').groupby(miXedCaseSeRies.index).agg({'Count': 'sum'})
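If a Series result is enough, the same rename-then-sum idea can probably be done without converting to a DataFrame. The following is a minimal sketch, assuming the data from the question; the names s, normalized and result are illustrative and not from the original post.

```python
import pandas as pd

# Same data as in the question (variable name shortened for illustration)
s = pd.Series({"apple": 1, "bear": 2, "cat": 3, "Apple": 10,
               "pig": 20, "Cat": 30, "ApPle": 100})

# Build the normalized labels with the vectorized string accessor on the index
normalized = s.index.str.capitalize()

# Group the values by the normalized labels and sum the collisions
result = s.groupby(normalized).sum()

print(result)
```

Note that capitalize() only uppercases the first character of the whole label. If labels can contain several words and each word should be capitalized, s.index.str.title() could be swapped in instead.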