The index of my df consists of company-name strings, e.g. Wells Fargo.
Sometimes there are excess spaces between the words, which I want to collapse into single spaces. I tried the two attempts below, but each raised an error.
**TypeError: expected string or bytes-like object**
df.index = re.sub(' +', ' ', df.index.astype('str').str.strip())
**AttributeError: 'Index' object has no attribute 'apply'**
df.index = df.index.astype('str').str.strip().apply(lambda x: re.sub(' +', ' ', x))
Input df
| Revenue |
Wells Fargo | 1 |
Bank of American| 3 |
Desired output
| Revenue |
Wells Fargo | 1 |
Bank of American| 3 |
CodePudding user response:
df.index = df.index.str.replace(r'\s+', ' ', regex=True).str.strip()
In your first attempt, you pass a pandas Index of strings to `re.sub`, which expects a single string.
`apply` would work if the company names were stored in a DataFrame column. However, as the error message says, `Index` has no `apply` method; the element-wise equivalent for an index is `map`.
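As a rough sketch of the two routes described above, the snippet below cleans the index once with the vectorized `.str` methods and once element by element with `Index.map`. The sample frame and its extra spaces are assumptions made for illustration, not the asker's real data.

```python
import re
import pandas as pd

# Hypothetical sample data; the extra spaces are assumed for illustration.
df = pd.DataFrame(
    {"Revenue": [1, 3]},
    index=["Wells   Fargo", " Bank  of American "],
)

# Vectorized: collapse each run of whitespace to one space, then trim the ends.
cleaned = df.index.str.replace(r"\s+", " ", regex=True).str.strip()

# Element-wise: Index has no apply, but Index.map calls a function per label.
mapped = df.index.map(lambda name: re.sub(r"\s+", " ", name).strip())

df.index = cleaned
print(df.index.tolist())
```

Both routes should give the same labels; the `.str` version is usually the more idiomatic choice when the index holds only strings.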
CodePudding user response:
Use str.split() on each index string, then df.rename on the index. Each step is shown below.
import pandas as pd
# making your df
d = {'index':['Wells Fargo'], 'col1':[123], 'col2':[123]}
df = pd.DataFrame(d)
df = df.set_index('index')
# get list of index strings
index_str_list = list(df.index)
# collapse whitespace and append to new list
new_list = []
for i in index_str_list:
    # split() with no arguments splits on any run of whitespace, so joining
    # the parts with one space also handles names of more than two words
    s = " ".join(i.split())
    new_list.append(s)
# change index value
for old, new in zip(index_str_list, new_list):
    df.rename(index={old: new}, inplace=True)
print(df)
Output:
             col1  col2
index
Wells Fargo   123   123
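The same split-then-rename idea can be written more compactly, because `df.rename` also accepts a function for `index` and applies it to every label. This is only a sketch: `collapse_spaces` is a hypothetical helper name, and the sample labels with extra spaces are assumed for illustration.

```python
import pandas as pd


def collapse_spaces(name):
    """Hypothetical helper: split on any whitespace, rejoin with one space."""
    return " ".join(name.split())


# Assumed sample data with excess spaces in the index labels.
df = pd.DataFrame(
    {"col1": [123, 456], "col2": [123, 456]},
    index=pd.Index(["Wells   Fargo", "Bank  of   American"], name="index"),
)

# rename calls collapse_spaces once per index label and returns a new frame.
df = df.rename(index=collapse_spaces)
print(df)
```

This avoids building the intermediate lists and works for names with any number of words.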