I'm using this function to clean up my column names. However, it is also deleting digits, which I don't want. For example, when applied to the input below, I get: "standard_access_requested_application_rolegroup_ld_e" (the 10706 is gone).
Any help would be great. Thanks.
import re

# custom_fields_dict (regex pattern -> replacement) is defined elsewhere
def text_replacement(x):
    """
    This function formats the field names so that they are more SQL friendly
    """
    for key, value in custom_fields_dict.items():
        pattern = re.compile(key, re.IGNORECASE)
        x = pattern.sub(value, x).lower().replace('fields.','').replace(' ','_').replace('™','')
        x = re.sub(r"[()\[\]&^%$#@!-:'\/]",'',x)
    return x

text_replacement("standard_access_requested_application:_'role/group': ]ld_10706(e)™")
The application of the function:
# Replace the columns in the dataframe
new_columns = []
for i in df.columns:
    new_columns.append(text_replacement(i))
df.columns = new_columns
CodePudding user response:
The !-: part of your pattern is read as a character range from ! (code point 33) to : (code point 58). That range includes the digits 0-9 (code points 48-57), so they are removed along with the punctuation.
[Screenshot from regex101.com: characters highlighted in blue match the pattern highlighted in orange.]
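To see what the accidental range covers, here is a quick check in Python. It is only a minimal sketch: the variable names are made up for illustration, and the test string is just the tail of the column name from the question.

```python
import re

# Inside a character class, "!-:" means every character from "!" (code point 33)
# through ":" (code point 58).
in_range = [chr(c) for c in range(ord("!"), ord(":") + 1)]
print("".join(in_range))

# The digits 0-9 (code points 48-57) sit inside that range, so they are removed.
digits_removed = re.sub(r"[!-:]", "", "ld_10706(e)")
print(digits_removed)  # ld_e
```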
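If you put an escape character \ before the - it is treated as a literal hyphen instead of a range operator (escaping ! and : as well, as I did here, is harmless but not required), and this would work: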
x = re.sub(r"[()\[\]&^%$#@\!\-\:'\/]", '', x)
[1]: 'standard_access_requested_application_rolegroup ld_10706e'
I only ran the regex substitution, not the rest of your code, so your own result may differ, but the digits will be preserved.
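For completeness, here is a sketch of how the whole function might look with the fix applied. Two things in it are assumptions, not taken from your code: the contents of custom_fields_dict (yours isn't shown, so the entry below is a hypothetical stand-in), and the cleanup steps being moved out of the loop so they run once even when the dict is empty. Instead of escaping the hyphen, this version puts it last in the character class, which also makes it a literal.

```python
import re

# Hypothetical stand-in: the real custom_fields_dict is not shown in the question.
custom_fields_dict = {r"customfield_123": "my_field"}

def text_replacement(x):
    """Format a field name so that it is more SQL friendly."""
    # Apply each custom pattern, ignoring case.
    for key, value in custom_fields_dict.items():
        x = re.sub(key, value, x, flags=re.IGNORECASE)
    x = x.lower().replace("fields.", "").replace(" ", "_").replace("™", "")
    # The hyphen is last in the class, so it is a literal "-" and not a range.
    return re.sub(r"[()\[\]&^%$#@!:'/-]", "", x)

result = text_replacement("standard_access_requested_application:_'role/group': ]ld_10706(e)™")
print(result)  # standard_access_requested_application_rolegroup_ld_10706e
```

If you would rather keep the character order you had, escaping the hyphen as \- has the same effect.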