I have a data frame in which the values in one column have single quotes around them. For my use case, I need to add double quotes around each quoted string before I can proceed.
For example, if a column contains the string ['7hag5thdu4d'], I need to add double quotes so that the final string is ["'7hag5thdu4d'"].
Here is my code and its output:
import pandas as pd
data = {"id": [1, 9, 8],
        "person": ["['Eswar']", "['john']", "['otis']"],
        "Role": ['{"manager"}', '{"analyst"}', '{"director"}']}
df = pd.DataFrame(data)
df = df.replace("'", '\'"', regex=True)
print(df)
id | person | Role |
---|---|---|
1 | ['"Eswar'"] | {"manager"} |
9 | ['"john'"] | {"analyst"} |
8 | ['"otis'"] | {"director"} |
The problem is on the left side of each value: the opening quotes come out in the wrong order ('" instead of "'), so I get ['"Eswar'"] rather than ["'Eswar'"].
Can someone help me fix this?
CodePudding user response:
Try this:
df = df.replace('([{\\[])[\'\"]+(.+?)[\'\"]+([}\\]])', '\\1\"\'\\2\'\"\\3', regex=True)
Output:
>>> df
id person Role
0 1 ["'Eswar'"] {"'manager'"}
1 9 ["'john'"] {"'analyst'"}
2 8 ["'otis'"] {"'director'"}
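To make the pattern above easier to follow, here is a minimal sketch that applies the same regular expression to plain strings with Python's re module. It assumes that pandas passes the pattern to re when regex=True; the helper name rewrap is hypothetical and only used for illustration.

```python
import re

# The pattern from the answer above, written as a raw string:
#   ([{\[])   group 1: the opening bracket, [ or {
#   ['"]+     one or more existing quote characters (dropped)
#   (.+?)     group 2: the value itself, matched lazily
#   ['"]+     the closing quote characters (dropped)
#   ([}\]])   group 3: the closing bracket, ] or }
PATTERN = re.compile(r"""([{\[])['"]+(.+?)['"]+([}\]])""")

# Rebuild the value as: bracket, "', value, '", bracket.
REPLACEMENT = r"""\1"'\2'"\3"""


def rewrap(value: str) -> str:
    """Hypothetical helper: apply the substitution to a single string."""
    return PATTERN.sub(REPLACEMENT, value)


for sample in ["['Eswar']", '{"manager"}']:
    print(sample, "->", rewrap(sample))
```

Because the quote class accepts both ' and ", this version also rewrites the Role column, which is why {"manager"} becomes {"'manager'"} in the output above.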
CodePudding user response:
You aren't accounting for how sometimes you want '" and other times you want "'. You can do two replaces, one for the beginning and one for the end.
df = df.replace(r"\['", '["\'', regex=True)
df = df.replace(r"'\]", '\'"]', regex=True)
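As a rough illustration of the difference, the sketch below runs the single blanket replace from the question next to the two bracket-anchored replaces on a small frame. The variable names blanket and fixed are made up for this example, and only the person column from the question is included.

```python
import pandas as pd

df = pd.DataFrame({"person": ["['Eswar']", "['john']", "['otis']"]})

# One blanket replace: every ' becomes '" regardless of position,
# so the opening side ends up in the wrong order.
blanket = df.replace("'", "'\"", regex=True)

# Two anchored replaces: the neighbouring bracket tells us which side we are on.
fixed = df.replace(r"\['", "[\"'", regex=True)      # [' becomes ["'
fixed = fixed.replace(r"'\]", "'\"]", regex=True)   # '] becomes '"]

print(blanket["person"].tolist())
print(fixed["person"].tolist())
```

The anchored version only touches quotes that sit directly next to a square bracket, so values without that shape are left as they are.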
CodePudding user response:
Try replace with a dict of replacement values:
df = df.replace({r"\['": '["\'', r"'\]": '\'"]'}, regex=True)
print(df)
# Output:
id person Role
0 1 ["'Eswar'"] {"manager"}
1 9 ["'john'"] {"analyst"}
2 8 ["'otis'"] {"director"}
Another way, using a capture group so that each single-quoted value is wrapped in double quotes:
df = df.replace(r"'([^']*)'", '"\'\\1\'"', regex=True)
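The capture-group variant can be tried on plain strings first. This is a small sketch using Python's re module, on the assumption that pandas applies the same regular-expression semantics when regex=True; the function name wrap_single_quoted and the two-item sample are hypothetical and not taken from the question's data.

```python
import re

# '([^']*)' matches a single-quoted run and captures its contents.
PATTERN = r"'([^']*)'"

# The replacement text is "'\1'": the captured value, re-quoted and wrapped.
REPLACEMENT = "\"'\\1'\""


def wrap_single_quoted(value: str) -> str:
    """Hypothetical helper: wrap every single-quoted run in double quotes."""
    return re.sub(PATTERN, REPLACEMENT, value)


samples = [
    "['Eswar']",     # value from the question
    '{"manager"}',   # no single quotes, so nothing to match
    "['a', 'b']",    # made-up sample with two quoted items
]
for sample in samples:
    print(sample, "->", wrap_single_quoted(sample))
```

Since the match does not depend on the surrounding bracket, each quoted item in a longer list is wrapped on its own, while strings without single quotes pass through unchanged.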