To add double quotes around the strings in a data frame column

I have a data frame in which the values of one column are wrapped in single quotes. To proceed, I need to add double quotes around each string.

For example, if a column contains the string ['7hag5thdu4d'], I need to add double quotes so that the final string is ["'7hag5thdu4d'"].

Here is my code and its output:

import pandas as pd
data = {"id": [1, 9, 8],
        "person": ["['Eswar']", "['john']", "['otis']"],
        "Role": ['{"manager"}', '{"analyst"}', '{"director"}']}
df = pd.DataFrame(data)
df = df.replace("'", '\'"', regex=True)
print(df)

id	person	Role
1	['"Eswar'"]	{"manager"}
9	['"john'"]	{"analyst"}
8	['"otis'"]	{"director"}

The problem with this output is on the left side of each string: the quote characters are in the wrong order there, so I get ['" where I want ["'.

Can someone help me fix this?

CodePudding user response：

Try this:

df = df.replace('([{\\[])[\'\"]+(.+?)[\'\"]+([}\\]])', '\\1\"\'\\2\'\"\\3', regex=True)

Output:

>>> df
   id       person            Role
0   1  ["'Eswar'"]   {"'manager'"}
1   9   ["'john'"]   {"'analyst'"}
2   8   ["'otis'"]  {"'director'"}
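
To see what each capture group contributes, the sketch below applies a pattern of this shape, ([{\[])['"]+(.+?)['"]+([}\]]), to plain strings with Python's standard re module rather than to a data frame. The helper name wrap_in_quotes is made up for illustration.

```python
import re

# Group 1: the opening bracket or brace.
# Group 2: the bare text between the existing quotes.
# Group 3: the closing bracket or brace.
# The existing quotes are matched by ['"]+ but not captured, so they are dropped.
PATTERN = re.compile(r"""([{\[])['"]+(.+?)['"]+([}\]])""")


def wrap_in_quotes(value: str) -> str:
    # Hypothetical helper: rebuild the string as <open> "' text '" <close>.
    return PATTERN.sub(r"""\1"'\2'"\3""", value)


print(wrap_in_quotes("['Eswar']"))
print(wrap_in_quotes('{"manager"}'))
```

Because the pattern accepts both [ and {, and both kinds of quote, it rewrites the Role values as well as the person values, which is why both columns change in the output above.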

CodePudding user response：

You aren't accounting for how sometimes you want '" and other times you want "'. You can do two replaces, one for the beginning and one for the end.

df = df.replace(r"\['", '["\'', regex=True)
df = df.replace(r"'\]", '\'"]', regex=True)
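
A self-contained version of this two-step approach, using the sample data from the question, might look like the sketch below. The patterns are written as raw strings so the backslashes reach the regex engine unchanged. Only the person column should change, since the Role values contain no single quotes.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 9, 8],
    "person": ["['Eswar']", "['john']", "['otis']"],
    "Role": ['{"manager"}', '{"analyst"}', '{"director"}'],
})

# Step 1: an opening bracket followed by a single quote,  ['  becomes  ["'
df = df.replace(r"\['", "[\"'", regex=True)
# Step 2: a single quote followed by a closing bracket,  ']  becomes  '"]
df = df.replace(r"'\]", "'\"]", regex=True)

print(df["person"].tolist())
```

Handling the two ends separately is what keeps the quote order correct: a single replacement of every ' with '" cannot distinguish the opening quote from the closing one.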

CodePudding user response：

Try replace with a dict of replacement values:

df = df.replace({r"\['": '["\'', r"'\]": '\'"]'}, regex=True)
print(df)

# Output:
   id       person          Role
0   1  ["'Eswar'"]   {"manager"}
1   9   ["'john'"]   {"analyst"}
2   8   ["'otis'"]  {"director"}

Another way:

df = df.replace(r"'([^']*)'", '"\'\\1\'"', regex=True)