Removing double quotation marks from strings in a Pandas Series

Time:12-01

I am currently looping through a subset of a Pandas DataFrame, and the string values inside have double quotation marks surrounding them. If I don't remove them, I won't be able to compare them to the values I need to compare them against.

This is the code I have so far:

df_asinn = df.copy()
for index, screen_name in df.loc[:, ["username"]].iterrows():
    user_tweet_list = []
    screen_name[0] = screen_name[0].strip()
    stripped_screen_name = screen_name[0].strip()
    

The variable screen_name[0] contains the string value. I have tried multiple things, to no avail. I tried using screen_name.strip(), which did not work. I also tried .translate('"'), but this did not work either. It would be great if anyone had a solution to the problem.

The goal is to go from a list of strings looking like this:

"String1",
"String2",
"String3"

to looking like this:

String1,
String2,
String3

I know it's supposed to look like that, because if I do print("String1"), the output is:

String1

CodePudding user response:

Not quite sure exactly what you're getting at: see comment by @Panda Kim. However, two things that might point you in the right direction:

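  1. Calling string.replace(char, '') replaces every instance of char in string with an empty string, effectively removing them. I usually use this method, though translate works well too. Note that translate(None, char) is the Python 2 form; in Python 3 the equivalent call is string.translate(str.maketrans('', '', char)). Also, strip() with no argument only removes whitespace, so to strip surrounding quotes you need to pass the character explicitly: string.strip('"').
  2. By using the .str accessor on a column (df[col_name].str), you can call most string methods and they will be applied to all elements in the column. For instance, calling df['screen name'].str.replace('\"', '') gives the same result as looping through df['screen name'] and calling replace on each element. The call returns a new Series, so assign it back to the column to keep the change.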
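The two points above can be sketched roughly as follows. This is a minimal example, not the asker's actual data: the sample values are made up, and the column name "username" is taken from the code in the question. It assumes the quotes are literally part of the stored strings.

```python
import pandas as pd

# Hypothetical sample data; only the "username" column name comes from the question.
df = pd.DataFrame({"username": ['"String1"', ' "String2" ', '"String3"']})

# Plain Python strings: strip() with no argument removes whitespace only,
# so the quote character has to be passed explicitly.
raw = '"String1"'
no_arg = raw.strip()                    # quotes are still there
stripped = raw.strip('"')               # removes leading/trailing quotes
replaced = raw.replace('"', "")         # removes every quote in the string
# Python 3 form of translate: build a deletion table with str.maketrans.
translated = raw.translate(str.maketrans("", "", '"'))

# Vectorised version: no loop needed. Strip whitespace first, then the quotes,
# and assign the result back because .str methods return a new Series.
df["username"] = df["username"].str.strip().str.strip('"')

print(df["username"].tolist())  # ['String1', 'String2', 'String3']
```

Using .str.strip('"') only touches quotes at the ends of each value; if quotes can also appear in the middle and should go too, df["username"].str.replace('"', '', regex=False) is the closer match to point 1.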