pandas: replace values at specific indices in columns of lists of strings

Time:08-27

I have a dataframe as follows.

import pandas as pd
df = pd.DataFrame({"first_col":[["a","b","c","d"],["a","b"],["a","b","c"]],
               "second_col":[["the","house","is","[blue]"],["the","[weather]"],["[the]","class","today"]]})

I would like to replace each value in the first column with the value at the same position in the second column whenever that second-column value is wrapped in brackets. So I would like the following output.

output:

           first_col                second_col
0  [a, b, c, [blue]]  [the, house, is, [blue]]
1     [a, [weather]]          [the, [weather]]
2      [[the], b, c]     [[the], class, today]

I know how to do this for two plain lists, as shown below, but I do not know how to do it for a pandas DataFrame with list columns. With two lists, I would do:

a =["a","b","c"]
b = ["[the]","class","today"]

for index, item in enumerate(b):
    if item.endswith("]"):
        a[index] = b[index]

Printing a then gives: ['[the]', 'b', 'c']
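The loop above can also be wrapped in a small function that returns a new list instead of modifying a in place. This is only a sketch: the name replace_bracketed is hypothetical, and it assumes that "in brackets" means the item both starts with "[" and ends with "]", which is slightly stricter than the endswith check above.

```python
def replace_bracketed(a, b):
    """Return a copy of `a` where positions holding a bracketed item in `b` take b's value."""
    result = list(a)  # copy so the caller's list is not mutated
    for index, item in enumerate(b):
        # Only touch positions that exist in `a`, and require both brackets (assumption).
        if index < len(result) and item.startswith("[") and item.endswith("]"):
            result[index] = item
    return result


a = ["a", "b", "c"]
b = ["[the]", "class", "today"]
print(replace_bracketed(a, b))  # ['[the]', 'b', 'c']
```

Returning a copy makes the function easier to reuse row by row, since the original lists stay untouched.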

Answer:

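You can use a nested list comprehension. The outer loop pairs up the rows of the two columns, and the inner one pairs up the items within each row, keeping the second column's item only when it is wrapped in brackets: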

df['first_col'] = [
    [b if b.startswith('[') and b.endswith(']') else a for a, b in zip(A, B)]
    for A, B in zip(df['first_col'], df['second_col'])
]

output:

           first_col                second_col
0  [a, b, c, [blue]]  [the, house, is, [blue]]
1     [a, [weather]]          [the, [weather]]
2      [[the], b, c]     [[the], class, today]
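One caveat with the list comprehension above: zip stops at the shorter of the two lists, so if a row of second_col were shorter than the matching row of first_col, the trailing items of first_col would be dropped. The question's data always has equal lengths, so this does not occur there. The sketch below shows one way to guard against it with itertools.zip_longest. The helper name merge_row and the shortened third row of second_col are assumptions made for illustration, not part of the original data.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking a position absent from one of the lists


def merge_row(first, second):
    """Keep every item of `first`, swapping in bracketed items from `second`."""
    merged = []
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is _MISSING:
            break  # `second` is longer than `first`; ignore its extra items
        if b is not _MISSING and b.startswith("[") and b.endswith("]"):
            merged.append(b)
        else:
            merged.append(a)
    return merged


# Same shape as the question's columns, but the last row of second_col
# is deliberately shorter (hypothetical data).
first_col = [["a", "b", "c", "d"], ["a", "b"], ["a", "b", "c"]]
second_col = [["the", "house", "is", "[blue]"], ["the", "[weather]"], ["[the]"]]

result = [merge_row(A, B) for A, B in zip(first_col, second_col)]
print(result)
# [['a', 'b', 'c', '[blue]'], ['a', '[weather]'], ['[the]', 'b', 'c']]
```

With a DataFrame, the same helper could be used in place of the inner comprehension, for example df['first_col'] = [merge_row(A, B) for A, B in zip(df['first_col'], df['second_col'])].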