I have a dataset like this:
UserId | CampaignSource |
---|---|
Potato | 'hello','hello','hello',None |
Carrot | 'hello','hello','hello',None |
Carrot2 | None,None,None,None |
Potato2 | 'kawai','kawai','kawai','kawai',None |
What I want to do is check whether each list contains None values and replace every None with the string already present in that list (for example "hello"), while making sure that a list containing only None values is left unchanged. The expected result is:
UserId | CampaignSource |
---|---|
Potato | 'hello','hello','hello','hello' |
Carrot | 'hello','hello','hello','hello' |
Carrot2 | None,None,None,None |
Potato2 | 'kawai','kawai','kawai','kawai','kawai' |
Are there other ways to approach this? By the way, I couldn't display the column as a list because of a strange formatting error on Stack Overflow. Here is my current approach:
for lst in df_safe_l['CampaignSource']:
    if None in lst:
        for j in set(lst):
            if j:
                lst[:] = [j] * len(lst)
This works, but I am looking for faster alternatives.
CodePudding user response:
You can try converting each list to a Series and then filling the None values from the neighbouring values:
df['CampaignSource'] = df['CampaignSource'].apply(lambda lst: pd.Series(lst).bfill().ffill().tolist())
print(df)
UserId CampaignSource
0 Potato [hello, hello, hello, hello]
1 Carrot [hello, hello, hello, hello]
2 Carrot2 [None, None, None, None]
3 Potato2 [kawai, kawai, kawai, kawai, kawai]
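For a self-contained check, here is a minimal sketch that rebuilds the sample data and applies the same bfill/ffill idea. The DataFrame construction is an assumption (the question only shows a table), and fill_from_neighbours is a hypothetical helper name. Note that this fills each None from its nearest neighbour rather than with one value for the whole list; the two are equivalent whenever a list holds a single distinct non-None value, as in the sample.

```python
import pandas as pd

# Sample data reconstructed from the question's table (assumed layout).
df = pd.DataFrame({
    "UserId": ["Potato", "Carrot", "Carrot2", "Potato2"],
    "CampaignSource": [
        ["hello", "hello", "hello", None],
        ["hello", "hello", "hello", None],
        [None, None, None, None],
        ["kawai", "kawai", "kawai", "kawai", None],
    ],
})

def fill_from_neighbours(lst):
    # Hypothetical helper: object dtype keeps the strings as plain Python
    # objects, so the round trip back to a list does not change their type.
    s = pd.Series(lst, dtype="object")
    # bfill copies the next valid value backwards, ffill then covers any
    # trailing gaps; an all-None list has nothing to copy and stays empty.
    return s.bfill().ffill().tolist()

filled = df["CampaignSource"].apply(fill_from_neighbours)
print(filled)
```

Building one Series per row has noticeable overhead, so this may not be faster than the original loop on large frames; it is worth timing on real data.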
CodePudding user response:
First check whether the list contains only None by using None in set(x) and len(set(x)) == 1. If it does, you don't need to replace anything. If it contains anything other than None, create a new list containing the first string value repeated len(x) times. Try using .apply():
df_safe_l['CampaignSource'] = df_safe_l['CampaignSource'].apply(lambda x: x if None in set(x) and len(set(x)) == 1 else [[i for i in x if isinstance(i, str)][0]] * len(x))
Output:
UserId CampaignSource
0 Potato [hello, hello, hello, hello]
1 Carrot [hello, hello, hello, hello]
2 Carrot2 [None, None, None, None]
3 Potato2 [kawai, kawai, kawai, kawai, kawai]
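Since the question asks for speed, a plain-Python helper may be worth timing against the versions above: it avoids building a set or a Series per row and stops scanning at the first non-None value. This is only a sketch; fill_with_first_value is a hypothetical name, and it assumes each list holds at most one distinct non-None value, as in the sample data.

```python
def fill_with_first_value(lst):
    """Fill a list that contains None with its first non-None value.

    Lists without any None, and lists made up only of None, are
    returned unchanged; otherwise a new list is returned.
    """
    if None not in lst:
        return lst
    # next() stops at the first match instead of scanning the whole list.
    first = next((item for item in lst if item is not None), None)
    if first is None:
        # Only None values: nothing to fill with.
        return lst
    return [first] * len(lst)

# Rows taken from the question's sample table.
rows = [
    ["hello", "hello", "hello", None],
    [None, None, None, None],
    ["kawai", "kawai", "kawai", "kawai", None],
]
result = [fill_with_first_value(row) for row in rows]
print(result)
```

With the DataFrame from the question, this could be applied as df_safe_l['CampaignSource'].apply(fill_with_first_value). Unlike the in-place loop in the question, it returns new lists, so the result has to be assigned back to the column.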