I am reading data from a CSV file and have a dataframe like this:
product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt []
T-shirt ['blue','green']
My expected dataframe looks like this:
product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt
T-shirt ['blue','green']
I want to remove the empty lists. How can I do that in pandas?
Update 1:
I tried the solutions from Scott Boston, Ynjxsjmh and BENY. All of them fill a None value into every row, but I need None only in the rows where the list is empty.
When I run type(df.loc[0,'variations_color'])
it returns str.
CodePudding user response:
You can try this (it assumes the column holds real list objects, not strings):
df['variatons_color'] = df['variatons_color'].apply(lambda lst: lst if len(lst) else '')
print(df)
product_title variatons_color
0 T-shirt [yellow, ornage]
1 T-shirt
2 T-shirt [blue, green]
CodePudding user response:
Assign using a boolean check:
df.loc[~df['variatons_color'].astype(bool),'variatons_color'] = ''
Update: if the column holds strings rather than lists, compare against the text '[]' instead:
df.loc[df['variatons_color'].eq('[]'),'variatons_color'] = ''
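A minimal sketch of why the two variants above give different results, assuming the sample data from the question. The names df_lists and df_strs are made up for illustration, and the string column is built with dtype=object to mimic text read from a CSV file:

```python
import pandas as pd

# Column holding real Python lists
df_lists = pd.DataFrame(
    {'variatons_color': [['yellow', 'ornage'], [], ['blue', 'green']]}
)

# Column holding the text form of those lists (object dtype assumed here)
df_strs = pd.DataFrame(
    {'variatons_color': pd.Series(
        ["['yellow','ornage']", "[]", "['blue','green']"], dtype=object
    )}
)

# An empty list is falsy, so the boolean check flags the second row
empty_lists = ~df_lists['variatons_color'].astype(bool)

# The string '[]' has two characters and is truthy, so nothing is flagged
empty_strs_bool = ~df_strs['variatons_color'].astype(bool)

# Comparing against the literal text '[]' flags the second row
empty_strs_eq = df_strs['variatons_color'].eq('[]')

print(empty_lists.tolist())
print(empty_strs_bool.tolist())
print(empty_strs_eq.tolist())
```

So the boolean check only helps when the cells are lists, while the eq('[]') comparison is the one that matches string cells.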
CodePudding user response:
Just apply len:
df.loc[df['variations_color'].apply(len) == 0, 'variations_color'] = ''
or
df.loc[df['variations_color'].apply(len) == 0, 'variations_color'] = np.nan
Output:
product_title variations_color
0 T-shirt [yellow, orange]
1 T-shirt NaN
2 T-shirt [blue, green]
given this df:
import numpy as np
import pandas as pd
df = pd.DataFrame({'product_title':['T-shirt']*3,
'variations_color':[['yellow', 'orange'],[],['blue', 'green']]})
However, if your dataframe structure is like this:
df = pd.DataFrame({'product_title':['T-shirt']*3,
'variations_color':['[yellow, orange]','[]','[blue, green]']})
Then, you can use the following:
df.loc[df['variations_color'] == '[]', 'variations_color'] = np.nan
Output:
product_title variations_color
0 T-shirt [yellow, orange]
1 T-shirt NaN
2 T-shirt [blue, green]
Note the difference: in the first example,
type(df.loc[0,'variations_color'])
returns list,
and in the second it returns str. The printed representations of the two dataframes are identical, so you can't tell them apart just by printing. In Python it is always important to know the datatype of the object you're working with.
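One way to cope with not knowing which of the two cases you have is to check the type of each cell before deciding whether it is "empty". The sketch below is only an illustration under that assumption; blank_empty_lists and is_empty are hypothetical helper names, not pandas functions:

```python
import numpy as np
import pandas as pd

def blank_empty_lists(series, fill=np.nan):
    """Return a copy of `series` where empty lists, or the text '[]', become `fill`."""
    def is_empty(value):
        if isinstance(value, str):
            # String cell: compare against the text form of an empty list
            return value.strip() == '[]'
        if isinstance(value, list):
            # List cell: an empty list has length zero
            return len(value) == 0
        return False

    mask = series.apply(is_empty).astype(bool)
    result = series.copy()
    result[mask] = fill
    return result

# Works on a column of real lists ...
lists = pd.Series([['yellow', 'orange'], [], ['blue', 'green']])
print(blank_empty_lists(lists))

# ... and on a column of strings
strs = pd.Series(["['yellow','orange']", "[]", "['blue','green']"])
print(blank_empty_lists(strs, fill=''))
```

Usage on the dataframe would then look like df['variations_color'] = blank_empty_lists(df['variations_color']).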
CodePudding user response:
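import ast
import pandas as pd
df = pd.DataFrame({'product_title':['T-shirt']*3,
'variations_color':[['yellow', 'orange'],[],['blue', 'green']]})
# Parse the text form safely with ast.literal_eval (instead of eval) and replace empty lists with None
df['variations_color'] = df['variations_color'].apply(lambda x: None if len(ast.literal_eval(str(x))) == 0 else x)
df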
CodePudding user response:
Look here!
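import ast
import pandas as pd
from io import StringIO
data = '''
product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt []
T-shirt ['blue','green']
'''
# sep=r'\s+' splits on whitespace (delim_whitespace is deprecated in recent pandas versions)
df = pd.read_csv(StringIO(data), sep=r'\s+')
# ast.literal_eval parses the list text without executing arbitrary code, unlike eval
df.variatons_color = df.variatons_color.apply(ast.literal_eval)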
df
'''
product_title variatons_color
0 T-shirt [yellow, ornage]
1 T-shirt []
2 T-shirt [blue, green]
'''
type(df.iat[0, 1])
# list
# Note: applymap is deprecated since pandas 2.1; on newer versions use df.map(len) instead
df.mask(df.applymap(len) == 0, None)
'''
product_title variatons_color
0 T-shirt [yellow, ornage]
1 T-shirt None
2 T-shirt [blue, green]
'''
Done!
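Since the question says the data comes from a CSV file, the text cells could also be parsed into lists while reading, using the converters parameter of pd.read_csv. This is a sketch under the assumption that the file is whitespace-separated like the sample above and that every cell in the column is a valid Python list literal:

```python
import ast
from io import StringIO

import pandas as pd

# Sample data from the question, inlined in place of a real CSV file
data = """product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt []
T-shirt ['blue','green']
"""

# converters runs ast.literal_eval on each cell of the column while parsing,
# so the column holds real lists instead of strings
df = pd.read_csv(
    StringIO(data),
    sep=r'\s+',
    converters={'variatons_color': ast.literal_eval},
)

# An empty list is falsy, so only those rows are replaced with None
df['variatons_color'] = df['variatons_color'].apply(
    lambda lst: lst if lst else None
)
print(df)
```

With the column parsed up front, the list-based answers in this thread apply directly, and no eval call is needed.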