Why is the pandas explode method not working on my DataFrame?

Time:06-01

Here is my code:

df = pd.read_csv('my_path\\zzounds.csv')
df.head() 

      variation_type       main_image
  ['yellow', 'orange']   ['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']

I tried this code:

df.explode(['variation_type','main_image'])

But it's returning the original dataframe.

CodePudding user response:

I believe it's because pandas struggles to explode multiple columns in this manner (passing a list of columns to explode() is only supported in pandas 1.3.0 and later).

You can use this code to get the results I believe you are expecting:

import pandas as pd

data = {
    'variation_type' : [['yellow', 'orange']],
    'main_image' : [['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']]
}
df = pd.DataFrame(data)
# Explode every column independently; each Series is expanded row by row
df.apply(pd.Series.explode)

Note: This will only work if, within each row, all of the list fields have the same length.
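
One way to check that condition before exploding is to compare the list lengths row by row. This is only a sketch: the sample data, the shortened file names, and the names df_len, lengths, and row_ok are made up for illustration.

```python
import pandas as pd

# Made-up sample: the second row has lists of different lengths.
df_len = pd.DataFrame({
    'variation_type': [['yellow', 'orange'], ['red']],
    'main_image': [['a.jpg', 'b.jpg'], ['c.jpg', 'd.jpg']],
})
cols = ['variation_type', 'main_image']

# .str.len() also works on list values and returns the length of each list.
lengths = df_len[cols].apply(lambda col: col.str.len())

# A row is safe to explode when all of its lists share a single length.
row_ok = lengths.nunique(axis=1).eq(1)

# Explode only the rows that passed the check.
exploded = df_len[row_ok].apply(pd.Series.explode)
print(exploded)
```

Rows where row_ok is False can then be inspected or repaired separately instead of failing in the middle of the explode.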

CodePudding user response:

Just to narrow down the issue, note that the dataframe displayed in your question does indeed work with explode(). If your values are strings that look like lists, then as suggested by @Ynjxsjmh, it may be necessary to convert them to list values first.
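
To tell which case you are in, it can help to look at the Python types actually stored in the columns. The sketch below assumes the CSV was produced by writing real lists to disk; the in-memory CSV, the shortened file names, and the names df_check and type_names are stand-ins for illustration, not your actual file.

```python
import io
import pandas as pd

# Stand-in for the real file: list values written to CSV come back as text.
csv_text = pd.DataFrame({
    'variation_type': [['yellow', 'orange']],
    'main_image': [['a.jpg', 'b.jpg']],
}).to_csv(index=False)

df_check = pd.read_csv(io.StringIO(csv_text))

# Name of the Python type stored in each cell ('str' here, not 'list').
type_names = df_check.apply(lambda col: col.map(lambda v: type(v).__name__))
print(type_names.iloc[0].to_dict())

# explode() leaves non-list values untouched, so the frame comes back unchanged.
unchanged = df_check.explode('variation_type').equals(df_check)
print(unchanged)
```

If the types come out as str rather than list, explode() has nothing to expand, which would match the behaviour described in the question.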

Sample test code:

import pandas as pd
df = pd.DataFrame({
'variation_type':[['yellow', 'orange']],
'main_image':[['https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg', 'https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg']]})
print(df.to_string())

df = df.explode(['variation_type','main_image'])
print(df.to_string())

Input:

     variation_type                                                                                                                                                                                                                                                                  main_image
0  [yellow, orange]  [https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg, https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg]

Output:

  variation_type                                                                                                                           main_image
0         yellow  https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg
0         orange  https://c1.zzounds.com/media/productmedia/fit,600by600/quality,85/GIGBAR_MOVE_ON_TRIPOD_812129-0eddea01276623f9a03cbfdd86eb3bd1.jpg

CodePudding user response:

Since you are reading from a CSV file, the list values are loaded as strings, so you need to convert them to real lists first:

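import numpy as np
import pandas as pd

# Parse each string such as "['yellow', 'orange']" into a list; 'nan' is mapped to np.nan
df[['variation_type','main_image']] = df[['variation_type','main_image']].applymap(lambda x: pd.eval(x, local_dict={'nan': np.nan}))
# or, with the built-in eval (trusted input only, since eval executes arbitrary code)
df[['variation_type','main_image']] = df[['variation_type','main_image']].applymap(lambda x: eval(x, {'nan': np.nan}))

df = df.explode(['variation_type','main_image'])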
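
If the file may contain untrusted data, ast.literal_eval from the standard library is a possible alternative to eval, because it only accepts Python literals. This is a sketch rather than a drop-in replacement: the in-memory CSV, the shortened file names, and the helper parse_list are made up for illustration, and a cell such as "[nan]" would raise an error instead of being converted.

```python
import ast
import io
import pandas as pd

# Stand-in for the real file: list values stored as text in a CSV.
csv_text = pd.DataFrame({
    'variation_type': [['yellow', 'orange']],
    'main_image': [['a.jpg', 'b.jpg']],
}).to_csv(index=False)

df_csv = pd.read_csv(io.StringIO(csv_text))
cols = ['variation_type', 'main_image']

def parse_list(value):
    """Parse a list literal; leave non-string values (e.g. NaN) unchanged."""
    if isinstance(value, str):
        return ast.literal_eval(value)
    return value

for col in cols:
    df_csv[col] = df_csv[col].map(parse_list)

# Passing a list of columns to explode() needs pandas 1.3.0 or later.
df_out = df_csv.explode(cols)
print(df_out)
```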