Assign a multidimensional list to a Pandas DataFrame cell

I have an existing Pandas DataFrame that contains image paths, and I need to add a column in which each cell contains a multidimensional array representing the corresponding image.

Here is an example:

import pandas as pd
import numpy as np

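# one inner list per row, so that "path" holds the file names and "B" the numbers
df = pd.DataFrame(data=[["test.png", 3], ["dog.png", 4]], columns=["path", "B"])
# creating a new empty column
df = df.assign(image=np.nan)
image = ...  # placeholder: the array read from the image path in row 1
df.iloc[1, df.columns.get_loc("image")] = image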

but I keep getting the error: ValueError: Must have equal len keys and value when setting with an ndarray.

How can I fix that? I've already tried to follow this but it didn't work for me.

To be clear, in my real DataFrame the image value in the n-th row depends on the path value in the same row.

Expected result:

   path         B  image
0  "test.png"   3    NaN
1  "dog.png"    4  [[1,2,...], [255,255,...], ...]

CodePudding user response:

Use the PIL.Image module to open each file as an image object that can be converted to an array:

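import numpy as np
import pandas as pd
from PIL import Image

df = pd.DataFrame({'path': ["stackoverflow-icon.png", "../images/wall.jpg"],
                   'B': [3, 4]})
# open the file named in each row's 'path' and store its pixel array in 'image'
df['image'] = df.apply(lambda x: np.asarray(Image.open(x['path'])), axis=1)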

print(df)

Sample output:

                                          path  ...                                              image
0                       stackoverflow-icon.png  ...  [[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0...
1                           ../images/wall.jpg  ...  [[[81, 127, 213], [87, 132, 213], [83, 127, 20...

[2 rows x 3 columns]
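
If only a single cell needs to be set, as in the question, the ValueError appears to come from .iloc trying to spread the array's elements over the selected cells, combined with the column being float64 because it was created from np.nan. Below is a minimal sketch, assuming it is acceptable for the column to have object dtype; the small synthetic array is a stand-in for a decoded image and is not read from disk.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"path": ["test.png", "dog.png"], "B": [3, 4]})

# An object-dtype column can hold one arbitrary Python object per cell.
df["image"] = pd.Series([None] * len(df), index=df.index, dtype=object)

# Stand-in for a decoded image (height 2, width 2, 3 channels).
image = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# .at addresses exactly one cell by row label and column name,
# so the whole array is stored as a single value.
df.at[1, "image"] = image

print(df.at[1, "image"].shape)
```

In this sketch the cells that were not set hold None rather than NaN. In real code the stand-in array would be replaced by np.asarray(Image.open(...)) on the path from the same row.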