Loading images from their respective names mentioned in csv column

Time:07-13

I have a dataset of images and a corresponding CSV file (loaded into a dataframe) containing the names and other information for these images. There are about 7000 images in total, but after pre-processing the dataframe only 3000 image names remain. I want to load only the images whose names appear in the dataframe.

The image names in the dataframe look like this:

|    images     |
1_IM-0001-4001.dcm.png
2_IM-0001-4001.dcm.png
3_IM-0001-4001.dcm.png

The full (absolute) path of each image also includes the directory, like this:

/content/ChestXR/images/images_normalized/1004_IM-0005-1001.dcm.png

I want to run a loop that reads only the images listed in the dataframe column. For this I need the directory path joined with each image name from the column:

for images in os.listdir(path):
    if (images.endswith(".png") or images.endswith(".jpg") or images.endswith(".jpeg")):
        image_path = path + {df["images"]}

where the image directory path is:

 path = "/content/drive/MyDrive/IU-Xray/images/images_normalized"

and the dataframe column is:

 df["images"]

However, the line below does not work in my loop and raises "TypeError: unhashable type: 'Series'":

 image_path = path + {df["images"]}

CodePudding user response:

This may not be a complete answer, but I think it will get you close.

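import os

path = "/content/drive/MyDrive/IU-Xray/images/images_normalized"
filestodownload = []

# Iterate over the names in the dataframe column, not the directory listing.
for images in df["images"]:
    # str.endswith accepts a tuple of suffixes.
    if images.endswith((".png", ".jpg", ".jpeg")):
        # os.path.join inserts the path separator.
        filestodownload.append(os.path.join(path, images))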

You will then have a list of the full paths of the images you need.

Check that iterating over df["images"] works for your data; you can also convert the column to a list first (df["images"].tolist()) if that is easier.
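
The loop in this answer could be wrapped in a small helper that also skips names whose file is missing on disk. This is only a sketch: the name build_image_paths and the existence check are assumptions added for illustration, not part of the original answer. It accepts any iterable of names, so either the column itself or a plain list should work.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def build_image_paths(image_dir, names):
    """Return full paths for the image names that exist in image_dir.

    build_image_paths is a hypothetical helper, not part of the original answer.
    """
    paths = []
    for name in names:
        # str.endswith accepts a tuple, so one call covers every extension.
        if not name.lower().endswith(IMAGE_EXTENSIONS):
            continue
        full_path = os.path.join(image_dir, name)
        # Skip names that are listed but missing on disk.
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

With the variables from the question, a call would look like image_paths = build_image_paths(path, df["images"]).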

CodePudding user response:

As you are using pandas, you can do something like this:

path = "/content/drive/MyDrive/IU-Xray/images/images_normalized/"

mask = df['images'].str.contains(r'\.(?:png|jpg|jpeg)$')
full_path = path + df.images[mask]

print(full_path[1])
# /content/drive/MyDrive/IU-Xray/images/images_normalized/2_IM-0001-4001.dcm.png
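
If some names in the column might not exist in the folder, the same vectorised idea could be combined with a directory listing. The sketch below is illustrative only: the helper name existing_image_paths is hypothetical, and it assumes the column holds plain file names with no sub-directories.

```python
import os

import pandas as pd


def existing_image_paths(df, image_dir, column="images"):
    """Return a Series of full paths for names in df[column] found in image_dir.

    existing_image_paths is a hypothetical helper for illustration only.
    """
    on_disk = set(os.listdir(image_dir))
    # Keep only rows whose file name is actually present in the folder.
    mask = df[column].isin(on_disk)
    # os.path.join(image_dir, "") guarantees a trailing separator;
    # str + Series then prepends the directory to every name.
    return os.path.join(image_dir, "") + df.loc[mask, column]
```

The result keeps the original index labels of the dataframe, so use .iloc for positional access or .tolist() to get a plain list of paths.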