Python - Dataframe from an info_dict

Time:07-27

I have a Python script that pulls metadata from photos. The script works, and everything goes into one column of a dataframe; however, I do not know how to put each piece of data in its own column.

Here is my script:

import os
import pathlib

import pandas as pd
from PIL import Image

# rootdir: folder containing the photos, set elsewhere in the script
for file in os.listdir(rootdir):
    try:
        # read the image data using PIL
        image = Image.open(os.path.join(rootdir, file))

        # extract other basic metadata
        info_dict = {
            "FileName": os.path.basename(image.filename),
            "FileSize": os.path.getsize(image.filename),
            "FilePath": pathlib.Path(image.filename).suffix,  # file extension
            "DPI": image.info['dpi'][0],
            "Height": image.height,
            "Width": image.width,
            "Format": image.format,
            "Mode": image.mode,
            "Frames": getattr(image, "n_frames", 1)
        }

        # join every value into one comma-separated string
        line = ",".join([str(val) for val in info_dict.values()])

        dpi = []
        dpi.append(line)
        DPIDf = pd.DataFrame(dpi)
        DPIDf.columns = ['FileName']
        print(DPIDf)
    except Exception:
        # skip files that PIL cannot read or that have no 'dpi' entry
        continue

Right now everything goes into one column of the dataframe. What I would like to know is how to put each piece of data in info_dict into its own column.

I know that DPIDf.columns needs one name per key in info_dict: DPIDf.columns = ['FileName', 'FileSize', 'FilePath', 'DPI', 'Height', 'Width', 'Format', 'Mode', 'Frames']

I just need to know how to get each piece of data into its own column.
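
The single column seems to follow from the join step: ",".join(...) collapses every value into one string, and a list holding one string becomes a one-cell frame. A minimal sketch of that effect, using made-up sample values in place of real image metadata (the values and the three-key dict are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample values; a real run would fill these from PIL.
info_dict = {"FileName": "a.jpg", "Height": 600, "Width": 800}

# Joining the values produces ONE string, so the frame has a single cell.
joined = pd.DataFrame([",".join(str(v) for v in info_dict.values())])

# Passing the dict itself (wrapped in a list) gives one column per key.
per_key = pd.DataFrame([info_dict])

print(joined.shape)   # (1, 1)
print(per_key.shape)  # (1, 3)
```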

CodePudding user response:

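import os
import pathlib

import pandas as pd
from PIL import Image

# rootdir: folder containing the photos, set elsewhere in the script
info_list = []
for file in os.listdir(rootdir):
    try:
        # read the image data using PIL
        image = Image.open(os.path.join(rootdir, file))

        # extract other basic metadata, one list per image
        info_list.append([
            os.path.basename(image.filename),
            os.path.getsize(image.filename),
            pathlib.Path(image.filename).suffix,  # file extension
            image.info['dpi'][0],
            image.height,
            image.width,
            image.format,
            image.mode,
            getattr(image, "n_frames", 1)
        ])
    except Exception:
        # skip files that cannot be opened or that have no 'dpi' entry
        pass

# build the frame once, after the loop, with one column per field
DPIDf = pd.DataFrame(info_list, columns=["FileName", "FileSize", "FilePath", "DPI", "Height", "Width", "Format", "Mode", "Frames"])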

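This collects one list per image inside the loop and builds the DataFrame once at the end, so each field lands in its own column. Is this what you are looking for?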
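
If you would rather keep the dictionary, another option is to append one dict per image and pass the list of dicts to pd.DataFrame, which takes the column names from the keys. The sketch below is only an illustration: the sample records stand in for what the loop would read from each image, and first_dpi is a hypothetical helper that returns None when the 'dpi' entry is absent, so such files are kept instead of being skipped.

```python
import pandas as pd

def first_dpi(info):
    """Return the horizontal DPI from an image.info-style dict, or None."""
    dpi = info.get("dpi")
    return dpi[0] if dpi else None

# Hypothetical stand-ins for the values read from each image in the loop.
samples = [
    {"name": "a.jpg", "info": {"dpi": (72, 72)}, "size": (800, 600)},
    {"name": "b.png", "info": {}, "size": (640, 480)},  # no 'dpi' entry
]

rows = []
for s in samples:
    # One dict per image; the keys become the column names.
    rows.append({
        "FileName": s["name"],
        "DPI": first_dpi(s["info"]),
        "Width": s["size"][0],
        "Height": s["size"][1],
    })

df = pd.DataFrame(rows)
print(df)
```

In the real loop, the same dict would be filled from image.filename, image.info, image.width and image.height, as in the question.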
