How to create a list of DICOM files and convert it to a single numpy array .npy?

I have a problem that I don't know how to solve. I'm learning how to analyze DICOM files with Python.

I have one exam from a single patient: 200 DICOM files, each 512x512 and each representing a different slice of the patient. I want to turn them into a single .npy file so I can use it in another tutorial that I found online.

Many tutorials convert them to JPG or PNG with OpenCV first, but I don't want that: I'm not interested in a viewable image right now, I need the array. That step also degrades the quality of the images.

I already know that using:

medical_image = pydicom.read_file(file_path)
image = medical_image.pixel_array

I can take the path, turn one slice into a pixel array and then use it, but it doesn't work in a for loop.

The for loop I tried was basically this:

image = [] #  to create an empty list

for f in glob.iglob('file_path'):
    img = pydicom.dcmread(f)
    image.append(img)

This produces a list with all the files. Up to here it goes well, but it seems it's not the right way: I can't find the next steps for working with the list anywhere, not even answers to the errors I get at this point (so I concluded it was wrong).

CodePudding user response：

The following code snippet reads DICOM files from a folder dir_path and stores them in a list. Strictly speaking, the list does not hold the raw DICOM datasets: it is filled with NumPy arrays of Hounsfield units (obtained with the apply_modality_lut function).

import os
from pathlib import Path
import pydicom
from pydicom.pixel_data_handlers import apply_modality_lut

dir_path = r"path\to\dicom\files"

dicom_set = []
for root, _, filenames in os.walk(dir_path):
    for filename in filenames:
        dcm_path = Path(root, filename)
        if dcm_path.suffix == ".dcm":
            try:
                dicom = pydicom.dcmread(dcm_path, force=True)
            except IOError as e:
                print(f"Can't import {dcm_path.stem}")
            else:
                hu = apply_modality_lut(dicom.pixel_array, dicom)
                dicom_set.append(hu)
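
To get from such a list to the single .npy file the question asks for, the slices can be stacked and saved with NumPy. This is only a minimal sketch, assuming every slice has the same shape. The helper name stack_and_save and the file name volume.npy are hypothetical, and the synthetic arrays below stand in for the dicom_set list so the example runs without any DICOM files.

```python
import os
import tempfile

import numpy as np


def stack_and_save(slices, out_path):
    """Stack equally sized 2-D slices into one 3-D array and save it as .npy."""
    # np.stack raises ValueError if the slices do not all share one shape
    volume = np.stack(slices, axis=0)  # shape: (n_slices, rows, cols)
    np.save(out_path, volume)
    return volume


# Synthetic stand-in for dicom_set: 4 slices of 8x8 (assumption for the demo)
demo_slices = [np.full((8, 8), k, dtype=np.float64) for k in range(4)]

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "volume.npy")  # hypothetical file name
    demo_volume = stack_and_save(demo_slices, out_path)
    reloaded = np.load(out_path)  # read the file back to check the round trip
```

With the real data you would call stack_and_save(dicom_set, "volume.npy") and load the result elsewhere with np.load. Here the slice index is the first axis; pass a different axis to np.stack if the next tutorial expects another layout.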

CodePudding user response：

You were well on your way. You just have to build up a volume from the individual slices that you read in. This code snippet will create a pixelVolume of dimension 512x512x200 if your data is as advertised.

import glob
import numpy
import pydicom

images = [] #  to create an empty list

# Read all of the DICOM images into list "images".
# 'file_path' is a placeholder: pass a glob pattern that matches your files.
for f in glob.iglob('file_path'):
    image = pydicom.dcmread(f)
    images.append(image)


# Use the first image to determine the number of rows and columns
repImage = images[0]
rows=int(repImage.Rows)
cols=int(repImage.Columns)
slices=len(images)

# This tuple represents the dimensions of the pixel volume
volumeDims = (rows, cols, slices)

# allocate storage for the pixel volume
pixelVolume = numpy.zeros(volumeDims, dtype=repImage.pixel_array.dtype)

# fill in the pixel volume one slice at a time
for i, image in enumerate(images):
    pixelVolume[:,:,i] = image.pixel_array

#Use pixelVolume to do something interesting

I don't know if you are a DICOM expert or a DICOM novice, but I am just accepting your claim that your 200 images make sense when interpreted as a volume. There are many ways that this may fail. The slices may not be in expected order. There may be multiple series in your study. But I am guessing you have a "nice" DICOM dataset, maybe used for tutorials, and that this code will help you take a step forward.
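
The ordering and multiple-series caveats can be dealt with before the volume is built. The following is a sketch, not a definitive recipe. It assumes each dataset carries the standard SeriesInstanceUID and ImagePositionPatient attributes, and that the slices are axial so the third position component orders them. sort_slices is a hypothetical helper, and the SimpleNamespace objects stand in for datasets returned by pydicom.dcmread so the example runs without files.

```python
from collections import defaultdict
from types import SimpleNamespace


def sort_slices(datasets):
    """Keep the largest series and order its slices by z position."""
    # Group datasets by series so slices from different series are not mixed
    by_series = defaultdict(list)
    for ds in datasets:
        by_series[ds.SeriesInstanceUID].append(ds)
    # Assumption: the series with the most slices is the one of interest
    largest = max(by_series.values(), key=len)
    # Assumption: axial slices, so the z component gives the slice order
    return sorted(largest, key=lambda ds: float(ds.ImagePositionPatient[2]))


# Stand-ins for pydicom datasets (only the two attributes used above)
demo = [
    SimpleNamespace(SeriesInstanceUID="A", ImagePositionPatient=[0, 0, 5.0]),
    SimpleNamespace(SeriesInstanceUID="B", ImagePositionPatient=[0, 0, 1.0]),
    SimpleNamespace(SeriesInstanceUID="A", ImagePositionPatient=[0, 0, -2.5]),
    SimpleNamespace(SeriesInstanceUID="A", ImagePositionPatient=[0, 0, 0.0]),
]
ordered = sort_slices(demo)
z_positions = [ds.ImagePositionPatient[2] for ds in ordered]
```

In the snippet above, this would be applied as images = sort_slices(images) right after the reading loop, so that the slice index in pixelVolume follows the physical position instead of the file name order.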