I have a 3D numpy array of shape (7,100,50) that represents a stack of 7 100x50 images.
I want to convert this array to a DataFrame containing the position of every pixel (x, y, z) and the value of the pixel (id).
I have managed to do this for a single image (no z):
import numpy as np
import pandas as pd
img = np.random.randint(0, 30, size=(100, 50))
cell_id = img.flatten()
# flatten() is row-major, so the column index (x) cycles fastest
x = [i % img.shape[1] for i in range(len(cell_id))]
# the row index (y) advances once per full row of img.shape[1] pixels
y = [y_ for y_ in range(img.shape[0]) for _ in range(img.shape[1])]
df = pd.DataFrame(data={"id": cell_id, "x": x, "y": y, "z": 0})
df:
       id    x    y    z
0      29    0    0    0
1      16    1    0    0
2       3    2    0    0
3      15    3    0    0
4      23    4    0    0
...   ...  ...  ...  ...
4995    7   45   99    0
4996    6   46   99    0
4997    1   47   99    0
4998    5   48   99    0
4999    7   49   99    0
5000 rows × 4 columns
How do I adjust this to work for
zimg = np.random.randint(0,30,size=(7,100,50))
?
CodePudding user response:
I see you mentioned np.ndenumerate in another comment; this should do the trick:
import pandas as pd
import numpy as np
def constructor(array, z=0):
    """Transform an array into df
    Here we assume z=0 as in your example
    """
    for (img_id, y, x), value in np.ndenumerate(array):
        yield (img_id, value, x, y, z)
a = np.random.randint(0,30,size=(7,100,50))
df = pd.DataFrame(
    constructor(a),
    columns=('image_id', 'id', 'x', 'y', 'z')
)
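If z is meant to be the image's position in the stack rather than a constant, the first index returned by np.ndenumerate can be used as z directly. A minimal sketch of that variant, assuming the axes are ordered (z, y, x) as in the question; the name stack_to_rows is made up for illustration:

```python
import numpy as np
import pandas as pd

def stack_to_rows(stack):
    """Yield one (id, x, y, z) tuple per pixel of a (z, y, x) stack."""
    # np.ndenumerate yields (index_tuple, value) pairs in row-major order
    for (z, y, x), value in np.ndenumerate(stack):
        yield (value, x, y, z)

stack = np.random.randint(0, 30, size=(7, 100, 50))
df = pd.DataFrame(stack_to_rows(stack), columns=["id", "x", "y", "z"])
```

This still loops over every pixel in Python, so it is likely to be slower than a vectorized approach on large stacks.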
CodePudding user response:
import numpy as np
import pandas as pd
img = np.random.randint(0, 30, size=(7, 100, 50))  # axes are (z, y, x)
mapping = {
    'x': [],
    'y': [],
    'z': [],
    'id': [],
}
for z in range(img.shape[0]):          # image index in the stack
    for y in range(img.shape[1]):      # row
        for x in range(img.shape[2]):  # column
            mapping['x'].append(x)
            mapping['y'].append(y)
            mapping['z'].append(z)
            mapping['id'].append(img[z, y, x])
df = pd.DataFrame.from_dict(mapping)
df.head()
Alternatively, you could repeat your single-image approach once per image (7 times), setting z to the image's index each time, and concatenate the resulting tables with pd.concat.
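The pd.concat idea could look roughly like this. It is only a sketch: image_to_df is a hypothetical helper name, and it assumes each image's rows map to y and its columns to x, as in the question.

```python
import numpy as np
import pandas as pd

def image_to_df(img, z):
    """Flatten one 2D image (rows = y, columns = x) into a long table."""
    n_rows, n_cols = img.shape
    return pd.DataFrame({
        "id": img.ravel(),                          # pixel values, row-major
        "x": np.tile(np.arange(n_cols), n_rows),    # column index cycles fastest
        "y": np.repeat(np.arange(n_rows), n_cols),  # row index changes once per row
        "z": z,                                     # same z for the whole image
    })

zimg = np.random.randint(0, 30, size=(7, 100, 50))

# Iterating over a 3D array yields its 2D slices along the first axis
df = pd.concat(
    [image_to_df(img, z) for z, img in enumerate(zimg)],
    ignore_index=True,
)
```

Passing ignore_index=True gives the combined table one continuous index instead of repeating 0..4999 for each image.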
CodePudding user response:
I will use the same layout but a smaller shape, (2,10,5), so that the output is easy to interpret and the code can run on the online compiler where I am validating it. You can substitute the original shape of (7,100,50).
import numpy as np
import pandas as pd
arr = np.random.randint(0,30,size=(2,10,5)) #axes are (z, y, x)
arr[arr==0] = -1 #replace 0 with -1 so np.nonzero returns every pixel
val = np.transpose(np.nonzero(arr)) #get pixel indices as 2d array
ids = arr[np.nonzero(arr)] #get pixel values as 1d array (avoids shadowing built-in id)
df = pd.DataFrame.from_records(val) #create df
df = df.set_index(ids) #set new index
df.columns = ['z','y','x'] #set column names in axis order
df.index.name = 'id' #set index column name
df = df.reset_index() #reset index to get id as column
df = df.clip(lower=0) #replace -1 in id with 0
print(df.head(100))
Output:
    id  z  y  x
0    6  0  0  0
1   17  0  0  1
2   19  0  0  2
3   26  0  0  3
4   12  0  0  4
..  .. .. .. ..
95  16  1  9  0
96  26  1  9  1
97   8  1  9  2
98   5  1  9  3
99  13  1  9  4
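The 0 to -1 replacement above is only needed because np.nonzero skips zero-valued pixels. As a possible alternative, np.indices builds the coordinate grids regardless of the pixel values. This is a sketch that assumes the axes are ordered (z, y, x), which the question implies but does not state outright:

```python
import numpy as np
import pandas as pd

stack = np.random.randint(0, 30, size=(2, 10, 5))  # assumed axes: (z, y, x)

# np.indices returns one index grid per axis, each with the stack's shape
z, y, x = np.indices(stack.shape)

# ravel() flattens every grid in the same row-major order as the values
df = pd.DataFrame({
    "id": stack.ravel(),
    "x": x.ravel(),
    "y": y.ravel(),
    "z": z.ravel(),
})
print(df.head())
```

Because nothing is looped over in Python and the array is not modified, the same code should work unchanged for the original (7,100,50) shape.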