Getting basic stats from np.array within a for loop in Python

Time:07-10

I don't have much Python experience and I'm trying something rather complicated for me, so please excuse my messy code. I have a few arrays that were generated with rasterio from raster layers (tif), and ultimately I want to get some basic statistics from each raster layer and append them to a data frame. I'm trying to automate this as much as possible since I have a lot of layers to go through. Another obstacle was getting the column name to change according to each raster. I managed to work almost everything out; the problem is that when I put it into a for loop, instead of the stats values I get this: <built-in method values of dict object at 0x00.. I would appreciate help solving that.

import rasterio
from osgeo import gdal
import numpy as np
import pandas as pd

# Open all files (I have a lot of folders like this one to open)
# Griffin data read
Gr_1A_hh_path = r"E:\SAOCOM\1A1B\Images\Griffin\130122\Source\Data\gtc-acqId0000705076-a-sm9-2201150146-hh-h.tif"
Gr_1A_hh = rasterio.open(Gr_1A_hh_path)

Gr_1A_vv_path = r"E:\SAOCOM\1A1B\Images\Griffin\130122\Source\Data\gtc-acqId0000705076-a-sm9-2201150146-vv-h.tif"
Gr_1A_vv = rasterio.open(Gr_1A_vv_path)

Gr_1A_vh_path = r"E:\SAOCOM\1A1B\Images\Griffin\130122\Source\Data\gtc-acqId0000705076-a-sm9-2201150146-vh-h.tif"
Gr_1A_vh = rasterio.open(Gr_1A_vh_path)

Gr_1A_hv_path = r"E:\SAOCOM\1A1B\Images\Griffin\130122\Source\Data\gtc-acqId0000705076-a-sm9-2201150146-hv-h.tif"
Gr_1A_hv = rasterio.open(Gr_1A_hv_path)

#reading all the rasters as arrays
array_1A_hh= Gr_1A_hh.read()
array_1A_vv= Gr_1A_vv.read()
array_1A_vh= Gr_1A_vh.read()
array_1A_hv= Gr_1A_hv.read()

#creating a dictionary so that each array would have a name that would be used as column name
A2 = {
   "HH":array_1A_hh,
   "VV":array_1A_vv,
   "VH":array_1A_vh,
   "HV":array_1A_hv}

df= pd.DataFrame(index=["min","max","mean","medien"])
for name, pol in A2.items():
   for band in pol:
       stats = {
       "min":band.min(),
       "max":band.max(),
       "mean":band.mean(),
       "median":np.median(band)}
       df[f"{name}"]=stats.values

OUTPUT:
df
                                                      HH  ...                                                 HV
min     <built-in method values of dict object at 0x00...  ...  <built-in method values of dict object at 0x00...
max     <built-in method values of dict object at 0x00...  ...  <built-in method values of dict object at 0x00...
mean    <built-in method values of dict object at 0x00...  ...  <built-in method values of dict object at 0x00...
medien  <built-in method values of dict object at 0x00...  ...  <built-in method values of dict object at 0x00...

CodePudding user response:

Suppose you have a dict of images, each an array of shape (bands, height, width):

import numpy as np
import pandas as pd

vmin, vmax = 0, 255
C, H, W = 2, 64, 64

images_names = ["HH", "VV", "VH", "HV"]
images = {
    im_name: np.random.randint(vmin, vmax, size=(C, H, W))
    for im_name in images_names
}

And a dict of functions that compute statistics on a per-band basis:

stats_functions = {
    "min": lambda band: band.min(),
    "max": lambda band: band.max(),
    "mean": lambda band: band.mean(),
    "median": lambda band: np.median(band),
}
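
To make the role of this dict concrete, here is a small sketch that applies every function to a single band. The tiny 2x2 `band` array and the `band_stats` name are made up for illustration; they are not part of the original data.

```python
import numpy as np

# Same per-band statistics functions as above.
stats_functions = {
    "min": lambda band: band.min(),
    "max": lambda band: band.max(),
    "mean": lambda band: band.mean(),
    "median": lambda band: np.median(band),
}

# Hypothetical 2x2 band, used only for illustration.
band = np.array([[1, 2], [3, 10]])

# Call each function on the band and collect the results by statistic name.
band_stats = {
    stat_name: stat_func(band)
    for stat_name, stat_func in stats_functions.items()
}
# band_stats holds min=1, max=10, mean=4.0, median=2.5
```

Adding another statistic then only requires one more entry in `stats_functions`, for example a standard deviation via `band.std()`.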

You can first construct a dict of statistics:

images_stats = {
    im_name: {
        band_idx: {
            stat_name: stat_func(band)
            for stat_name, stat_func in stats_functions.items()
        }
        for band_idx, band in enumerate(im)
    }
    for im_name, im in images.items()
}

And then convert it to a pandas DataFrame. Passing a dict to pd.concat uses its keys (the image names) as the top level of the column index:

images_stats_df = pd.concat(
    {
        im_name: pd.DataFrame(im_stats)
        for im_name, im_stats in images_stats.items()
    },
    axis="columns",
)

Which gives:

>>> images_stats_df
                HH                      VV                      VH                     HV
                 0           1           0           1           0          1           0           1
min       0.000000    0.000000    0.000000    0.000000    0.000000    0.00000    0.000000    0.000000
max     254.000000  254.000000  254.000000  254.000000  254.000000  254.00000  254.000000  254.000000
mean    127.070557  126.082764  126.483643  127.737061  127.270996  128.89502  128.814209  124.610352
median  129.000000  127.000000  126.000000  127.000000  127.000000  130.00000  129.000000  122.000000
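
As for why the original loop produced <built-in method values of dict object ...>: `stats.values` without parentheses is the method object itself, not its result, so pandas stores that object in every row. A minimal sketch of the direct fix follows, assuming single-band rasters. The random arrays and the seed are stand-ins for the rasterio data and are not from the question.

```python
import numpy as np
import pandas as pd

# Stand-in data (an assumption): random (bands, height, width) arrays
# in place of the arrays read with rasterio in the question.
rng = np.random.default_rng(0)
A2 = {
    name: rng.integers(0, 255, size=(1, 64, 64))
    for name in ["HH", "VV", "VH", "HV"]
}

df = pd.DataFrame(index=["min", "max", "mean", "median"])
for name, pol in A2.items():
    for band in pol:
        stats = {
            "min": band.min(),
            "max": band.max(),
            "mean": band.mean(),
            "median": np.median(band),
        }
        # stats.values is the bound method; stats.values() calls it.
        # list(...) turns the returned view into a plain sequence whose
        # order matches the index labels above.
        df[name] = list(stats.values())
```

Note that with a multi-band raster each band would overwrite the previous one in `df[name]`, which is why the nested-dict approach above keeps one column per band.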