Append values to lists after raster sampling, in a loop

Time:12-03

I have multiple rasters in a specific directory from which I need to extract band 1 values (chlorophyll concentration), using a CSV containing the coordinates of the points of interest.

This is the CSV (read as GeoDataFrame):

   point_id         point_name  latitude  longitude                   geometry
0         1  'Forte dei Marmi'   10.2427    43.5703  POINT (10.24270 43.57030)
1         2        'La Spezia'    9.9030    44.0341   POINT (9.90300 44.03410)
2         3        'Orbetello'   11.2029    42.4488  POINT (11.20290 42.44880)
3         4     'Portoferraio'   10.3328    42.8080  POINT (10.33280 42.80800)
4         5          'Fregene'   12.1990    41.7080  POINT (12.19900 41.70800)

All the rasters I need to sample are in raster_dir = 'C:/sentinel_3_processing/'. My final goal is a dataframe with as many columns as there are rasters in the folder.

The sampling of all the rasters works and the values are correct, but I need the output in a different shape, with one row per point rather than one row per point and raster.

The output I got is:

[[10.2427, 43.5703, 0.63],
 [10.2427, 43.5703, 0.94],
 [10.2427, 43.5703, 0.76],
 [10.2427, 43.5703, 0.76],
 [10.2427, 43.5703, 1.03],
 [10.2427, 43.5703, 0.86],
 [10.2427, 43.5703, 0.74],
 [10.2427, 43.5703, 1.71],
 [10.2427, 43.5703, 3.07],
 [...],
 [12.199, 41.708, 0.96],
 [12.199, 41.708, 0.89],
 [12.199, 41.708, 1.29],
 [12.199, 41.708, 0.24],
 [12.199, 41.708, 1.59],
 [12.199, 41.708, 1.78],
 [12.199, 41.708, 0.39],
 [12.199, 41.708, 1.54],
 [12.199, 41.708, 1.62]]

But I need something like this:

[
[10.2427, 43.5703, 0.63, 0.94, 0.76, 0.76, 1.03, 0.86, 0.74, 1.71, 3.07],
[...],
[12.199, 41.708, 0.96, 0.89, 1.29, 0.24, 1.59, 1.78, 0.39, 1.54, 1.62]
]

This is the code I wrote:

import os
import rasterio as rio

L = []  # final list that contains the other lists
for p in csv_gdf['geometry']:  # for every point in the dataframe...
    for files in os.listdir(raster_dir):  # ...and for every raster in that folder...
        if files[-4:] == '.img':  # ...whose extension is .img...
            r = rio.open(raster_dir + '\\' + files)  # open the raster
            list_row = []
            # read the raster band 1 value at those coordinates...
            x = p.xy[0][0]
            y = p.xy[1][0]
            row, col = r.index(x, y)
            chl_value = r.read(1)[row, col]

            # append the coordinates to list_row, and then the raster value
            list_row.append(p.xy[0][0])
            list_row.append(p.xy[1][0])
            list_row.append(round(float(chl_value), 2))
            # then append the list created in this iteration to the final list
            L.append(list_row)

Could you please help me? Any advice is greatly appreciated. Thank you in advance! Hope you guys are OK!

Answer:

Try this:

import pandas as pd

data = [[10.2427, 43.5703, 0.63],
 [10.2427, 43.5703, 0.94],
 [10.2427, 43.5703, 0.76],
 [10.2427, 43.5703, 0.76],
 [10.2427, 43.5703, 1.03],
 [10.2427, 43.5703, 0.86],
 [10.2427, 43.5703, 0.74],
 [10.2427, 43.5703, 1.71],
 [10.2427, 43.5703, 3.07],
 [12.199, 41.708, 0.96],
 [12.199, 41.708, 0.89],
 [12.199, 41.708, 1.29],
 [12.199, 41.708, 0.24],
 [12.199, 41.708, 1.59],
 [12.199, 41.708, 1.78],
 [12.199, 41.708, 0.39],
 [12.199, 41.708, 1.54],
 [12.199, 41.708, 1.62]]
 
df = pd.DataFrame(data)

print(df.groupby([0, 1])[2].apply(list).reset_index().apply(lambda x: [x[0], x[1]] + x[2], axis=1).values.tolist())

Explanation:

  1. Create a dataframe from your current output.
  2. Group by the first two columns (the coordinates) and collect the third column's values into a list.
  3. Restructure each row as [x, y] plus that list to get the expected output.
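
The same three steps can also be sketched without pandas, by grouping on the coordinate pair in a plain dictionary. This is only an illustrative alternative to the one-liner above: `group_rows` is a hypothetical helper name, the sample data is shortened, and the sketch assumes every input row has the form [x, y, value].

```python
def group_rows(rows):
    """Merge [x, y, value] rows that share the same (x, y) into one row."""
    grouped = {}  # (x, y) -> list of values, in first-seen order
    for x, y, value in rows:
        grouped.setdefault((x, y), []).append(value)
    # Rebuild each row as [x, y, value1, value2, ...]
    return [[x, y, *values] for (x, y), values in grouped.items()]


# Shortened sample in the same shape as the current output
data = [
    [10.2427, 43.5703, 0.63],
    [10.2427, 43.5703, 0.94],
    [12.199, 41.708, 0.96],
    [12.199, 41.708, 0.89],
]
result = group_rows(data)
print(result)
# [[10.2427, 43.5703, 0.63, 0.94], [12.199, 41.708, 0.96, 0.89]]
```

One difference worth knowing: `groupby` sorts the group keys by default, while this dictionary version keeps the points in the order they first appear.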

Output:

[[10.2427, 43.5703, 0.63, 0.94, 0.76, 0.76, 1.03, 0.86, 0.74, 1.71, 3.07], [12.199, 41.708, 0.96, 0.89, 1.29, 0.24, 1.59, 1.78, 0.39, 1.54, 1.62]]

Note: The above code is just to give you an idea, it can be further improved. If I get some free time, I will post that as well.
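
One possible improvement, sketched under assumptions, is to skip the regrouping step entirely: create the row list once per point, before the loop over the rasters, so that each raster only appends its value. `build_rows` and the `sample` callable are hypothetical names introduced for illustration, and the fake sampler below stands in for the raster lookup from the question (`r.index(x, y)` followed by `r.read(1)[row, col]`).

```python
def build_rows(points, raster_paths, sample):
    """Return one row per point: [x, y, value_raster1, value_raster2, ...].

    points       -- iterable of (x, y) coordinate pairs
    raster_paths -- list of raster file paths, in the desired column order
    sample       -- callable (path, x, y) -> value; a hypothetical stand-in
                    for the raster lookup
    """
    rows = []
    for x, y in points:
        row = [x, y]  # created once per point, not once per raster
        for path in raster_paths:
            row.append(round(float(sample(path, x, y)), 2))
        rows.append(row)  # appended once per point
    return rows


# Demo with made-up file names and values instead of real rasters
points = [(10.2427, 43.5703), (12.199, 41.708)]
paths = ["a.img", "b.img"]
fake = {
    ("a.img", 10.2427, 43.5703): 0.63,
    ("b.img", 10.2427, 43.5703): 0.94,
    ("a.img", 12.199, 41.708): 0.96,
    ("b.img", 12.199, 41.708): 0.89,
}
rows = build_rows(points, paths, lambda path, x, y: fake[(path, x, y)])
print(rows)
# [[10.2427, 43.5703, 0.63, 0.94], [12.199, 41.708, 0.96, 0.89]]
```

With the real data, the points would presumably come from csv_gdf['geometry'] and the paths from the .img files in raster_dir. Each resulting row could then become one dataframe row, with one value column per raster.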
