I am trying to interpolate a dataframe but am having no luck. I have a dataframe with a distance header and a wind component header that I am working with.
The wind components are split with a 20 unit difference and the distance by 10. I would like to be able to interpolate to within 1 of each unit but I'm stuck.
I haven't used SciPy before this, and I can't find much in their docs in the way of explanations that I can understand.
I have a table that I converted with to_dict and use that for the dataframe:
import numpy as np
import pandas as pd
from scipy.interpolate import interpn

data = {'dist': [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420],
'-60': [520, 600, 670, 740, 810, 880, 950, 1020, 1100, 1170, 1240, 1310, 1380, 1450, 1520, 1600, 1670, 1740, 1810, 1880, 1950, 2020, 2100, 2170, 2240, 2310, 2380, 2450, 2530, 2600, 2670, 2740, 2810],
'-40': [440, 500, 570, 630, 690, 760, 820, 880, 950, 1010, 1070, 1140, 1200, 1260, 1330, 1390, 1450, 1510, 1580, 1640, 1700, 1770, 1830, 1890, 1960, 2020, 2080, 2150, 2210, 2270, 2340, 2400, 2460],
'-20': [380, 430, 490, 550, 600, 660, 720, 770, 830, 880, 940, 1000, 1050, 1110, 1170, 1220, 1280, 1340, 1390, 1450, 1510, 1560, 1620, 1680, 1730, 1790, 1850, 1900, 1960, 2020, 2070, 2130, 2190],
'0': [320, 370, 420, 480, 530, 580, 630, 680, 730, 780, 830, 890, 940, 990, 1040, 1090, 1140, 1190, 1240, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1710, 1760, 1810, 1860, 1910, 1960],
'20': [280, 320, 370, 420, 470, 510, 560, 610, 650, 700, 750, 790, 840, 890, 930, 980, 1030, 1070, 1120, 1170, 1210, 1260, 1310, 1350, 1400, 1450, 1500, 1540, 1590, 1640, 1680, 1730, 1780],
'40': [240, 280, 330, 370, 410, 460, 500, 540, 590, 630, 670, 720, 760, 800, 840, 890, 930, 970, 1020, 1060, 1100, 1150, 1190, 1230, 1280, 1320, 1360, 1400, 1450, 1490, 1530, 1580, 1620],
'60': [210, 250, 290, 330, 370, 410, 450, 490, 530, 570, 610, 650, 690, 730, 770, 810, 850, 890, 930, 970, 1010, 1050, 1090, 1130, 1170, 1210, 1250, 1290, 1330, 1370, 1410, 1450, 1490]}
df = pd.DataFrame(data).set_index(['dist'])
df.columns = df.columns.map(float)
df.columns.name = 'wind'
print(df)
Printing this gives me:
wind -60.0 -40.0 -20.0 0.0 20.0 40.0 60.0
dist
100 520 440 380 320 280 240 210
110 600 500 430 370 320 280 250
120 670 570 490 420 370 330 290
130 740 630 550 480 420 370 330
140 810 690 600 530 470 410 370
150 880 760 660 580 510 460 410
160 950 820 720 630 560 500 450
170 1020 880 770 680 610 540 490
180 1100 950 830 730 650 590 530
190 1170 1010 880 780 700 630 570
200 1240 1070 940 830 750 670 610
210 1310 1140 1000 890 790 720 650
220 1380 1200 1050 940 840 760 690
230 1450 1260 1110 990 890 800 730
240 1520 1330 1170 1040 930 840 770
250 1600 1390 1220 1090 980 890 810
260 1670 1450 1280 1140 1030 930 850
270 1740 1510 1340 1190 1070 970 890
280 1810 1580 1390 1240 1120 1020 930
290 1880 1640 1450 1300 1170 1060 970
300 1950 1700 1510 1350 1210 1100 1010
310 2020 1770 1560 1400 1260 1150 1050
320 2100 1830 1620 1450 1310 1190 1090
330 2170 1890 1680 1500 1350 1230 1130
340 2240 1960 1730 1550 1400 1280 1170
350 2310 2020 1790 1600 1450 1320 1210
360 2380 2080 1850 1650 1500 1360 1250
370 2450 2150 1900 1710 1540 1400 1290
380 2530 2210 1960 1760 1590 1450 1330
390 2600 2270 2020 1810 1640 1490 1370
400 2670 2340 2070 1860 1680 1530 1410
410 2740 2400 2130 1910 1730 1580 1450
420 2810 2460 2190 1960 1780 1620 1490
Which is all fine so far.
Now what I'm stuck on is how to interpolate so that I can get accurate figures from it. I'm trying to use interpn but I'm obviously doing it wrong. Here is what I'm doing to try and get an interpolated figure for a wind component of -35 and a distance of 103:
arr = np.dstack(np.array_split(df.to_numpy(), 1))
wind = df.columns.to_numpy()
dist = df.index.get_level_values(0).unique().to_numpy()
print(interpn((wind, dist), arr, [float(-35), int(103)]))
To which I get an error of:
ValueError: There are 7 points and 33 values in dimension 0
I have tried reading through the docs but can't seem to get my head around it, and all the examples I find elsewhere are for graphical data.
Can someone please help me figure this out? I'm pretty new to this kind of work. Thank you :)
Answer:
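There's no need to transform your data: you already have a 2D array and can use it as-is. You got the axes the wrong way round. The first axis (axis 0) is the rows of the dataframe (dist) and the second axis (axis 1) is the columns (wind), so the grid has to be passed to interpn as (dist, wind) and the query point as [dist, wind]. That is also what the error is telling you: you supplied the 7 wind points for axis 0, which holds 33 values.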
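from scipy.interpolate import interpn

arr = df.to_numpy()           # shape (33, 7): rows are dist, columns are wind
dist = df.index.to_numpy()    # grid along axis 0
wind = df.columns.to_numpy()  # grid along axis 1
# The grid tuple and the query point both follow the array's axis order.
print(interpn((dist, wind), arr, [103, -35]))
# [442.25]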
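Since the question asks for values to within 1 unit on both axes, the same call can be used to fill a whole finer table in one go. The following is only a sketch under some assumptions: it uses a four-row excerpt of the table from the question to stay short, and the names `fine_dist`, `fine_wind` and `fine_df` are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interpn

# Excerpt of the table from the question (first four distances only).
dist = np.array([100, 110, 120, 130], dtype=float)
wind = np.array([-60, -40, -20, 0, 20, 40, 60], dtype=float)
arr = np.array([
    [520, 440, 380, 320, 280, 240, 210],
    [600, 500, 430, 370, 320, 280, 250],
    [670, 570, 490, 420, 370, 330, 290],
    [740, 630, 550, 480, 420, 370, 330],
], dtype=float)

# Target grid in steps of 1 along both axes (hypothetical names).
fine_dist = np.arange(dist[0], dist[-1] + 1)  # 100, 101, ..., 130
fine_wind = np.arange(wind[0], wind[-1] + 1)  # -60, -59, ..., 60

# indexing="ij" keeps the (dist, wind) axis order used by arr.
dd, ww = np.meshgrid(fine_dist, fine_wind, indexing="ij")
points = np.stack([dd.ravel(), ww.ravel()], axis=-1)  # shape (n_points, 2)

# Linear interpolation at every point, reshaped back into a 2D table.
fine = interpn((dist, wind), arr, points).reshape(dd.shape)

fine_df = pd.DataFrame(fine, index=fine_dist, columns=fine_wind)
fine_df.index.name = "dist"
fine_df.columns.name = "wind"

# Look up any combination directly, e.g. dist 103 and wind -35.
print(fine_df.loc[103.0, -35.0])
```

With the full table, `dist`, `wind` and `arr` would come from `df.index.to_numpy()`, `df.columns.to_numpy()` and `df.to_numpy()` instead of the hard-coded excerpt.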
As an alternative, you can also use interp2d (note that it is deprecated in recent SciPy releases). Here the axes are just the other way round:
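from scipy.interpolate import interp2d

# x is the column axis (wind), y is the row axis (dist); arr has shape (len(y), len(x)).
f = interp2d(wind, dist, arr)
print(f(-35, 103))
# [442.25]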
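Another option that keeps the same (dist, wind) axis order as interpn is `RegularGridInterpolator`, which builds a reusable interpolator object once and can then be called for many points. This is a minimal sketch, again on a small excerpt of the question's table rather than the full dataframe; the variable names `interp` and `values` are illustrative only.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Small excerpt of the table from the question.
dist = np.array([100, 110, 120], dtype=float)
wind = np.array([-60, -40, -20, 0], dtype=float)
arr = np.array([
    [520, 440, 380, 320],
    [600, 500, 430, 370],
    [670, 570, 490, 420],
], dtype=float)

# Grid axes are given in the same order as the array axes: (rows, columns).
interp = RegularGridInterpolator((dist, wind), arr)  # linear by default

# Each query point is [dist, wind]; several points can be passed at once.
values = interp([[103, -35], [115, -10]])
print(values)
```

By default the interpolator raises a `ValueError` for points outside the grid, which is usually the safer behaviour for a lookup table like this one.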