I have a table:
-60 -40 -20 0 20 40 60
100 520 440 380 320 280 240 210
110 600 500 430 370 320 280 250
120 670 570 490 420 370 330 290
130 740 630 550 480 420 370 330
140 810 690 600 530 470 410 370
The headers along the top are a wind vector and the first col on the left is a distance. The actual data in the 'body' of the table is just a fuel additive.
I am very new to Pandas and Numpy, so please excuse the simplicity of the question. What I would like to know is: how can I look up a single number in the table using the headers? I have seen it's possible using indexes, but I don't want to use that method if I don't have to.
For example: I have a wind unit of -60 and a distance of 120, so I need to retrieve the number 670. How can I use Numpy or Pandas to do this?
Also, if I have a wind unit of, say, -50 and a distance of 125, is it then possible to interpolate these in a simple way?
EDIT:
Here is what I've tried so far:
import pandas as pd
df = pd.read_table('fuel_adjustment.txt', delim_whitespace=True, header=0,index_col=0)
print(df.loc[120, -60])
But I get the error:
line 3083, in get_loc raise KeyError(key) from err
KeyError: -60
CodePudding user response:
You can select any cell from existing indices using:
df.loc[120,-60]
The index and column labels, however, need to be integers. If they are not, you can fix that using:
df.index = df.index.map(int)
df.columns = df.columns.map(int)
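To see why the lookup fails and how the conversion fixes it, here is a minimal sketch. It builds the table by hand instead of reading fuel_adjustment.txt, and it assumes the column labels arrived as strings (which is what a text header typically produces); the row labels are assumed to be integers already.

```python
import pandas as pd

# Table from the question, built directly so the example runs without the file.
# Assumption: the header row was parsed as strings, the distance column as ints.
df = pd.DataFrame(
    [
        [520, 440, 380, 320, 280, 240, 210],
        [600, 500, 430, 370, 320, 280, 250],
        [670, 570, 490, 420, 370, 330, 290],
        [740, 630, 550, 480, 420, 370, 330],
        [810, 690, 600, 530, 470, 410, 370],
    ],
    index=[100, 110, 120, 130, 140],
    columns=["-60", "-40", "-20", "0", "20", "40", "60"],
)

# The integer -60 is not among the string labels, hence the KeyError.
print(-60 in df.columns)

# Convert both axes to integers, then look up by label.
df.index = df.index.map(int)
df.columns = df.columns.map(int)
value = df.loc[120, -60]
print(value)
```

After the conversion, df.loc[120, -60] is a plain label lookup: row label 120, column label -60.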
For interpolation, you need to add the empty new rows/columns using reindex, then apply interpolate on each dimension.
(df.reindex(index=sorted(df.index.to_list() + [125]),
            columns=sorted(df.columns.to_list() + [-50]))
.interpolate(axis=1, method='index')
.interpolate(method='index')
)
Output:
       -60    -50    -40    -20      0     20     40     60
100  520.0  480.0  440.0  380.0  320.0  280.0  240.0  210.0
110  600.0  550.0  500.0  430.0  370.0  320.0  280.0  250.0
120  670.0  620.0  570.0  490.0  420.0  370.0  330.0  290.0
125  705.0  652.5  600.0  520.0  450.0  395.0  350.0  310.0
130  740.0  685.0  630.0  550.0  480.0  420.0  370.0  330.0
140  810.0  750.0  690.0  600.0  530.0  470.0  410.0  370.0
CodePudding user response:
You can simply use df.loc for that purpose
df.loc[120,-60]
CodePudding user response:
You need to check the data type of the index and the columns. A type mismatch is most likely why df.loc[120,-60] failed.
Try:
df.loc[120, "-60"]
To validate the data type, you may call:
>>> df.index
Int64Index([100, 110, 120, 130, 140], dtype='int64')
>>> df.columns
Index(['-60', '-40', '-20', '0', '20', '40', '60'], dtype='object')
If you want the column headers to be int64, convert them to numeric:
df.columns = pd.to_numeric(df.columns)
For interpolation, I think the only built-in way is to create the nonexistent index and column first; then you can read the value. However, this will grow your df rapidly if you query it frequently.
- First, you need to add the nonexistent index and column.
- Interpolate row-wise and column-wise.
- Get your value.
new_index = df.index.to_list()
new_index.append(125)
new_index.sort()
new_col = df.columns.to_list()
new_col.append(-50)
new_col.sort()
df = df.reindex(index=new_index, columns=new_col)
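# method='index' weights by the actual label values rather than by position
df = df.interpolate(axis=1, method='index').interpolate(method='index')
print(df.loc[125, -50])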
Another way is to write a function that fetches the neighbouring numbers and returns the interpolated result.
- Find the upper and lower indexes and columns of your target.
- Fetch the four numbers.
- Sequentially interpolate the index and column.
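The three steps above could be sketched roughly as follows. This is only an illustration, not a tested library routine: the function name bilinear_lookup is made up here, and it assumes both axes hold numeric labels sorted in ascending order and that the target lies inside the table (no extrapolation).

```python
import numpy as np
import pandas as pd

def bilinear_lookup(df, distance, wind):
    """Hypothetical helper: interpolate one value from the four surrounding cells."""
    rows = df.index.to_numpy(dtype=float)
    cols = df.columns.to_numpy(dtype=float)
    if not (rows[0] <= distance <= rows[-1] and cols[0] <= wind <= cols[-1]):
        raise ValueError("target lies outside the table")

    # Step 1: position of the upper bracketing label on each axis
    # (clipped so that the lower neighbour i-1 / j-1 always exists).
    i = int(np.clip(np.searchsorted(rows, distance), 1, len(rows) - 1))
    j = int(np.clip(np.searchsorted(cols, wind), 1, len(cols) - 1))

    # Step 2: fetch the four corner values as a 2x2 block.
    q = df.to_numpy(dtype=float)[i - 1:i + 1, j - 1:j + 1]

    # Step 3: interpolate along the wind axis, then along the distance axis.
    t = (wind - cols[j - 1]) / (cols[j] - cols[j - 1])
    low = q[0, 0] + t * (q[0, 1] - q[0, 0])
    high = q[1, 0] + t * (q[1, 1] - q[1, 0])
    u = (distance - rows[i - 1]) / (rows[i] - rows[i - 1])
    return low + u * (high - low)

# Table from the question, with integer labels on both axes.
df = pd.DataFrame(
    [
        [520, 440, 380, 320, 280, 240, 210],
        [600, 500, 430, 370, 320, 280, 250],
        [670, 570, 490, 420, 370, 330, 290],
        [740, 630, 550, 480, 420, 370, 330],
        [810, 690, 600, 530, 470, 410, 370],
    ],
    index=[100, 110, 120, 130, 140],
    columns=[-60, -40, -20, 0, 20, 40, 60],
)

print(bilinear_lookup(df, 125, -50))  # 652.5, same as the reindex/interpolate table
print(bilinear_lookup(df, 120, -60))  # 670.0, an exact table entry
```

Unlike the reindex approach, this leaves df unchanged, so repeated queries do not grow the table.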