Trying to interpolate data from a CSV file, getting error 'x and y arrays must be equal in length'

Time:12-01

I'm completely new to Python and to doing data science with it, so please bear with me. I appreciate any help and will try to understand as much of it as possible. The following code is what I have so far.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import interpolate

dataset = pd.read_csv(r"C:\Users\...\csv\Test1K.csv", sep=';', skiprows=0)
x = dataset.iloc[0:1420, 0:1].values
y = dataset.iloc[0:1420, 3:4].values

f = interpolate.interp1d(x, y, kind = 'cubic')
xn =270
t_on = f(xn)
print(t_on)

The first rows of the output from the CSV file look like this:

0       [s]     [Celsius]     [Celsius]  [Celsius]   [Celsius]  [Celsius]
1         0        22.747        22.893      0.334      22.898     22.413
2        60        22.769        22.902     22.957      22.907     -0.187
3       120         22.78        22.895     25.519      22.911     -2.739
4       180        22.794        22.956      33.62      22.918    -10.827

A short summary of what I'm trying to do and where the problem is: I have a CSV file with a lot of data in it, with temperature readings every 60 seconds for about 1400 readings. I want to interpolate it so I can get a specific data point between the 60-second readings, and possibly even beyond the 1400 readings (maybe up to 1600). The first dataset I want is the third Celsius column. The code above is as far as I've got. Now I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_8008/3128405253.py in <module>
      7 
      8 
----> 9 f = interpolate.interp1d(x, y, kind = 'cubic')
     10 yn =270  # value to interpolate at
     11 t_on = f(yn)

~\AppData\Roaming\Python\Python38\site-packages\scipy\interpolate\interpolate.py in __init__(self, x, y, kind, axis, copy, bounds_error, fill_value, assume_sorted)
    434                  assume_sorted=False):
    435         """ Initialize a 1-D linear interpolation class."""
--> 436         _Interpolator1D.__init__(self, x, y, axis=axis)
    437 
    438         self.bounds_error = bounds_error  # used by fill_value setter

~\AppData\Roaming\Python\Python38\site-packages\scipy\interpolate\polyint.py in __init__(self, xi, yi, axis)
     52         self.dtype = None
     53         if yi is not None:
---> 54             self._set_yi(yi, xi=xi, axis=axis)
     55 
     56     def __call__(self, x):

~\AppData\Roaming\Python\Python38\site-packages\scipy\interpolate\polyint.py in _set_yi(self, yi, xi, axis)
    122             shape = (1,)
    123         if xi is not None and shape[axis] != len(xi):
--> 124             raise ValueError("x and y arrays must be equal in length along "
    125                              "interpolation axis.")
    126 

ValueError: x and y arrays must be equal in length along interpolation axis.

I searched for solutions and found this, for example:

x = np.linspace(0, 4, 13)
y = np.linspace(0, 4, 13)  
X, Y = np.meshgrid(x, y)  
z = np.arccos(-np.cos(2*X) * np.cos(2*Y))
f = interpolate.interp2d(x, y, z, kind = 'cubic')

I read in answers to other questions that the 2D solution should help, but when I use it like this I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_8008/1890924245.py in <module>
     14 f = interpolate.interp2d(x, y, z, kind = 'cubic')
     15 yn =270  # value to interpolate at
---> 16 t_on = f(yn)
     17 print(t_on)
     18 

TypeError: __call__() missing 1 required positional argument: 'y'

I understand that I need to pass something for y, but wouldn't that defeat the purpose of what I'm trying to do? My result should be y, not one of my inputs. If anyone can help me with my code, or has a completely different solution, I'd appreciate it. Thank you all for the help.

Edit: added the whole error message.

CodePudding user response:

I think your issue is here:

x = dataset.iloc[0:1420, 0:1].values
y = dataset.iloc[0:1420, 3:4].values

Selecting columns with a slice such as 0:1 returns a DataFrame, so .values gives you a two-dimensional array with the shape (1420, 1). interp1d needs 1-D arrays here, so select each column with a single integer index instead:

x = dataset.iloc[0:1420, 0].values
y = dataset.iloc[0:1420, 3].values

The shape of x and y will then be (1420,), i.e. 1-D arrays.
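
To see the difference between the two ways of indexing, here is a minimal sketch. It does not use your CSV: the DataFrame, its column names (time_s, a, b, temp_c) and its values are made up for illustration, with the time in column 0 and the temperature in column 3 as in your code.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# Synthetic stand-in for the CSV (made-up values): one reading every 60 s.
t = np.arange(0, 600, 60, dtype=float)
dataset = pd.DataFrame({
    "time_s": t,
    "a": np.zeros_like(t),
    "b": np.zeros_like(t),
    "temp_c": 20.0 + 0.01 * t,   # hypothetical linear temperature trend
})

# A slice keeps the result a DataFrame, so .values is 2-D.
x_2d = dataset.iloc[:, 0:1].values
# A single integer returns a Series, so .values is 1-D.
x = dataset.iloc[:, 0].values
y = dataset.iloc[:, 3].values
print(x_2d.shape, x.shape, y.shape)

# With 1-D inputs, interp1d builds the interpolant without complaint.
f = interp1d(x, y, kind="cubic")
t_on = float(f(270))
print(t_on)

# Assumption: to evaluate outside the measured range, interp1d accepts
# fill_value="extrapolate"; without it, out-of-range points raise an error.
f_ext = interp1d(x, y, kind="cubic", fill_value="extrapolate")
t_ext = float(f_ext(600))
print(t_ext)
```

This also means interp2d is not needed: your temperature depends on one variable (time), so the 1-D interpolant already returns y for a given x. Treat extrapolated values with caution, since a cubic fit can drift quickly once it leaves the measured range.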
