I have a problem transferring data from an Excel file into a 3D array. The variables are the latitude, longitude and depth of an area in western Norway. The plan is to later have a reinforcement learning algorithm use this map, stored as a 3D array, to learn to navigate the area at sea, i.e. where the depth is positive. We're essentially at the first step of making an automated route planner.
This is the code I've used so far. It seems to create three separate arrays, each with as many rows as there are entries in the Excel file.
import pandas as pd
import numpy as np
file_loc = "excel1.xlsx"
Lat = pd.read_excel(file_loc,sheet_name='Sheet2', index_col=None, na_values=['NA'], usecols="A")
long = pd.read_excel(file_loc,sheet_name='Sheet2', index_col=None, na_values=['NA'], usecols="B")
Depth = pd.read_excel(file_loc,sheet_name='Sheet2', index_col=None, na_values=['NA'], usecols="C")
treD_array = np.array([[[Lat], [long]], [Depth]])
print(treD_array)
And this is what I get as output:
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  treD_array = np.array([[[Lat], [long]], [Depth]])
[list([[       60.262092
0      60.262092
1      60.262092
2      60.262092
3      60.262092
4      60.262092
...          ...
23973  60.444314
23974  60.444314
23975  60.444314
23976  60.444314
23977  60.444314
[23978 rows x 1 columns]], [        5.12317
0      5.123893
1      5.124616
2      5.125339
3      5.126063
4      5.126786
...         ...
23973  5.277191
23974  5.277915
23975  5.278638
23976  5.279361
23977  5.280084
[23978 rows x 1 columns]]])
 list([        27.47
0       42.26
1       60.70
2       81.17
3      100.91
4      118.34
...       ...
23973  187.83
23974  155.94
23975  104.80
23976   74.50
23977   54.84
[23978 rows x 1 columns]])]
I do not understand the warning at all.
Any help would be much appreciated :)
CodePudding user response:
You are importing each column as a separate DataFrame. To generate a regular np.array from them, extract the values of each column first, for example by converting them to lists:
treD_array = np.array([Lat.iloc[:,0].tolist(), long.iloc[:,0].tolist(), Depth.iloc[:,0].tolist()])
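To see what shape this produces, here is a minimal sketch that replaces the Excel file with small hand-made DataFrames (the column names and values are made up for illustration, they are not from the real file). Note that the result is a 2D array with one row per variable, so you may want to transpose it to get one row per point:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the one-column DataFrames returned by pd.read_excel.
Lat = pd.DataFrame({"lat": [60.26, 60.26, 60.44, 60.44]})
long = pd.DataFrame({"lon": [5.12, 5.13, 5.27, 5.28]})
Depth = pd.DataFrame({"depth": [27.47, 42.26, 74.50, 54.84]})

# .iloc[:, 0] selects the single column as a Series; .tolist() turns it into a plain list.
treD_array = np.array([
    Lat.iloc[:, 0].tolist(),
    long.iloc[:, 0].tolist(),
    Depth.iloc[:, 0].tolist(),
])

print(treD_array.shape)  # (3, 4): one row per variable, one column per point

# Transpose so that each row is one [lat, lon, depth] point.
points = treD_array.T
print(points.shape)      # (4, 3)
```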
CodePudding user response:
You are receiving this warning because the nested list you pass to np.array is ragged: its first element, [[Lat], [long]], holds two items while its second, [Depth], holds only one, so NumPy cannot build a rectangular array from it.
I think you are trying to build a numpy array with columns structured like ["coords", "Z"], where each "coords" entry is itself a list containing ["lat", "lon"]. This should be avoided; it is best practice to use a flat structure like ["lat", "lon", "Z"].
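As a rough illustration of the difference, the sketch below uses short made-up lists in place of your real columns. It only counts the items at each level of the nested layout and then builds the flat one with np.column_stack, which is one of several ways to do it:

```python
import numpy as np

# Made-up sample values standing in for the real columns.
lat = [60.26, 60.26, 60.44]
lon = [5.12, 5.13, 5.28]
depth = [27.47, 42.26, 54.84]

# Same nesting pattern as in the question: two items, then one item.
nested = [[lat, lon], [depth]]
print([len(item) for item in nested])  # [2, 1] -> ragged, not rectangular

# Flat layout: one row per point, columns are lat, lon, Z.
flat = np.column_stack([lat, lon, depth])
print(flat.shape)  # (3, 3)
print(flat.dtype)  # float64
```

With the flat layout every row has the same length, so NumPy can store it as an ordinary float array instead of falling back to Python objects.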
As @imburningbabe indicated, you could convert the dataframes returned by pd.read_excel() to lists.
I would suggest a more direct method: read all the data into a single dataframe, then convert it to a numpy array:
treD_dataframe = pd.read_excel(file_loc, sheet_name='Sheet2', na_values=['NA'], usecols="A:C")
treD_array = treD_dataframe.to_numpy()
print(treD_array)
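Since the end goal is a map, one possible next step is to turn the (N, 3) array into a 2D depth grid indexed by latitude and longitude. The sketch below is only an illustration under the assumption that your points lie on a regular lat/lon grid (the repeated latitudes in your output suggest this, but I can't confirm it). The names points, depth_grid and navigable are my own, and the values are made up:

```python
import numpy as np

# Hypothetical (N, 3) array of [lat, lon, depth] rows, standing in for treD_array.
points = np.array([
    [60.0, 5.0, 10.0],
    [60.0, 5.1, -2.0],
    [60.1, 5.0, 35.0],
    [60.1, 5.1, 50.0],
])

# Sorted unique coordinates, plus the grid index of every point along each axis.
lats, lat_idx = np.unique(points[:, 0], return_inverse=True)
lons, lon_idx = np.unique(points[:, 1], return_inverse=True)

# Fill a (n_lat, n_lon) grid with depths; cells without data stay NaN.
depth_grid = np.full((lats.size, lons.size), np.nan)
depth_grid[lat_idx, lon_idx] = points[:, 2]

# Sea cells are those with positive depth, as described in the question.
navigable = depth_grid > 0

print(depth_grid)
print(navigable)
```

Here depth_grid[i, j] is the depth at lats[i], lons[j], and navigable is a boolean mask of the same shape that an agent could use to check which cells are sea.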