I am familiar with MATLAB from university work, but I have been using Python lately because that is what I currently have access to.
I am working with battery data that is segmented by cycle number, with charge and discharge capacity as a function of time. I want a data structure that makes this segmentation easy. In MATLAB I would use a nested cell array, where the top-level cell represents the cycle number and the second-level cell contains the charge/discharge data.
For example, data{1}{1} would give me the cycle 1 charge capacity and data{4}{2} would give me the cycle 4 discharge capacity.
What is the best way to replicate this structure in Python?
I currently have the raw data file in a DataFrame, where the "cycle number" column holds the current cycle number and the "charge capacity" column holds the corresponding value, increasing with time. However, this is not segmented, and it is ~30000 rows of data.
CodePudding user response:
In pandas, you can store data in a DataFrame, which is similar to a table in a relational database. To get something like a MATLAB cell array, you can create a column with dtype "object" and store lists, dictionaries, or other DataFrames in its cells.
For example, you can create a DataFrame with three rows and a "cell_array" column of dtype "object":
import pandas as pd
df = pd.DataFrame({"cell_array": pd.Series([None] * 3, dtype=object)})
Each cell can then hold a different data type. Use the at accessor to assign a single cell, because loc may try to unpack or align containers such as lists instead of storing them as one object.
df.at[0, "cell_array"] = [1, 2, 3]
df.at[1, "cell_array"] = {'name': 'John', 'age': 30}
You can also store other DataFrames in the cell_array column.
df2 = pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})
df.at[2, "cell_array"] = df2
You can access the data stored in the "cell_array" column with the loc or iloc accessor.
df.loc[0, "cell_array"]
This returns the first element of cell_array, which is the list [1, 2, 3].
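For the battery data in the question, a nested dictionary built with groupby may be a closer match to data{cycle}{1 or 2} than an object column. The following is only a minimal sketch: the small raw DataFrame is a made-up stand-in for the real ~30000-row file, and the "discharge capacity" column name is an assumption, since the question only names "cycle number" and "charge capacity".

```python
import pandas as pd

# Hypothetical stand-in for the raw DataFrame described in the question.
# "discharge capacity" is an assumed column name.
raw = pd.DataFrame({
    "cycle number":       [1, 1, 1, 2, 2, 2],
    "charge capacity":    [0.0, 0.5, 1.0, 0.0, 0.4, 0.9],
    "discharge capacity": [0.0, 0.3, 0.8, 0.0, 0.2, 0.7],
})

# Top level: cycle number. Second level: "charge" / "discharge" arrays.
data = {
    cycle: {
        "charge": group["charge capacity"].to_numpy(),
        "discharge": group["discharge capacity"].to_numpy(),
    }
    for cycle, group in raw.groupby("cycle number")
}

cycle_1_charge = data[1]["charge"]        # roughly data{1}{1} in MATLAB
cycle_2_discharge = data[2]["discharge"]  # roughly data{2}{2} in MATLAB
```

The keys are the actual cycle numbers rather than positions, so data[4]["discharge"] reads the same way as data{4}{2} as long as cycle 4 exists in the file.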