I am building a list of dataframes using
_dataframe_collection_ = {}
I want to concatenate the data in these dataframes to build one big dataframe. If I do
for i in range(_num_roles_):
    _dataframe_collection_[i] = pd.DataFrame(_roledata_json_[i])
    _concatenated_dataframe_ = pd.concat(_dataframe_collection_[i])
print(_concatenated_dataframe_)
It fails on the pd.concat line with the following error
Error creating dataframe: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame".
How do I concatenate the data in all the dataframes into one dataframe?
CodePudding user response:
You can use a list comprehension to create the collection of dataframes that pd.concat is expecting:
_concatenated_dataframe = pd.concat([pd.DataFrame(_roledata_json_[i]) for i in range(_num_roles_)])
CodePudding user response:
You need to pass a list of DataFrames to pd.concat(). It should look something like this: pd.concat([df1, df2]). I'd prefer a list over a dictionary unless you need the keys for mapping.
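To illustrate the list-versus-dictionary point, here is a minimal sketch. The frames df1 and df2 and their columns are made-up sample data, not the asker's. It relies on pd.concat accepting a mapping as well as a list, in which case the keys become the outer level of the resulting index.

```python
import pandas as pd

# Hypothetical sample frames standing in for the per-role dataframes.
df1 = pd.DataFrame({"role": ["admin"], "users": [3]})
df2 = pd.DataFrame({"role": ["viewer"], "users": [12]})

# A list is enough when only the combined rows matter.
combined = pd.concat([df1, df2], ignore_index=True)

# A dict also works: its keys become the outer level of the result's index,
# which is handy if you need to look rows up by where they came from.
by_key = pd.concat({"first": df1, "second": df2})
rows_from_second = by_key.loc["second"]
```

If the keys carry no meaning, the list form keeps the result simpler because it has a single flat index.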
CodePudding user response:
dataframe_collection = {} creates an empty dictionary, not an empty list. Use square brackets instead of curly braces.
If roledata_json is a list of JSON strings, you will need to use pandas.read_json() instead of pd.DataFrame(). It's hard to give better advice without knowing what that object is.
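As a rough sketch of the JSON-string case: the contents of roledata_json below are invented for illustration, since the question does not show the real data. Each string is wrapped in StringIO so that read_json treats it as JSON text rather than as a file path.

```python
from io import StringIO

import pandas as pd

# Hypothetical input: each element is a JSON string holding one role's records.
roledata_json = [
    '[{"role": "admin", "users": 3}]',
    '[{"role": "viewer", "users": 12}, {"role": "editor", "users": 5}]',
]

# Parse every JSON string into its own dataframe.
frames = [pd.read_json(StringIO(s)) for s in roledata_json]
```

If the elements are already Python lists or dicts, pd.DataFrame(...) is the right constructor and read_json is not needed.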
For performance reasons, avoid calling concat more than once. Every call creates a new object and copies the data, so build all the dataframes first and concatenate them in a single call. As the other answers state, concat takes a list of dataframes as its input.
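Putting that together, one way to structure it is to collect the frames in a list and call concat once after the loop. The roledata records below are placeholder data, and ignore_index=True is an optional choice that renumbers the rows from 0.

```python
import pandas as pd

# Hypothetical input: one list of records per role.
roledata = [
    [{"role": "admin", "user": "ann"}],
    [{"role": "viewer", "user": "bob"}, {"role": "viewer", "user": "cy"}],
]

# Build every frame first, collecting them in a list (square brackets)...
frames = []
for records in roledata:
    frames.append(pd.DataFrame(records))

# ...then concatenate once, outside the loop.
concatenated = pd.concat(frames, ignore_index=True)
print(concatenated)
```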