How do I Concatenate Dataframes on the fly

Time:02-18

I am building a collection of dataframes, starting from _dataframe_collection_ = {}

I want to concatenate the data in these dataframes to build one big dataframe. If I do:

for i in range(_num_roles_):
     _dataframe_collection_[i] = pd.DataFrame(_roledata_json_[i])
     _concatenated_dataframe_ = pd.concat(_dataframe_collection_[i])
     print(_concatenated_dataframe_)

It fails on the pd.concat line with the following error:

Error creating dataframe: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame".

How do I concatenate the data in all the dataframes into one dataframe?

CodePudding user response:

You can use a list comprehension to create the collection of dataframes that pd.concat is expecting:

_concatenated_dataframe = pd.concat([pd.DataFrame(_roledata_json_[i]) for i in range(_num_roles_)])

CodePudding user response:

You need to pass a list of DataFrames to the pd.concat() function, not a single DataFrame. It should look something like this: pd.concat([df1, df2]). I'd prefer a list over a dictionary unless you need the keys for mapping.
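
To illustrate the list-versus-dictionary choice, here is a minimal sketch. The frames df1 and df2 and their contents are made up for the example. As far as I know, pd.concat accepts a mapping as well as a list, and in that case uses the dictionary keys as the outer level of the resulting index:

```python
import pandas as pd

# Hypothetical sample frames, only for illustration.
df1 = pd.DataFrame({"role": ["admin"], "users": [3]})
df2 = pd.DataFrame({"role": ["viewer"], "users": [12]})

# A list: rows are stacked, and ignore_index=True renumbers them from 0.
from_list = pd.concat([df1, df2], ignore_index=True)

# A dict: the keys become the outer level of a MultiIndex,
# so each source frame can be looked up again by its key.
from_dict = pd.concat({"first": df1, "second": df2})

print(from_list)
print(from_dict.loc["second"])
```

If you never need to trace a row back to the frame it came from, the list form is simpler.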

CodePudding user response:

dataframe_collection = {} is not an empty list; it is an empty dictionary. Use square brackets ([]) instead of curly braces to create a list.

If roledata_json is a list of JSON strings, you will need to use pandas.read_json(). It is hard to give better advice without knowing what that object is.
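
A rough sketch of the JSON-string case, assuming each element of roledata_json is a JSON array of records (the sample data below is invented, since the real object is unknown). Each string is wrapped in io.StringIO because recent pandas versions warn when a literal JSON string is passed directly to read_json:

```python
import io
import pandas as pd

# Hypothetical stand-in for roledata_json: a list of JSON strings.
roledata_json = [
    '[{"role": "admin", "users": 3}]',
    '[{"role": "viewer", "users": 12}, {"role": "editor", "users": 5}]',
]

# Parse each JSON string into its own DataFrame.
frames = [
    pd.read_json(io.StringIO(text), orient="records")
    for text in roledata_json
]

# Concatenate all frames in a single call.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```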

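You shouldn't call concat more than once, for performance reasons: each call creates a new object, so concatenating inside the loop repeats that work on every iteration. As stated in the other answers, concat takes a list of dataframes as its input, so build the list first and concatenate once at the end.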
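
Putting that together, a minimal sketch of the loop from the question, rewritten to collect the frames in a list and concatenate once. The contents of roledata_json are assumed here to be lists of dicts, which is only a guess at the real data:

```python
import pandas as pd

# Hypothetical stand-in for the question's data: one list of records per role.
roledata_json = [
    [{"role": "admin", "users": 3}],
    [{"role": "viewer", "users": 12}, {"role": "editor", "users": 5}],
]
num_roles = len(roledata_json)

# Start from an empty list (square brackets), not an empty dict.
dataframe_collection = []
for i in range(num_roles):
    dataframe_collection.append(pd.DataFrame(roledata_json[i]))

# One concat call after the loop, on the whole list.
concatenated_dataframe = pd.concat(dataframe_collection, ignore_index=True)
print(concatenated_dataframe)
```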
