How to Combine Pandas DataFrames that have the same columns and data types

Time:12-02

I have three DataFrames that I need to combine, but nothing I have tried works. So far this is what I have:

import pandas as pd

# DataFrame columns
columns = ["exchange", "symbol", "name"]

# Create NYSE dataFrame
NYSE = list(zip(NYSE_symbols, NYSE_companies))
NYSE = [("NYSE",) + elem for elem in NYSE]  # prepend the exchange name to each (symbol, name) tuple
NYSE_df = pd.DataFrame(NYSE, columns=columns)

# Create NASDAQ dataFrame
NASDAQ = list(zip(NASDAQ_symbols, NASDAQ_companies))
NASDAQ = [("NASDAQ",) + elem for elem in NASDAQ]  # prepend the exchange name to each (symbol, name) tuple
NASDAQ_df = pd.DataFrame(NASDAQ, columns=columns)

# Create OTCBB dataFrame
OTCBB = list(zip(OTCBB_symbols, OTCBB_companies))
OTCBB = [("OTCBB",) + elem for elem in OTCBB]  # prepend the exchange name to each (symbol, name) tuple
OTCBB_df = pd.DataFrame(OTCBB, columns=columns)

company_df = pd.merge(NYSE_df,NASDAQ_df,OTCBB_df, on='id')

print(company_df)

This is the error I get when I try concat:

cmwolfe@LAPTOP-NEJ5OHDU:~/python/repositories/MyOrm$ /home/cmwolfe/.virtualenvs/marketdb/bin/python3.8 /home/cmwolfe/python/repositories/MyOrm/app.py
/home/cmwolfe/python/repositories/MyOrm/app.py:73: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only
  company_df = pd.concat(NYSE_df,NASDAQ_df,OTCBB_df)
Traceback (most recent call last):
  File "/home/cmwolfe/python/repositories/MyOrm/app.py", line 73, in <module>
    company_df = pd.concat(NYSE_df,NASDAQ_df,OTCBB_df)
  File "/home/cmwolfe/.virtualenvs/marketdb/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/cmwolfe/.virtualenvs/marketdb/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 294, in concat
    op = _Concatenator(
  File "/home/cmwolfe/.virtualenvs/marketdb/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 329, in __init__
    raise TypeError(
TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"

This is what the data frames all look like:

0      NASDAQ   AACG     Ata Creativity Global ADR
1      NASDAQ   AACI     Armada Acquisition Corp I
2      NASDAQ  AACIU     Armada Acquisition Corp I
3      NASDAQ  AACIW  Armada Acquisition Corp I WT
4      NASDAQ   AADI          Aadi Biosciences Inc
...       ...    ...                           ...
5293   NASDAQ  ZWRKU                             Z
5294   NASDAQ  ZWRKW                             Z
5295   NASDAQ     ZY                  Zymergen Inc
5296   NASDAQ   ZYNE             Zynerba Pharma CS
5297   NASDAQ   ZYXI                     Zynex Inc

[5298 rows x 3 columns]

CodePudding user response:

You can use the pandas.DataFrame.append method, which lets you append the rows of one DataFrame to another. I think that is what you want to do; correct me if I'm wrong and you actually want to merge the datasets.

company_df = NYSE_df.append(NASDAQ_df)
company_df = company_df.append(OTCBB_df)

This saves all the rows of all three datasets into a new DataFrame, company_df.

The problem with merge may be that pd.merge expects only two DataFrames, and none of yours has an 'id' column to join on. But as I said already, I think you want to append, not merge.
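As for the concat error in the question: pd.concat expects its first argument to be an iterable (for example a list) of pandas objects, which is what the TypeError is saying. Also note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on newer versions concat is the way to stack rows. Below is a minimal sketch of that call. The tiny frames are stand-ins for the real ones: the NASDAQ rows are copied from the sample output in the question, while the NYSE and OTCBB rows are made-up placeholders.

```python
import pandas as pd

columns = ["exchange", "symbol", "name"]

# Stand-in frames with the same columns as in the question.
# The NYSE and OTCBB rows are placeholders, not real listings.
NYSE_df = pd.DataFrame([("NYSE", "XXA", "Placeholder Co A")], columns=columns)
NASDAQ_df = pd.DataFrame(
    [
        ("NASDAQ", "AACG", "Ata Creativity Global ADR"),
        ("NASDAQ", "ZYXI", "Zynex Inc"),
    ],
    columns=columns,
)
OTCBB_df = pd.DataFrame([("OTCBB", "XXB", "Placeholder Co B")], columns=columns)

# Pass the frames as a single list; ignore_index=True renumbers the rows
# 0..n-1 instead of repeating each frame's original index.
company_df = pd.concat([NYSE_df, NASDAQ_df, OTCBB_df], ignore_index=True)

print(company_df)
```

Without ignore_index=True the combined frame keeps each source frame's own row labels, so the index would contain duplicates such as 0 appearing three times.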
