I have a couple of Pandas dataframes that I am trying to merge together without any luck.
The first dataframe (let's call it dataframe A) looks a little like this:
offer | type | country
------|------|--------
123 | A | UK
456 | B | UK
789 | A | ROI
It's created by reading in an .xlsx file using the following code:
file_name = "My file.xlsx"
df_a = pd.read_excel(file_name, sheet_name = "Offers Plan", usecols= ["offer","type","country"], dtype={'offer': str})
The offer column is being read in as a string because otherwise they end up in the format 123.0. The offers need to be in the format 123 because they're being used in some embedded SQL later on in the code that looks for them in a certain database table. In this table the offers are in the format 123, so the SQL will return no results when looking for 123.0.
The second dataframe (dataframe B) looks a little like this:
offer
-----
123
456
789
123
456
123
What I want to do is merge the two dataframes together so the results look like this:
offer | type | country
------|------|--------
123 | A | UK
456 | B | UK
789 | A | ROI
123 | A | UK
456 | B | UK
123 | A | UK
I've tried using the following code, but I get an error message saying "ValueError: You are trying to merge on int64 and object columns":
df_merged = pd.merge(df.b, df.a, how = 'left', on="offer")
Does anyone know how I can merge the dataframes correctly please?
CodePudding user response:
IIUC you can just change the df_a column to an int
df_a['offer'] = df_a['offer'].astype(int)
This will change it from a float/str to an int. If this gives you an error about converting from a str/float to an int check to make sure that you don't have any NaN/Nulls in your data. If you do you will need to remove them for a successful conversion.