I would like to merge the information from two dataframes into one, but the rows do not line up and the values end up next to the wrong month (see pictures). Do you know how to do this?
Code
import numpy as np  # needed for np.where below
import pandas as pd
members = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/members.csv")
expeditions = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/expeditions.csv")
Everest = members[members["peak_id"]=="EVER"]
Everest = pd.merge(Everest, expeditions[['expedition_id','mois']], on='expedition_id', how='inner')
Everest_success = Everest[Everest["success"]==True]
everest_success_mois = Everest_success[["mois","success"]].groupby("mois",as_index=False).count()
essais_everest_mois = Everest[["mois","success"]].groupby("mois",as_index=False).count()
essais_everest_mois = essais_everest_mois.rename(columns={"success":"nbre_essais"})
everest_success_mois['nbre_essais'] = essais_everest_mois["nbre_essais"]
everest_success_mois["pourcentage_success_mois"] = np.where(everest_success_mois["nbre_essais"]<1, everest_success_mois["nbre_essais"], everest_success_mois["success"]/everest_success_mois["nbre_essais"]*100)
CodePudding user response:
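From @Corralien's comment: merge the two dataframes on the mois column rather than assigning the new column directly, so that each count is matched to its own month: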
dfL10.rename(columns={'success': 'nbre_essais'}).merge(dfR7, on='mois', how='right')
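To illustrate why merging on the key helps, here is a minimal sketch on a small made-up dataframe (the values below are hypothetical and are not taken from the real CSV files; only the column names mois, success and nbre_essais come from the question). Assigning a column from one grouped frame to another aligns rows by index, so if some months have no successes the two frames have different lengths and the counts shift. Merging on mois matches the rows by month instead. This sketch uses how='left' starting from the attempts frame so that months without any success are kept with a count of 0; that is a design choice of the sketch, not necessarily what the comment above intended.

```python
import pandas as pd

# Hypothetical stand-in for the Everest rows with a month column.
everest = pd.DataFrame({
    "mois": [4, 4, 5, 5, 5, 9, 10, 10],
    "success": [False, False, True, True, False, False, True, False],
})

# Attempts per month: count every row.
essais = (
    everest.groupby("mois", as_index=False)["success"]
    .count()
    .rename(columns={"success": "nbre_essais"})
)

# Successes per month: count only the rows where success is True.
reussites = (
    everest[everest["success"]]
    .groupby("mois", as_index=False)["success"]
    .count()
)

# Match rows by month instead of by position in the frame.
result = essais.merge(reussites, on="mois", how="left")

# Months with no success are missing from reussites, so they come back as NaN.
result["success"] = result["success"].fillna(0).astype(int)

result["pourcentage_success_mois"] = (
    result["success"] / result["nbre_essais"] * 100
)
print(result)
```

With how='right' and the successes frame on the right, as in the one-liner above, only the months that appear in the successes frame would be kept.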