I have two dataframes of unequal length that I want to combine under one condition:
rows of df1 that hold the same country must share the same value from df2, and the row order of df1 must not change.
import pandas as pd
d = {'country': ['France', 'France', 'Japan', 'China', 'China', 'Canada', 'Canada', 'India']}
df1 = pd.DataFrame(data=d)
I = {'conc': [0.30, 0.25, 0.21, 0.37, 0.15]}
df2 = pd.DataFrame(data=I)
dfc = pd.concat([df1, df2], axis=1)
My output:
  country  conc
0  France  0.30
1  France  0.25
2   Japan  0.21
3   China  0.37
4   China  0.15
5  Canada   NaN
6  Canada   NaN
7   India   NaN
Expected output:
  country  conc
0  France  0.30
1  France  0.30
2   Japan  0.25
3   China  0.21
4   China  0.21
5  Canada  0.37
6  Canada  0.37
7   India  0.15
CodePudding user response:
You need to create a link between the values and the countries first.
df2["country"] = df1["country"].unique()
Then you can merge it with your original dataframe on that column.
pd.merge(df1, df2, on="country")
Be aware that this only works if df2 has exactly one value per distinct country (otherwise the assignment above raises a ValueError) and if the values are listed in the order in which each country first appears in df1.
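Putting the two steps together, a minimal sketch might look like the following. It assumes the data from the question; the names `countries` and `lookup` are only illustrative, and the explicit length check and the `how="left"` / `validate` arguments are additions that make the assumptions visible, not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({'country': ['France', 'France', 'Japan', 'China',
                                'China', 'Canada', 'Canada', 'India']})
df2 = pd.DataFrame({'conc': [0.30, 0.25, 0.21, 0.37, 0.15]})

# Distinct countries, in order of first appearance in df1.
countries = df1['country'].unique()

# Fail early with a clear message if there is not one value per country.
if len(countries) != len(df2):
    raise ValueError(
        f"expected {len(countries)} values, got {len(df2)}"
    )

# Build the country -> conc lookup without modifying df2 in place.
lookup = df2.assign(country=countries)

# A left merge keeps every row of df1 in its original order;
# validate= checks that each country occurs only once in the lookup.
dfc = df1.merge(lookup, on='country', how='left', validate='many_to_one')
print(dfc)
```

Using `assign` instead of writing to `df2["country"]` keeps the original `df2` unchanged, which can help if it is reused later.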
CodePudding user response:
I'd construct the dataframe directly, without intermediate dfs.
d = {'country': ['France', 'France','Japan','China', 'China','Canada','Canada','India']}
I = {'conc': [0.30, 0.25, 0.21, 0.37, 0.15]}
c = 'country'
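# Index the values by the distinct countries (in order of first appearance),
# then reindex by the full country list so repeated countries repeat their value.
# pd.Series(...).unique() is used because recent pandas versions deprecate calling pd.unique on a plain list.
dfc = pd.DataFrame(I, index=pd.Index(pd.Series(d[c]).unique(), name=c)).reindex(d[c]).reset_index()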
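As another possible variant (a different technique from the one-liner above, not something either answer proposed), `pd.factorize` numbers each country by order of first appearance, and those numbers can be used as positions into the values. This sketch assumes the same data and the same precondition: one value per distinct country, listed in order of first appearance.

```python
import pandas as pd

df1 = pd.DataFrame({'country': ['France', 'France', 'Japan', 'China',
                                'China', 'Canada', 'Canada', 'India']})
df2 = pd.DataFrame({'conc': [0.30, 0.25, 0.21, 0.37, 0.15]})

# codes[i] is the position of row i's country among the distinct
# countries, numbered in order of first appearance (France=0, Japan=1, ...).
codes, uniques = pd.factorize(df1['country'])

# Same precondition as before: exactly one value per distinct country.
assert len(uniques) == len(df2)

# Pick the value for each row by position; row order of df1 is untouched.
dfc = df1.assign(conc=df2['conc'].to_numpy()[codes])
print(dfc)
```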