Can someone help me out to combine/Union 2 dataframe who has different shapes with different columns names.
As you can see 1 dataframe is with 3 columns and the 2nd dataframe contains only 1 column.
I want to put dataframe 02 clientid column below to dataframe 01 Turtlemint profile_id column and create new column with vertical name and mentioned value as Life against all the value we put under the 1st dataframe columns like I mentioned inside the image.
How can I achieve this?
CodePudding user response:
Using append
while changing the column name of your second dataframe, you can achieve this.
A simple, but similar example:
df1 = pd.DataFrame(data=np.array([[1,2,3], ['A', 'B', 'C']]).T, columns=['nr', 'lt'])
df2 = pd.DataFrame(data=['D', 'E', 'F'], columns=['lt2'])
df3 = df1.append(df2.rename(columns={'lt2':'lt'}))
df3.loc[df3['nr'].isnull(), 'Vertical'] = 'Life'
Dataframes:
df1:
nr lt
0 1 A
1 2 B
2 3 C
df2:
nr lt
0 1 A
1 2 B
2 3 C
0 NaN D
1 NaN E
2 NaN F
Output:
nr lt Vertical
0 1 A NaN
1 2 B NaN
2 3 C NaN
0 NaN D Life
1 NaN E Life
2 NaN F Life
CodePudding user response:
Here I have change the second DF column name same as first DF and then outer joined them, so we can the first DF extra columns.
Code:
df1 = pd.concat([df1, df2.rename(columns={'clientid': "Turtlemint profile_id"})], join='outer', axis=0).reset_index(drop=True)
above we haven't changed the second DF column name permeant so still I have use the column name clientid. Here the simple logic is if the ID exist in both table give value Life or else None.
ADD column:
df1['Vertical'] = np.where(df1['Turtlemint profile_id'].isin(df2['clientid']), 'LIFE', None)