Home > Enterprise >  How to combine two columns into one column with no repeating values in python using pandas?
How to combine two columns into one column with no repeating values in python using pandas?

Time:11-06

How do I perform the following dataframe operation going from Dataframe A to dataframe B in pandas for python? I have tried pivot and groupby but I keep getting errors. Any support is greatly appreciated.

DataFrame A

Col A Col B
100 1
100 2
200 3
200 4

DataFrame B

Col A & B
1
2
100
3
4
200

CodePudding user response:

One option using groupby:

out = (df
   .groupby('Col A', group_keys=False, sort=False)
   .apply(lambda d: d.iloc[:, ::-1].unstack().drop_duplicates())
   .reset_index(drop=True).to_frame(name='Col A&B')
)

Another with concat:

out = (pd
   .concat([df['Col B'], df['Col A'].drop_duplicates(keep='last')])
   .sort_index().reset_index(drop=True).to_frame(name='Col A&B')
)

output:

   Col A&B
0        1
1        2
2      100
3        3
4        4
5      200

If order does not matter, you can stack:

out = df.stack().drop_duplicates().reset_index(drop=True).to_frame(name='Col A&B')

output:

   Col A&B
0      100
1        1
2        2
3      200
4        3
5        4

CodePudding user response:

Another possible solution:

out = pd.DataFrame({'Col A&B': np.unique(df)})
out

Output:

   Col A&B
0        1
1        2
2        3
3        4
4      100
5      200
  • Related