How to store a column's unique values in another DataFrame and replace the original column with their index

I have a pandas data frame that looks like this.

   Column1  Column2  Column3
0     DS      4.5     Hard
1     ML      2.5     Medium
2     CS       4      Hard

I want to check whether any column contains duplicate values. If it does, I need to store that column's unique values in another DataFrame and replace each value in the original column with the index of the matching unique value.

In this case the output would be two DataFrames, as shown below:

df1:

   Column1  Column2  Column3
0     DS      4.5      0
1     ML      2.5      1
2     CS       4       0

df2:

   Column1
0    Hard
1    Medium

CodePudding user response：

My approach would be to first create a DataFrame containing only the unique values of Column3:

import pandas as pd
# df is the original DataFrame from the question
df1 = pd.DataFrame(df['Column3'].unique())
df1.columns = ['Column3']

df1 looks like this:

    Column3
0   Hard
1   Medium

Then we can replace the Column3 values with their indices using the pandas replace() method:

# Limit the replacement to Column3 so equal values in other columns are left alone
df2 = df.replace({'Column3': dict(zip(df1['Column3'], df1.index))})

Output:

  Column1  Column2  Column3
0      DS      4.5        0
1      ML      2.5        1
2      CS      4.0        0
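
The question asks about any column with duplicates, not only Column3, so here is a minimal sketch of one way to generalize the idea. It swaps in pd.factorize for the unique()/replace() pair, since factorize returns the integer codes and the unique values in one call. The function name encode_duplicated_columns and the dictionary of lookup tables are illustrative choices, not something from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": ["DS", "ML", "CS"],
    "Column2": [4.5, 2.5, 4.0],
    "Column3": ["Hard", "Medium", "Hard"],
})


def encode_duplicated_columns(frame):
    """Return (encoded copy of frame, {column name: lookup DataFrame}).

    Hypothetical helper: only columns containing at least one repeated
    value are encoded; all other columns are copied unchanged.
    """
    encoded = frame.copy()
    lookups = {}
    for col in frame.columns:
        if frame[col].duplicated().any():
            # codes[i] is the position of frame[col][i] in uniques,
            # numbered in order of first appearance.
            codes, uniques = pd.factorize(frame[col])
            encoded[col] = codes
            lookups[col] = pd.DataFrame({col: uniques})
    return encoded, lookups


encoded, lookups = encode_duplicated_columns(df)
print(encoded)
print(lookups["Column3"])
```

One thing to watch: by default factorize assigns the code -1 to missing values (NaN), so a column with gaps would need separate handling before using the codes as positions in the lookup table.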