Append column to data frame with text based on another column value

Time:08-07

In this DataFrame, how can I append a column named "class_name" containing a text string that is based on the value in another column?

x y z not used Label
-3.8481877 -0.47685334 0.63422906 1.0396314 1
-2.320888 0.65347993 1.1519914 0.12997247 1
1.5827686 1.4119303 -1.7410104 -4.6962333 2
-0.1337152 0.13315737 -1.6648949 -1.4205348 2
-0.4028037 1.332986 1.3618442 0.3292255 1
-0.015517877 1.346349 1.4083523 0.87017965 0
-0.2669228 0.5478992 -0.06730786 -1.5959451 0
-0.03318152 0.3263167 -2.116833 -5.4616213 1

These are the values the new column should take, based on the values in the 'Label' column:

0 == 'avocados'
1 == 'apples'
2 == 'grapes'

This is my code so far:

import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
import pandas as pd

df = pd.read_csv('embed1_2.csv')

df.loc[df.y_train == 103, 'class_name'] = 'avocados'
df.loc[df.y_train == 103, 'class_name'] = 'apples'
df.loc[df.y_train == 103, 'class_name'] = 'grapes'

How to get the appended column to show up with the converted text?

Thanks for your help!

CodePudding user response:

Create a dictionary that maps each label to its name, then use map on the 'Label' column to create the new column:

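# Map each numeric label to its class name
# (the variable is not called `dict`, which would shadow the built-in type).
label_to_name = {
    0: 'avocados',
    1: 'apples',
    2: 'grapes',
}
df['class_name'] = df['Label'].map(label_to_name)
df
            x           y          z    not used    Label   class_name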
0   -3.848188   -0.476853   0.634229    1.039631    1   apples
1   -2.320888   0.653480    1.151991    0.129972    1   apples
2   1.582769    1.411930    -1.741010   -4.696233   2   grapes
3   -0.133715   0.133157    -1.664895   -1.420535   2   grapes
4   -0.402804   1.332986    1.361844    0.329226    1   apples
5   -0.015518   1.346349    1.408352    0.870180    0   avocados
6   -0.266923   0.547899    -0.067308   -1.595945   0   avocados
7   -0.033182   0.326317    -2.116833   -5.461621   1   apples
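
For comparison, here is a minimal self-contained sketch of both approaches. It uses a few rows copied from the question instead of 'embed1_2.csv'. The names label_to_name and class_name_loc are made up for illustration. The second option assumes the three .loc lines in the question were meant to test the 'Label' column against 0, 1 and 2. As posted, they all test df.y_train == 103, and y_train is not among the columns shown.

```python
import pandas as pd

# A few rows copied from the question (only two columns kept for brevity).
df = pd.DataFrame({
    "x": [-3.8481877, -2.320888, 1.5827686, -0.015517877],
    "Label": [1, 1, 2, 0],
})

# Hypothetical name for the lookup table of label -> class name.
label_to_name = {0: "avocados", 1: "apples", 2: "grapes"}

# Option 1: dictionary lookup with Series.map.
# Any label that is missing from the dictionary becomes NaN.
df["class_name"] = df["Label"].map(label_to_name)

# Option 2: the boolean-mask style from the question, comparing the
# 'Label' column against a different value for each class name.
df["class_name_loc"] = None
for label, name in label_to_name.items():
    df.loc[df["Label"] == label, "class_name_loc"] = name

print(df)
```

Both columns end up with the same text. The map version is shorter and does not need one line per class.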