for loop to add new values to column

Time:03-26

I am trying to add a new column to my dataframe that contains the side length (or the radius, for circles) computed from the shape and area of each row.

My original dataset looks like this:

df:

    shape     color   area  
0   square    yellow  9409.0    
1   circle    yellow  4071.5    
2   triangle  blue    2028.0    
3   square    blue    3025.0

But when I coded it like this:

df['side'] = 0
for x in df['shape']:
    if x == 'square':
        df['side'] = np.rint(np.sqrt(df['area'])).astype(int)
    elif x == 'triangle':
        df['side'] = np.rint(np.sqrt((4 * df['area'])/np.sqrt(3))).astype(int)
    elif x == 'circle':
        df['side'] = np.rint(np.sqrt(df['area']/np.pi)).astype(int)

I got:

    shape     color   area    side
0   square    yellow  9409.0  55
1   circle    yellow  4071.5  36    
2   triangle  blue    2028.0  25    
3   square    blue    3025.0  31    

It looks like the loop is applying the formula from the elif x == 'circle' branch to the side column for every row.

CodePudding user response:

Looks like it's a good use case for numpy.select, where you select values depending on which shape it is:

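import numpy as np

# Conditions and choices are paired by position; for each row, the first
# condition that is True picks the corresponding formula.
df['side'] = np.select([df['shape']=='square',
                        df['shape']=='circle',
                        df['shape']=='triangle'],
                       [np.rint(np.sqrt(df['area'])),
                        np.rint(np.sqrt(df['area']/np.pi)),
                        np.rint(np.sqrt((4 * df['area'])/np.sqrt(3)))],
                       # np.nan marks unmatched shapes; the int cast assumes there are none
                       np.nan).astype(int)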

This can be written more concisely by mapping each shape to a multiplier and then using pandas vectorized operations:

mapping = {'square': 1, 'circle': 1 / np.pi, 'triangle': 4 / np.sqrt(3)}
df['side'] = df['shape'].map(mapping).mul(df['area']).pow(1/2).round(0).astype(int)

Output:

      shape   color    area  side
0    square  yellow  9409.0    97
1    circle  yellow  4071.5    36
2  triangle    blue  2028.0    68
3    square    blue  3025.0    55
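
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample rows from the question and applies the mapping approach. The fifth "hexagon" row is hypothetical and is there only to show what happens to a shape that is missing from the mapping; as far as I know, pandas refuses to cast NaN to a plain int, so the sketch uses the nullable "Int64" dtype instead of astype(int).

```python
import numpy as np
import pandas as pd

# First four rows are copied from the question; the "hexagon" row is a
# hypothetical addition to show the unmatched case.
df = pd.DataFrame({
    "shape": ["square", "circle", "triangle", "square", "hexagon"],
    "color": ["yellow", "yellow", "blue", "blue", "red"],
    "area": [9409.0, 4071.5, 2028.0, 3025.0, 100.0],
})

# For each shape, side**2 == factor * area (the triangle factor is the one
# used in the question, i.e. an equilateral triangle).
mapping = {"square": 1, "circle": 1 / np.pi, "triangle": 4 / np.sqrt(3)}

# Shapes that are not in the mapping become NaN and stay NaN through the math.
side = df["shape"].map(mapping).mul(df["area"]).pow(0.5).round(0)

# The nullable integer dtype stores the NaN rows as <NA>.
df["side"] = side.astype("Int64")

print(df)
```

If every shape is guaranteed to be in the mapping, the plain astype(int) from the answer is enough.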

CodePudding user response:

I see you were assigning to the whole column on every iteration. Instead, you can iterate over the rows with the DataFrame's iterrows() method and set one cell at a time:

for i, row in df.iterrows():
    if row['shape'] == 'square':
        df.at[i,'side'] = np.rint(np.sqrt(row['area'])).astype(int)
    elif row['shape'] == 'triangle':
        df.at[i,'side'] = np.rint(np.sqrt((4 * row['area'])/np.sqrt(3))).astype(int)
    elif row['shape'] == 'circle':
        df.at[i,'side'] = np.rint(np.sqrt(row['area']/np.pi)).astype(int)

Note that each assignment targets a single cell: the 'side' column of the row at index i.
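
To see why the original loop gives every row the same formula, here is a small sketch that reruns it on the four rows shown in the question. Each branch assigns the entire column, so whichever shape comes last in the dataframe decides the result for all rows. With these four rows the last shape is square; the circle values in the question's output would be consistent with the full dataset ending in a circle, but that is only a guess, since only four rows are shown.

```python
import numpy as np
import pandas as pd

# The four rows shown in the question (color omitted, it is not used).
df = pd.DataFrame({
    "shape": ["square", "circle", "triangle", "square"],
    "area": [9409.0, 4071.5, 2028.0, 3025.0],
})

df["side"] = 0
for x in df["shape"]:
    # Every branch writes the WHOLE column, not just the current row,
    # so each pass overwrites what the previous pass wrote.
    if x == "square":
        df["side"] = np.rint(np.sqrt(df["area"])).astype(int)
    elif x == "triangle":
        df["side"] = np.rint(np.sqrt((4 * df["area"]) / np.sqrt(3))).astype(int)
    elif x == "circle":
        df["side"] = np.rint(np.sqrt(df["area"] / np.pi)).astype(int)

# The last shape is "square", so all rows hold rint(sqrt(area)).
print(df["side"].tolist())  # [97, 64, 45, 55]
```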

Also, the suggestion by @enke above will work just fine.
