Iterating over rows of a dataframe, and assigning multiple calculated values to the rows

I have a df:

import pandas as pd

dict1 = {'A': 1, 'B': 2, 'C': 3, 'D': 4}
dict2 = {'A': 10, 'B': 20, 'C': 30, 'D': 40}
dict3 = {'A': 100, 'B': 200, 'C': 300, 'D': 400}
df = pd.DataFrame([dict1, dict2, dict3])

(I'm working from home, I can't copy paste the output here, sorry)

Now, I would like to 'enlarge' df, then assign calculated values to the new columns.

df[['new_col1', 'new_col2']] = None
for idx, row in df.iterrows():
    # insert the calculated values for `new_col1` and `new_col2` here

I think I do need to iterate over the rows, as the calculation is based on the values of the rows. I can of course manually insert the values for each cell one by one using .at, but I have hundreds of thousands of rows, and ~20 calculated values to fill in. How can I do this?

I tried:

dictt = {'new_col1': 1, 'new_col2': 2}
df.iloc[0] = df.iloc[0].map(dictt)

But then if I check what df.iloc[0] is, it's a row of NaN. I also tried:

df.iloc[0] = df.iloc[0].replace(dictt)

But that didn't do anything. Also, if there is a better or more proper way to do operations like this, I'm all ears.

CodePudding user response:

If the main bottleneck is a heavy, complicated function rather than pandas itself, here is one way to call it for each row with DataFrame.apply:

def f(a, b):
    return pd.Series({'new_col1': 1 + a, 'new_col2': 2 + b})

df = df.join(df.apply(lambda x: f(x.A, x.B), axis=1))
print(df)
     A    B    C    D  new_col1  new_col2
0    1    2    3    4         2         4
1   10   20   30   40        11        22
2  100  200  300  400       101       202
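
With around 20 calculated values per row, as in the question, it may be easier to pass the whole row to the function and return a dict keyed by the new column names. The following is only a minimal sketch: `compute` is a hypothetical function, and its two formulas are placeholders rather than the asker's real calculation.

```python
import pandas as pd

df = pd.DataFrame([
    {'A': 1, 'B': 2, 'C': 3, 'D': 4},
    {'A': 10, 'B': 20, 'C': 30, 'D': 40},
    {'A': 100, 'B': 200, 'C': 300, 'D': 400},
])

def compute(row):
    # Hypothetical stand-in for the real per-row calculation;
    # each key of the returned dict becomes one new column.
    return {'new_col1': row.A + row.B, 'new_col2': row.C * row.D}

# Call the function once per row, then build all new columns in one step.
new_cols = pd.DataFrame(
    [compute(row) for row in df.itertuples(index=False)],
    index=df.index,
)
result = df.join(new_cols)
print(result)
```

Collecting the dicts in a list and building one DataFrame from them avoids writing cells one at a time with .at, and adding another calculated value only means adding another key to the dict.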

Another idea:

def f(a, b):
    return (1 + a, 2 + b)

df[['col1','col2']] = df.apply(lambda x: f(x.A, x.B), axis=1, result_type='expand')
print(df)
     A    B    C    D  col1  col2
0    1    2    3    4     2     4
1   10   20   30   40    11    22
2  100  200  300  400   101   202
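
If the calculation is plain arithmetic on whole columns, as in the toy `f` above, a per-row call may not be needed at all. The sketch below assumes exactly that case and does not apply to a function that genuinely needs Python-level logic for each row:

```python
import pandas as pd

df = pd.DataFrame([
    {'A': 1, 'B': 2, 'C': 3, 'D': 4},
    {'A': 10, 'B': 20, 'C': 30, 'D': 40},
    {'A': 100, 'B': 200, 'C': 300, 'D': 400},
])

# Column-wise arithmetic: each expression works on a whole column at once,
# so no explicit loop or apply over rows is involved.
result = df.assign(new_col1=1 + df['A'], new_col2=2 + df['B'])
print(result)
```

On hundreds of thousands of rows, column-wise expressions like these are usually much faster than apply with axis=1, which calls a Python function once per row.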