What is an efficient way to create a new dataframe column and populate its values in Python?

I have a pair of columns, "car_model" and "year", that I need to pass to a function as a tuple; the function returns a price (float).

How can I iterate over the dataframe rows, pass the "car_model" and "year" values to the function, and store the returned value in a new column "price"?

I was thinking about:

model_year = CAR[["car_model", "year"]]

for x in model_year.to_numpy():
    model_year_tuple = tuple(x)
    price = calculate_price(model_year_tuple)
    # How do I add this to the column? The line below overwrites the whole
    # column on every iteration, so it always ends up with the last calculated price.
    CAR['price'] = price

CodePudding user response：

We can build the tuples row by row and map the function over them:

CAR['price'] = model_year.agg(tuple, axis=1).map(calculate_price)

CodePudding user response：

Try apply:

CAR['price'] = model_year.apply(lambda x: calculate_price(tuple(x)), axis=1)

Or list comprehension:

CAR['price'] = [calculate_price(x) for x in zip(CAR['car_model'], CAR['year'])]
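To check that both forms give the same column, here is a minimal self-contained sketch. The body of calculate_price, the BASE_PRICES table and the sample rows are made up for illustration; only the names CAR, car_model, year and price come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the real calculate_price (its logic is not shown
# in the question). It takes a (car_model, year) tuple and returns a float.
BASE_PRICES = {"civic": 20000.0, "corolla": 19000.0}

def calculate_price(model_year):
    model, year = model_year
    age = 2024 - year
    return BASE_PRICES[model] - 1000.0 * age

# Made-up sample data using the column names from the question.
CAR = pd.DataFrame({
    "car_model": ["civic", "corolla", "civic"],
    "year": [2020, 2018, 2023],
})

# Row-wise apply: each row is converted to a tuple and passed to the function.
by_apply = CAR[["car_model", "year"]].apply(
    lambda row: calculate_price(tuple(row)), axis=1
)

# List comprehension: zip pairs the two columns without building row objects.
by_zip = [calculate_price(pair) for pair in zip(CAR["car_model"], CAR["year"])]

CAR["price"] = by_zip
print(CAR)
```

On larger frames the zip version is usually the faster of the two, because apply with axis=1 builds a Series object for every row.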

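That said, you should try to rewrite your calculate_price function so that it accepts whole columns (NumPy arrays) instead of plain Python tuples. The prices can then be computed for all rows at once rather than one row at a time.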
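As a rough sketch of what such a rewrite could look like: the function below takes the two columns directly and uses only element-wise operations. The pricing formula, BASE_PRICES and the name calculate_price_vectorized are hypothetical, since the real calculate_price is not shown; whether a rewrite like this is possible depends on what that function actually does.

```python
import pandas as pd

# Hypothetical lookup table and formula; the real pricing logic is not shown.
BASE_PRICES = {"civic": 20000.0, "corolla": 19000.0}

def calculate_price_vectorized(models, years, current_year=2024):
    """Take whole columns (pandas Series) and return a Series of prices."""
    base = models.map(BASE_PRICES)   # dictionary lookup for every row at once
    age = current_year - years       # element-wise arithmetic, no Python loop
    return base - 1000.0 * age

# Made-up sample data using the column names from the question.
CAR = pd.DataFrame({
    "car_model": ["civic", "corolla", "civic"],
    "year": [2020, 2018, 2023],
})

CAR["price"] = calculate_price_vectorized(CAR["car_model"], CAR["year"])
print(CAR)
```

The sketch passes pandas Series; calling .to_numpy() on the year column would give a plain NumPy array for the arithmetic part if that is preferred.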

CodePudding user response：

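This should work (note axis=1, so that the function is called once per row rather than once per column):

CAR['price'] = CAR.apply(lambda x: calculate_price((x['car_model'], x['year'])), axis=1)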