What is an efficient way to create a new DataFrame column and populate its values in Python?

Time:10-14

I have a pair of columns, "car_model" and "year", whose values I need to pass to a function as a tuple; the function returns a price (a float).

How can I iterate over the DataFrame rows, pass the "car_model" and "year" values to the function, and store the returned value in a new column "price"?

I was thinking about:

model_year = CAR[["car_model", "year"]]

for x in model_year.to_numpy():
    model_year_tuple = tuple(x)
    price = calculate_price(model_year_tuple)
    # How do I add to the column? The line below always uses the last calculated price.
    CAR['price'] = price

CodePudding user response:

We can do:

CAR['price'] = model_year.agg(tuple, axis=1).map(calculate_price)
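
As a rough sketch of what this one-liner does: the agg call collapses each row into a tuple, and map then calls the function once per tuple. The calculate_price body, the base prices, and the sample rows below are made-up stand-ins for illustration, not the asker's real function or data.

```python
import pandas as pd

# Hypothetical stand-in for the asker's function: takes a (car_model, year) tuple.
def calculate_price(model_year):
    car_model, year = model_year
    base = {"civic": 20000.0, "golf": 22000.0}  # made-up base prices
    return base[car_model] - 1000.0 * (2024 - year)

CAR = pd.DataFrame({
    "car_model": ["civic", "golf", "civic"],
    "year": [2020, 2022, 2024],
})

# Step 1: build one tuple per row from the two columns.
row_tuples = CAR[["car_model", "year"]].agg(tuple, axis=1)

# Step 2: call the function once per tuple and store the results on CAR.
CAR["price"] = row_tuples.map(calculate_price)
print(CAR)
```

Assigning the result to CAR rather than to the model_year slice avoids writing to a copy of the original frame.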

CodePudding user response:

Try apply:

CAR['price'] = model_year.apply(lambda x: calculate_price(tuple(x)), axis=1)

Or list comprehension:

CAR['price'] = [calculate_price(x) for x in zip(CAR['car_model'], CAR['year'])]

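That said, you should try to rewrite your calculate_price function so that it accepts NumPy arrays (whole columns) instead of plain Python tuples. Both options above still call the function once per row, so a vectorized function is usually much faster on large frames.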
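
To illustrate that last suggestion, here is a minimal sketch that assumes the pricing logic can be written as column arithmetic. The name calculate_price_vectorized, the lookup table, and the formula are all hypothetical; whether the real calculate_price can be rewritten this way depends on what it actually does.

```python
import pandas as pd

# Made-up base prices, used only for this illustration.
BASE_PRICE = {"civic": 20000.0, "golf": 22000.0}

def calculate_price_vectorized(car_model, year):
    """Hypothetical pricing on whole columns: two Series in, one NumPy array out."""
    base = car_model.map(BASE_PRICE).to_numpy(dtype=float)  # per-row base price
    age = 2024 - year.to_numpy()                            # per-row age in years
    return base - 1000.0 * age                              # element-wise, no Python loop

CAR = pd.DataFrame({
    "car_model": ["civic", "golf", "civic"],
    "year": [2020, 2022, 2024],
})

# A single call handles every row at once.
CAR["price"] = calculate_price_vectorized(CAR["car_model"], CAR["year"])
print(CAR)
```

The function is called once for the whole frame instead of once per row, which is where the speedup would come from.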

CodePudding user response:

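This should work (note axis=1, so that the lambda receives one row at a time rather than one column):

CAR['price'] = CAR.apply(lambda x: calculate_price((x['car_model'], x['year'])), axis=1)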