Updating a column based on a condition, returning either itself or another column's value

Time:11-01

I have spent 4 hours trying to solve this problem and cannot get it working. I have nowhere else to turn, so any help would be most appreciated.

I have

  • a column called tdf["current"]
  • for any row with tdf["current"] == "£ 0.00"
  • I need to replace tdf["current"] with that row's tdf["old"] value

Attempt 1

tdf['current'] = tdf['current'].apply(lambda x: tdf['old'] if x == "£ 0.00" else tdf['current'])

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Attempt 2

for idx, row in tdf.iterrows():
    row['current'] = row['current'].apply(lambda x: row['old'] if x == "£ 0.00" else row['current'])

AttributeError: 'str' object has no attribute 'apply'

Attempt 3

for x in range(int(tdf.shape[0])):
    if tdf["old"].values[x] == "£ 0.00":
        print(tdf.at[tdf.index[x], "old"])

This returns duplicate rows, as many as the whole df has, for each row. It is pretty broken, so I gave up on this route.

Attempt 4 (getting lost in my head now)

for x in range(int(tdf.shape[0])):
    if tdf["old"].values[x] == "£ 0.00":
        print(tdf.at[tdf.index[x], "old"])
    else:
        print(False)

This looked like it was going to work: it runs fine for the False rows, but when it reaches a row where the condition is true it raises ValueError: Invalid call for scalar access (getting)!

Attempt 5 and beyond

Variations of loops and using .loc[] and .at[]

CodePudding user response:

Purely on the description, it looks like you could use mask (or where):

tdf["current"] = tdf["current"].mask(tdf["current"].eq("£ 0.00"), tdf["old"])
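
To see what that one-liner does, here is a minimal sketch on a made-up four-row frame. The sample values are assumptions for illustration only; the real tdf will differ. It also shows the where equivalent, which keeps values where the condition is true, so the condition has to be inverted.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
tdf = pd.DataFrame({
    "current": ["£ 1.50", "£ 0.00", "£ 3.20", "£ 0.00"],
    "old": ["£ 1.00", "£ 2.75", "£ 3.00", "£ 4.10"],
})

# Boolean Series: True on the rows that should take their value from "old".
is_zero = tdf["current"].eq("£ 0.00")

# mask replaces the values where the condition is True.
masked = tdf["current"].mask(is_zero, tdf["old"])

# where keeps the values where the condition is True, hence the ~ inversion.
kept = tdf["current"].where(~is_zero, tdf["old"])

tdf["current"] = masked
print(tdf)
```

Both calls pair each row with the "old" value from the same row, so no explicit loop is needed.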

NB: with pandas you often want to avoid loops and apply. For most tasks there is likely an efficient way to do it without loops.
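
As another loop-free option, boolean indexing with .loc should give the same result. This is a sketch under assumptions rather than the method from the answer above: the sample data is invented, and it assumes the frame has a unique index, since the assignment aligns on index labels.

```python
import pandas as pd

# Hypothetical sample data with a default (unique) RangeIndex.
tdf = pd.DataFrame({
    "current": ["£ 0.00", "£ 5.00", "£ 0.00"],
    "old": ["£ 9.99", "£ 4.00", "£ 7.25"],
})

# Select the rows to fix, then assign the matching "old" values in one step.
is_zero = tdf["current"] == "£ 0.00"
tdf.loc[is_zero, "current"] = tdf.loc[is_zero, "old"]

print(tdf)
```

Rows where the condition is False are left untouched, which matches the "return itself" half of the requirement.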
