Why can't I update dataframe using loc?

Time:07-24

I'm trying to update the Range column in this dataframe:

            Range  percentchange
Date                                                                    
2014-01-06  0     -13.113459
2014-01-07  0       6.693942
2014-01-08  0      -0.191734
2014-01-09  0       2.219851
2014-01-10  0       4.959282

The following code leaves the df unaltered when I print it. I want to update the Range column with a number at each row. What am I doing wrong?

for i, row in newBTC.iterrows():
    df_column_percentchange = newBTC.loc[i, 'percentchange']
    df_column_range = newBTC.loc[i, "Range"]
    if -100 <= df_column_percentchange <= -11:
        df_column_range = -10
    if -11 <= df_column_percentchange <= -9:
        df_column_range = -9
    if -9 <= df_column_percentchange <= -7:
        df_column_range = -8
    #etc...

CodePudding user response:

@TimRoberts and @BeRT2me gave you the correct explanation. However, even if you fix that, you could do much better with pd.cut:

import pandas as pd

# For demo purposes only; adapt the bins and labels to your real case
bins = [-20, -1, 1, 20]
labels = [-10, 0, 10]
df['Range'] = pd.cut(df['percentchange'], bins, labels=labels).astype(int)
print(df)

# Output
         Date  Range  percentchange
0  2014-01-06    -10     -13.113459
1  2014-01-07     10       6.693942
2  2014-01-08      0      -0.191734
3  2014-01-09     10       2.219851
4  2014-01-10     10       4.959282
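
To get closer to the thresholds in the question, the same idea might look like the sketch below. Only the three ranges shown in the question are encoded. Mapping everything outside them to 0 is an assumption, and so are the two extra sample values (-10.0 and -8.0). Note that pd.cut uses non-overlapping bins with the right edge inclusive by default, so a value sitting exactly on a boundary is handled differently from the overlapping if checks in the question.

```python
import pandas as pd

# Sample values from the question, plus two made-up ones (-10.0, -8.0)
# so that every bin is exercised.
df = pd.DataFrame(
    {"percentchange": [-13.113459, 6.693942, -0.191734, 2.219851, 4.959282, -10.0, -8.0]}
)

# Bin edges and labels taken from the three ranges shown in the question.
bins = [-100, -11, -9, -7]
labels = [-10, -9, -8]

# Values outside the bins come back as NaN, so go through float
# and fill with 0 (an assumed default) before casting to int.
cut = pd.cut(df["percentchange"], bins=bins, labels=labels, include_lowest=True)
df["Range"] = cut.astype(float).fillna(0).astype(int)

print(df)
```
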

CodePudding user response:

iterrows is a trap. It's almost never the best way, and it causes more problems than it's worth.

This still isn't a good way of doing it, but if you can't find a good pandas-specific way to do something, at least implement the looping like this:

def set_range(x):
    # Thresholds changed to show more interesting results;
    # adjust them to whatever you need.
    if -100 <= x <= -11:
        return -10
    elif -11 <= x <= 0:
        return -9
    elif 0 <= x <= 5:
        return -8
    else:
        return 0

df['Range'] = df.percentchange.apply(set_range)
print(df)

Output:

            Range  percentchange
Date
2014-01-06    -10     -13.113459
2014-01-07      0       6.693942
2014-01-08     -9      -0.191734
2014-01-09     -8       2.219851
2014-01-10     -8       4.959282

@Corralien has a wonderful example of a pandas-specific way to approach your problem.
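
For comparison, the same if/elif ladder could also be written without a Python-level function call per row. The sketch below swaps in numpy.select, which none of the answers here use. It picks the first matching condition, much like elif does. The frame is rebuilt from the question's sample values, and pc is just an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"percentchange": [-13.113459, 6.693942, -0.191734, 2.219851, 4.959282]},
    index=pd.to_datetime(
        ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-09", "2014-01-10"]
    ),
)

pc = df["percentchange"]

# Series.between is inclusive on both ends by default, matching the
# chained comparisons in set_range. The first true condition wins.
conditions = [pc.between(-100, -11), pc.between(-11, 0), pc.between(0, 5)]
choices = [-10, -9, -8]

# default=0 plays the role of the final else branch.
df["Range"] = np.select(conditions, choices, default=0)

print(df)
```
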

CodePudding user response:

This should work for you. It reads the values from row inside the loop and writes the result back to the dataframe with .at:

import pandas as pd

newBTC = pd.DataFrame(
    {
        'Date': ['2014-01-06', '2014-01-07', '2014-01-08', '2014-01-09', '2014-01-10'], 
        'Range': [0]*5, 
        'percentchange': [-13.113459, 6.693942, -0.191734, 2.219851, 4.959282]
    }
)
newBTC['Date'] = pd.to_datetime(newBTC['Date'], errors='coerce')
newBTC.set_index('Date', inplace=True)

for i, row in newBTC.iterrows():
    df_column_percentchange = row.percentchange
    df_column_range = row.Range
    if -100 <= df_column_percentchange <= -11:
        df_column_range = -10
    if -11 <= df_column_percentchange <= -9:
        df_column_range = -9
    if -9 <= df_column_percentchange <= -7:
        df_column_range = -8
    if 4 <= df_column_percentchange <= 7:
        df_column_range = 99 # test value
    newBTC.at[i, 'Range'] = df_column_range

print(newBTC)

Result:

            Range  percentchange
Date                            
2014-01-06    -10     -13.113459
2014-01-07     99       6.693942
2014-01-08      0      -0.191734
2014-01-09      0       2.219851
2014-01-10     99       4.959282
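
The line that makes the difference is the final newBTC.at[i, 'Range'] = df_column_range. In the question's loop, the value read with .loc is copied into a local variable, and reassigning that variable never touches the dataframe. A minimal sketch of the distinction, using a made-up two-row frame (df and its values are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"Range": [0, 0], "percentchange": [-13.1, 6.7]})

# Reading a cell copies the scalar into a local name...
value = df.loc[0, "Range"]
# ...so rebinding that name leaves the frame untouched.
value = -10
unchanged = df["Range"].tolist()

# Assigning through .loc (or .at) writes into the frame itself.
df.loc[0, "Range"] = -10
changed = df["Range"].tolist()

print(unchanged, changed)
```
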