Python: How do I use the if function when calling out a specific row?

Time:12-31

This is my data frame (labeled unp):

[image: data frame unp]

I want to fill the column GDP_Growth, which is currently blank, with the value of: unp.GDP_CAP - unp.GDP_CAP.shift(1)

This should apply only where 'TIME' is greater than 2014 (that is, not 2014 itself); otherwise the value should be N/A.

I tried using an if statement directly, but it's not working:

if unp.loc[unp['TIME'] > 2014]:
  unp['GDP_Growth'] = unp.GDP_CAP - unp.GDP_CAP.shift(1)
 else:
return 

CodePudding user response:

You should avoid the if statement when working with DataFrames, as row-by-row logic in Python is slower (less efficient) than a vectorized operation.
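A plain if also fails outright on a condition like the one in the question, because pandas will not reduce a whole DataFrame to a single True/False. A minimal sketch of that failure, assuming made-up values for TIME and GDP_CAP (the real data is only visible in the question's picture):

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; the values are invented.
unp = pd.DataFrame({"TIME": [2014, 2015, 2016],
                    "GDP_CAP": [100.0, 110.0, 125.0]})

try:
    # The filtered frame has no single truth value, so `if` raises ValueError.
    if unp.loc[unp["TIME"] > 2014]:
        pass
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised)
print(message)
```

The error message itself points at reductions such as .any() or .all(), but those answer a yes/no question about the whole frame and do not select rows.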

Instead, depending on what you need, you can use np.where().

Because the dataframe in the question is a picture (as opposed to text), I will show the standard usage, which looks like this:

import pandas as pd
import numpy as np

# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [5, 6, 7, 8, 9]})

# Use np.where() to select values from column 'A' where column 'B' is greater than 7
result = np.where(df['B'] > 7, df['A'], 0)

# Print the result
print(result)

The result of the above is this:

[0 0 0 4 5]

You will need to modify the above for your particular dataframe.
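As a rough sketch of that modification, the same np.where() pattern could be applied to the column names from the question. The values below are invented, and the sketch assumes the rows are sorted by TIME and describe a single series (for example one country), which the picture does not confirm:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the real values are only visible in the question's picture.
unp = pd.DataFrame({"TIME": [2014, 2015, 2016, 2017],
                    "GDP_CAP": [100.0, 110.0, 125.0, 120.0]})

# Change from the previous row; the first row has no predecessor, so it is NaN.
change = unp["GDP_CAP"] - unp["GDP_CAP"].shift(1)

# Keep the change where TIME > 2014, otherwise store NaN.
unp["GDP_Growth"] = np.where(unp["TIME"] > 2014, change, np.nan)

print(unp)
```

Using np.nan as the fallback keeps GDP_Growth a float column, so later arithmetic and aggregations skip the missing rows instead of failing on a text placeholder such as "N/A".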

CodePudding user response:

The title of the question is currently "Python: How do I use the if function when calling out a specific row?", which my answer does not address directly. Instead, we will compute the difference ('growth') and apply it selectively.

Explanation: With pandas, you generally want to use vectorized operations, which keep most of the computation out of the Python interpreter and in C-implemented functions.
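To illustrate the point, here is a small sketch that computes the same differences twice: once with an explicit Python loop and once with a single pandas call. The series values are hypothetical and chosen only for the comparison:

```python
import pandas as pd

# Hypothetical values, for illustration only.
gdp = pd.Series([0.0, 1.0, 3.0, 6.0])

# Row-by-row version: every step runs in the Python interpreter.
looped = [float("nan")]
for i in range(1, len(gdp)):
    looped.append(gdp.iloc[i] - gdp.iloc[i - 1])

# Vectorized version: a single call that pandas executes in compiled code.
vectorized = gdp.diff()

# Both approaches produce the same values, including the leading NaN.
print(pd.Series(looped).equals(vectorized))
```

On a handful of rows the difference is negligible, but the single call scales far better as the frame grows and is shorter to read.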

Solution:
A. Obtain the difference ('growth')
For your dataframe df = pd.DataFrame(...) you can obtain the change in value for a specific column with df['column_name'].diff(), e.g.

# This is your dataframe
In : df
Out:
        gdp growth  year
    0    0   <NA>  2000
    1    1   <NA>  2001
    2    2   <NA>  2002
    3    3   <NA>  2003
    4    4   <NA>  2004

In : df['gdp'].diff()
Out: 
0    NaN
1    1.0
2    1.0
3    1.0
4    1.0
Name: gdp, dtype: float64

B. Apply it to the 'growth' column

In : df['growth'] = df['gdp'].diff()
df
Out: 
   gdp  growth  year
0    0     NaN  2000
1    1     1.0  2001
2    2     1.0  2002
3    3     1.0  2003
4    4     1.0  2004

C. Selectively exclude values
If you then want specific years to have a certain value, apply them selectively

In : df.loc[df['year'] < 2003, 'growth'] = np.nan
df
Out:
   gdp  growth  year
0    0    NaN  2000
1    1    NaN  2001
2    2    NaN  2002
3    3    1.0  2003
4    4    1.0  2004
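Steps A to C can also be collapsed into one expression with Series.where(), which keeps values where a condition holds and fills NaN elsewhere. This is an alternative sketch rather than the method used above, built on the same toy frame:

```python
import pandas as pd

# Same toy frame as in the answer above.
df = pd.DataFrame({"gdp": [0, 1, 2, 3, 4],
                   "year": [2000, 2001, 2002, 2003, 2004]})

# diff() computes the change; where() keeps it only for rows that meet
# the condition and fills the remaining rows with NaN.
df["growth"] = df["gdp"].diff().where(df["year"] >= 2003)

print(df)
```

Because the result is assigned to the column in a single step, this form avoids modifying a selection of the frame after the fact.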