From the given table of inflation rates below, I want to obtain the countries with negative inflation rates for two consecutive years.
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0
Argentina NaN 33.0 56.0 -22.0 15.0 8.0
Suriname NaN -68.0 -37.0 695.0 56.0 13.0
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0
Lebanon NaN 2.0 -36.0 2826.0 82.0 39.0
Sudan NaN 96.0 -19.0 220.0 19.0 34.0
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0
I have seen some solutions in SO that use list comprehension or loops. I wonder if this task is possible without them.
I attempted to convert the dataframe into 1s and 0s, in which 1.0 indicates a negative inflation.
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 0.0 1.0 1.0 0.0 0.0
Argentina NaN 0.0 0.0 1.0 0.0 0.0
Suriname NaN 1.0 1.0 0.0 0.0 0.0
Zimbabwe NaN 0.0 0.0 0.0 1.0 1.0
Lebanon NaN 0.0 1.0 0.0 0.0 0.0
Sudan NaN 0.0 1.0 0.0 0.0 0.0
Venezuela NaN 0.0 1.0 1.0 0.0 1.0
However, I am stuck at this point. I tried to use np.prod
function but this returns 0 if at least one column as 0.0 data.
Any ideas about how to solve this problem?
CodePudding user response:
You can first set an integer mask for the negative values (1 means negative). Then compute a rolling min on the axis 1, of the min is 1 all values are. This is generalizable to any number of consecutive columns.
N = 2
m1 = df.lt(0).astype(int)
m2 = m.rolling(N, axis=1).min().eq(1).any(1)
df[m2]
Output:
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0
Suriname NaN -68.0 -37.0 695.0 56.0 13.0
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0
NB. One needs to work with integers as rolling is currently limited to numeric types
Alternative with a single mask for N=2
m = df.lt(0)
df[(m&m.shift(axis=1)).any(1)]
CodePudding user response:
Try this:
match = (df.lt(0) & df.shift(axis=1).lt(0)).any(axis=1)
df[match]
How it works:
df.lt(0)
: current year inflation is less than 0df.shift(axis=1).lt(0)
: previous year inflation is less than 0.any(axis=1)
: any such occurrence in the country.
CodePudding user response:
Given your dataframe, this is what would work for me:
- set the Country as an index so I just have digits in my df values
- Define new column for check of 'Two sequential negatives' in columns using df.shift(axis=1).
So it would look like:
df.set_index('Country',inplace=True)
df['TwoNegatives'] = ((df.values < 0) & ((df.shift(axis=1)).values <0)).any(axis=1)
CodePudding user response:
Try with rolling
out = df[df.le(0).T.rolling(window=2).sum().ge(2).any()]
Out[15]:
2017 2018 2019 2020 2021 2022
Country
Turkey NaN 47.0 -7.0 -19.0 38.0 260.0
Suriname NaN -68.0 -37.0 695.0 56.0 13.0
Zimbabwe NaN 106.0 2306.0 118.0 -83.0 -21.0
Venezuela NaN 1482.0 -70.0 -88.0 15.0 -89.0