I have a pandas DataFrame with one column:
index, min_or_max
0, np.nan
1, np.nan
2, min
3, np.nan
4, np.nan
5, max
6, np.nan
7, np.nan
8, np.nan
9, max
10, np.nan
11, np.nan
12, min
I want to create col2 such that:
index, min_or_max, col2
0, np.nan, np.nan
1, np.nan, np.nan
2, min, min
3, np.nan, np.nan
4, np.nan, np.nan
5, max, np.nan
6, np.nan, np.nan
7, np.nan, np.nan
8, np.nan, np.nan
9, max, max
10, np.nan, np.nan
11, np.nan, np.nan
12, min, min
How can I detect consecutive repeats of the same value in the column (ignoring the NaN rows in between) and keep only the last one?
- The same value (max or min) can occur several times in a row, but at most 10 times.
- Either value, min or max, can be the one that repeats.
EDIT:
I tried this:
df1 = df[df['min_or_max'].ne(df['min_or_max'].shift(-1))]
df1["col2"] = df1["min_or_max"]
df1 = df1.reset_index()
df1 = df1[["index", "col2"]]
df = df.reset_index()
df = df[["index"]]
df = df.merge(df1, on="index", how="left")
EDIT:
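My proposed solution:
# Drop the NaN rows so that equal labels become adjacent
df1 = df.dropna(subset=['min_or_max'])
# Keep only the last row of each run of equal labels
df1 = df1[df1['min_or_max'].ne(df1['min_or_max'].shift(-1))]
df1 = df1.reset_index().rename(columns={"min_or_max": "col2"})
df1 = df1[["index", "col2"]]
# Left-join back onto the full frame; rows without a match get NaN in col2
df = df.reset_index()
df = df[["index", "min_or_max"]]
df = df.merge(df1, on="index", how="left")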
CodePudding user response:
Example:
import numpy as np
import pandas as pd
s = pd.Series(['A', np.nan, 'A', 'bb', 'bb', 'A', 'A', 'bb', 'A', 'bb', 'bb'])
0 A
1 NaN
2 A
3 bb
4 bb
5 A
6 A
7 bb
8 A
9 bb
10 bb
dtype: object
Solution:
Within each run of consecutive equal values, keep only the last one. (NaNs count as part of the run that follows them, because they are back-filled first.)
s = s.bfill()
to_delete = s == s.shift(-1)
s[to_delete] = np.nan
Result:
0 NaN
1 NaN
2 A
3 NaN
4 bb
5 NaN
6 A
7 bb
8 A
9 NaN
10 bb
dtype: object
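Applied to the DataFrame from the question, the same idea might look like the sketch below. It assumes the column holds the strings "min" and "max" with np.nan in between, and the names filled and not_last are only illustrative. Unlike the Series example, it leaves min_or_max untouched and writes the result to a new col2.

```python
import numpy as np
import pandas as pd

# Sample data as described in the question (values assumed to be strings)
df = pd.DataFrame({
    "min_or_max": [np.nan, np.nan, "min", np.nan, np.nan, "max",
                   np.nan, np.nan, np.nan, "max", np.nan, np.nan, "min"]
})

# Back-fill so every NaN takes the label of the next non-NaN row
filled = df["min_or_max"].bfill()

# True where the next row carries the same label, i.e. this row
# is not the last one of its run
not_last = filled.eq(filled.shift(-1))

# Copy the original labels, blanking every row that is not the last of its run
df["col2"] = df["min_or_max"].mask(not_last)

print(df)
```

With this data, col2 ends up non-null only at index 2 (min), 9 (max) and 12 (min); the max at index 5 is dropped because another max follows it with only NaN rows in between.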