How to filter a 2D array on a condition involving the next element of the array in Python

Time:12-24

How do I filter a 2D array so that, when two clicks come one after another and are followed by a tocart, the first of those clicks is dropped? Example:

import pandas as pd

df = pd.DataFrame({
  'a': ['Jason', 'Jason', 'Boby', 'Boby', 'Boby','Boby','Boby','Cob'],
  'b': [1, 2, 5, 5, 4,2,1, 6],
  'c': ['x', 'y', 'z', 'x', 'y','d', 'd','z'],
  'd': ['click', 'click', 'tocart', 'click', 'tocart','click','click', 'tocart']
})


df  = df.groupby(["a"]).apply(lambda x: x.sort_values(["b"], ascending = True)).reset_index(drop=True)


df['combine'] = df[['b','c','d']].values.tolist()

df = df[['a','combine']].groupby('a').agg(pd.Series.tolist).reset_index()
df

This gives, for example, for Boby and Cob:

a combine
Boby [[1, d, click],[2, d, click], [4, y, tocart], [5, x, click],[5, z, tocart]]
Cob [[6, z, tocart]]

I want to drop the first click from Boby's array because it is followed by one more click and then a tocart. Cob should not be in the resulting df, as there is no "click" in his array, and neither should Jason, as there is no "tocart" in his.

The outcome I expect:

a combine
Boby [[2, d, click], [4, y, tocart], [5, x, click],[5, z, tocart]]

CodePudding user response:

Would something like this work? It does more or less what you describe:

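import numpy as np

def slicing(y):
    # Keep only rows whose 'd' differs from the previous row's 'd',
    # i.e. the first row of each run of identical events.
    x = y[y['d'].shift() != y['d']].to_numpy()
    # Keep the group only if both 'click' and 'tocart' are still present.
    if np.isin(['click', 'tocart'], x[:,-1]).all():
        return x
    else:
        return np.nan
out = df.sort_values(by='b').groupby('a').apply(slicing).dropna()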

Output:

a
Boby    [[5, z, click], [5, x, tocart]]
dtype: object
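
Note that comparing with `shift()` as above keeps the first row of each run of identical events, whereas the question asks to drop a click that is directly followed by another click. A minimal sketch of that variant is below. It starts from the original, ungrouped df and compares each row with the next one via `shift(-1)`. Breaking ties in 'b' by 'c' is an assumption made here only to get a deterministic row order, and the names `s`, `next_d`, `kept` and `result` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['Jason', 'Jason', 'Boby', 'Boby', 'Boby', 'Boby', 'Boby', 'Cob'],
    'b': [1, 2, 5, 5, 4, 2, 1, 6],
    'c': ['x', 'y', 'z', 'x', 'y', 'd', 'd', 'z'],
    'd': ['click', 'click', 'tocart', 'click', 'tocart', 'click', 'click', 'tocart'],
})

# Assumption: ties in 'b' are broken by 'c' so the order is deterministic.
s = df.sort_values(['a', 'b', 'c'])

# Next event of the same user; NaN on the last row of each user.
next_d = s.groupby('a')['d'].shift(-1)

# Drop a click only when the very next event of that user is also a click.
kept = s[~((s['d'] == 'click') & (next_d == 'click'))]

# Keep only users whose remaining rows contain both 'click' and 'tocart'.
kept = kept.groupby('a').filter(lambda g: {'click', 'tocart'} <= set(g['d']))

# Rebuild the nested [b, c, d] lists per user, as in the question.
result = (
    kept.assign(combine=kept[['b', 'c', 'd']].values.tolist())
        .groupby('a')['combine']
        .agg(list)
        .reset_index()
)
print(result)
```

With the sample data this should leave a single row for Boby, starting at `[2, 'd', 'click']`. Jason is dropped because he has no tocart, and Cob because he has no click.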