I have the following series
s = pd.Series(['01','02','03','01','02','03','01','02','03'])
And the following list
l = ['A','B','C']
Every time the series reaches the string value '01', I want to advance one position in my list and fill the result with that list value: 'A' for the first run, 'B' for the second, and so on. The expected result is:
s = pd.Series(['A','A','A','B','B','B','C','C','C'])
I know I should use apply, but I am having trouble formulating the condition:
def increment(s):
    l = ['A','B','C']
    count = 0
    if s == '01':
        count = 1
    # Something goes awry here
    return l[count]

s.apply(increment)
CodePudding user response:
Here are two ways to achieve this:
Option 1
- Convert the pd.Series to float.
- Get diff and apply ne(1) to determine the breakpoints between the consecutive sequences.
- Apply cumsum, but starting at 0 (with sub(1)). At this stage, we will have a pd.Series with values [0, 0, 0, 1, 1, 1, 2, 2, 2].
- Finally, convert lst to a dict, with the indices as keys, so that we can apply map.
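The steps above can be traced one at a time. This is a minimal sketch; the intermediate names (as_float, breaks, group_ids, labels) are illustrative and not part of the original answer.

```python
import pandas as pd

s = pd.Series(['01', '02', '03', '01', '02', '03', '01', '02', '03'])
lst = ['A', 'B', 'C']

# Step 1: strings to numbers so that differences can be computed.
as_float = s.astype(float)

# Step 2: True wherever a value does not follow its predecessor by exactly 1.
# The first element has no predecessor (diff is NaN), so it also counts as a break.
breaks = as_float.diff().ne(1)

# Step 3: running count of breaks, shifted so the first group is 0.
group_ids = breaks.cumsum().sub(1)

# Step 4: translate group numbers into labels (0 -> 'A', 1 -> 'B', 2 -> 'C').
labels = group_ids.map(dict(enumerate(lst)))

print(breaks.tolist())
print(group_ids.tolist())
print(labels.tolist())
```

Note that this relies on the values inside each run increasing by exactly 1; a run such as '01', '03' would be split in two.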
We can put this in a one-liner:
import pandas as pd
s = pd.Series(['01','02','03','01','02','03','01','02','03'])
lst = ['A','B','C']
result = (s.astype(float).diff().ne(1)).cumsum().sub(1)\
.map({k: v for k, v in enumerate(lst)})
print(result)
0 A
1 A
2 A
3 B
4 B
5 B
6 C
7 C
8 C
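If the only rule is "move to the next label whenever '01' appears", a variant that counts the '01' markers directly may be enough. This is a sketch of a different technique from the diff-based one above, and it assumes the series starts with '01' and that lst has at least as many entries as there are markers; group_ids and result_direct are illustrative names.

```python
import pandas as pd

s = pd.Series(['01', '02', '03', '01', '02', '03', '01', '02', '03'])
lst = ['A', 'B', 'C']

# Count how many '01' markers have been seen so far, shifted to start at 0.
group_ids = s.eq('01').cumsum().sub(1)

# Translate the running count into labels; counts without a matching
# entry in lst (for example -1 before the first marker) become NaN.
result_direct = group_ids.map(dict(enumerate(lst)))
print(result_direct.tolist())
```

Unlike the diff-based version, this does not require converting to float or the values within a run to be consecutive.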
Option 2
- Use groupby with groups to find all the indices ("labels") for the value 01.
- Next, turn lst into a pd.Series with the found indices as its index, and pass it to map, applied to a pd.Series built from s.index. This will get us "A" at 0, "B" at 3, "C" at 6.
- Finally, apply ffill.
indices_one = s.groupby(s).groups['01']
# Int64Index([0, 3, 6], dtype='int64')
result2 = pd.Series(s.index).map(pd.Series(lst, index=indices_one)).ffill()
result2.equals(result)
# True
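To see why the final ffill is needed, the intermediate results of Option 2 can be inspected separately. A minimal sketch, with illustrative names (starts, lookup, sparse, filled) that do not appear in the answer:

```python
import pandas as pd

s = pd.Series(['01', '02', '03', '01', '02', '03', '01', '02', '03'])
lst = ['A', 'B', 'C']

# Index labels of every '01' in s.
starts = s.groupby(s).groups['01']

# Lookup table keyed by those labels: 0 -> 'A', 3 -> 'B', 6 -> 'C'.
lookup = pd.Series(lst, index=starts)

# Mapping the full index through the lookup fills only the start positions;
# every other position has no match and becomes NaN.
sparse = pd.Series(s.index).map(lookup)

# Forward fill carries each label down to the next start position.
filled = sparse.ffill()

print(list(starts))
print(sparse.notna().tolist())
print(filled.tolist())
```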
CodePudding user response:
I think this is a simple approach using apply
s = pd.Series(['01','02','03','01','02','03','01','02','03'])
l = ['A','B','C']

def return_val(s):
    global l
    val = l[0]
    if s == '03':
        # end of a run: drop the label just used (this consumes the global list)
        l = l[1:]
    return val

s.apply(return_val)
output:
0 A
1 A
2 A
3 B
4 B
5 B
6 C
7 C
8 C
dtype: object
If you want to avoid using globals you might use reduce and partial from functools, although this approach might not be that clear:
from functools import reduce, partial

def return_val(prev_val, s, where_to_add: list):
    l = ["", 'A', 'B', 'C']
    if s == '01':
        # move to the label that follows the previous one
        where = l.index(prev_val)
        prev_val = l[where + 1]
    where_to_add.append(prev_val)
    return prev_val

my_list = []
return_val_partial = partial(return_val, where_to_add=my_list)
reduce(return_val_partial, s.values, "")
result = pd.Series(my_list)
print(result)
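Another way to keep state without a global is a closure. The sketch below is not from the answer: make_labeler and its marker parameter are hypothetical names, and it assumes, like the apply version above, that apply visits the elements in order and that there are at least as many labels as markers.

```python
import pandas as pd


def make_labeler(labels, marker='01'):
    """Build a function that advances to the next label whenever it sees `marker`."""
    position = -1  # no label selected until the first marker is seen

    def labeler(value):
        nonlocal position
        if value == marker:
            position += 1
        # Before the first marker there is no label to return.
        return labels[position] if position >= 0 else None

    return labeler


s = pd.Series(['01', '02', '03', '01', '02', '03', '01', '02', '03'])
result_closure = s.apply(make_labeler(['A', 'B', 'C']))
print(result_closure.tolist())
```

Because each call to make_labeler starts with a fresh counter, the code can be re-run without resetting anything, which is not the case when a global list is consumed.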
CodePudding user response:
You can do:
import numpy as np
out = pd.Series(np.nan, index=s.index, dtype=object)  # object dtype so strings can be assigned
out.loc[s[s == '01'].index] = l
out = out.ffill()
print(out)
0 A
1 A
2 A
3 B
4 B
5 B
6 C
7 C
8 C
Explanation:
- Initialize a series of NaN with length equal to the original series.
- Locate the index positions in the original series s where the value is 01, and populate those locations with the values from your list l = ['A', 'B', 'C'].
- Then ffill to carry each value forward to the next 01.
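The three steps can be checked one by one. A minimal sketch, assuming the number of 01 entries equals the length of l; marker_positions and before_fill are illustrative names, and dtype=object is used here so that strings can be stored alongside NaN.

```python
import numpy as np
import pandas as pd

s = pd.Series(['01', '02', '03', '01', '02', '03', '01', '02', '03'])
l = ['A', 'B', 'C']

# Step 1: a series of NaN with the same index as s.
out = pd.Series(np.nan, index=s.index, dtype=object)

# Step 2: find where s equals '01' and write the list values there.
marker_positions = s[s == '01'].index
out.loc[marker_positions] = l
before_fill = out.copy()  # kept only to inspect the intermediate state

# Step 3: forward fill carries each value down to the next marker.
out = out.ffill()

print(list(marker_positions))
print(before_fill.isna().tolist())
print(out.tolist())
```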