I have a DataFrame with a column whose values are lists, as below:
   sessionId                                split
0     117200  [8=FIX.4.4, 9=401, 35=F, 34=342375]
1     117200  [8=FIX.4.4, 9=454, 35=G, 34=342374]
2     117200  [8=FIX.4.4, 9=430, 35=G, 34=342373]
3     173335  [8=FIX.4.4, 9=444, 35=G, 34=272236]
4     133911   [8=FIX.4.4, 9=359, 35=G, 34=25355]
For each row, I'd like to retrieve the index, within the list, of the element that contains the substring '35='. The expected result would look like this:
   sessionId                                split  idx
0     117200  [8=FIX.4.4, 9=401, 35=F, 34=342375]    2
1     117200  [8=FIX.4.4, 9=454, 35=G, 34=342374]    2
2     117200  [8=FIX.4.4, 9=430, 35=G, 34=342373]    2
3     173335  [8=FIX.4.4, 9=444, 35=G, 34=272236]    2
4     133911   [8=FIX.4.4, 9=359, 35=G, 34=25355]    2
CodePudding user response:
Assuming each list contains strings, a list comprehension is likely the most efficient approach:
# position of the first element containing '35=', or None if no element matches
df['idx'] = [next((i for i, s in enumerate(l) if '35=' in s), None)
             for l in df['split']]
Output:
   sessionId                                split  idx
0     117200  [8=FIX.4.4, 9=401, 35=F, 34=342375]    2
1     117200  [8=FIX.4.4, 9=454, 35=G, 34=342374]    2
2     117200  [8=FIX.4.4, 9=430, 35=G, 34=342373]    2
3     173335  [8=FIX.4.4, 9=444, 35=G, 34=272236]    2
4     133911   [8=FIX.4.4, 9=359, 35=G, 34=25355]    2
Input used:
import pandas as pd

df = pd.DataFrame({'sessionId': [117200, 117200, 117200, 173335, 133911],
                   'split': [['8=FIX.4.4', '9=401', '35=F', '34=342375'],
                             ['8=FIX.4.4', '9=454', '35=G', '34=342374'],
                             ['8=FIX.4.4', '9=430', '35=G', '34=342373'],
                             ['8=FIX.4.4', '9=444', '35=G', '34=272236'],
                             ['8=FIX.4.4', '9=359', '35=G', '34=25355']]})
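One caveat with the substring test: '35=' also occurs inside a field such as '135=X', and a list with no matching field yields None. The sketch below is one possible variant, not part of the original answer. It matches on the field prefix instead and stores the result in a nullable integer column. The helper name first_tag_index and the three sample rows are made up for illustration.

```python
import pandas as pd

def first_tag_index(fields, tag="35"):
    """Return the position of the first field starting with 'tag=', or None."""
    prefix = f"{tag}="
    return next((i for i, field in enumerate(fields) if field.startswith(prefix)), None)

# Illustrative rows (not from the question) that exercise the edge cases.
df = pd.DataFrame({
    "sessionId": [117200, 173335, 133911],
    "split": [
        ["8=FIX.4.4", "9=401", "35=F", "34=342375"],
        ["8=FIX.4.4", "135=X", "9=444", "35=G"],  # '135=X' contains '35=' but is another tag
        ["8=FIX.4.4", "9=359", "34=25355"],       # no tag 35 at all
    ],
})

# Substring test: the second row matches '135=X' first.
substring_idx = [next((i for i, s in enumerate(l) if "35=" in s), None)
                 for l in df["split"]]

# Prefix test: only fields that start with '35=' count.
prefix_idx = [first_tag_index(l) for l in df["split"]]

# A plain column would turn None into NaN and become float;
# the nullable Int64 dtype keeps integers and shows <NA> for missing values.
df["idx"] = pd.array(prefix_idx, dtype="Int64")
print(df)
```

If every list is known to contain exactly one well-formed '35=' field, as in the question's data, the substring and prefix tests give the same result.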