I'm trying to create a column based on some specific "rules". I'd like to have a new column at the end "Result" with the following result based on the three first columns:
is_return | From | To | Result |
---|---|---|---|
True | Fir | Fem | FirFem |
False | Tre | Syv | TreSyv |
True | Syv | Tre | TreSyv_r |
False | Tre | Syv | TreSyv2 |
True | Syv | Tre | TreSyv_r2 |
False | Snø | Van | SnøVan |
Basically if theres a trip that is not a return then just combine from and to, and add a number starting from 2 to it if there are multiple (row 4). If it's tagged as a return trip then first check if it exists as a non-return trip, example given in row 2 and 3. But if it's tagged as a return without there existing a non-return variant then keep this format as is (row 1).
CodePudding user response:
IIUC, you need several steps (commented in the code):
import numpy as np
# compute the string for both directions
s1 = df['From'] df['To']
s2 = df['To'] df['From']
# compute the string is the correct order
# depending on the existence of the first trip
s = pd.Series(np.where(s2.isin(s1[~df['is_return']]), s2 '_r', s1),
index=df.index)
# add number to duplicates
count = s.groupby(s).cumcount().add(1)
df['Result'] = s np.where(count.gt(1), count.astype(str), '')
output:
is_return From To Result
0 True Fir Fem FirFem
1 False Tre Syv TreSyv
2 True Syv Tre TreSyv_r
3 False Tre Syv TreSyv2
4 True Syv Tre TreSyv_r2
5 False Snø Van SnøVan