I have a data-frame given below like this:
ID1 ID2 Yr-Month Class
1 p1 Feb-19 PE5->SC
1 p2 Feb-19 SC
1 p3 Feb-19 EA->PE5
1 p4 Feb-19 EA->PE5->SC
1 p5 Feb-19 PC->PE5->SC
I need to create a new column called Final in existing data where if transition is happening from PE5 to SC i.e. PE5->SC in following ways PE5->SC,EA->PE5->SC,PC->PE5->SC Final column value needs to be one else 0 given below is the expected output:
ID1 ID2 Yr-Month Class Final
1 p1 Feb-19 PE5->SC 1
1 p2 Feb-19 SC 0
1 p3 Feb-19 EA->PE5 0
1 p4 Feb-19 EA->PE5->SC 1
1 p5 Feb-19 PC->PE5->SC 1
please help
CodePudding user response:
Test substrings by Series.str.contains
and for convert True/False
to 1/0
convert values to integers:
df['Final'] = df['Class'].str.contains('PE5.*SC').astype(int)
Alternative with numpy.where
:
df['Final'] = np.where(df['Class'].str.contains('PE5.*SC'), 1, 0)
print (df)
ID1 ID2 Yr-Month Class Final
0 1 p1 Feb-19 PE5->SC 1
1 1 p2 Feb-19 SC 0
2 1 p3 Feb-19 EA->PE5 0
3 1 p4 Feb-19 EA->PE5->SC 1
4 1 p5 Feb-19 PC->PE5->SC 1
CodePudding user response:
You can create a 'Final' column of 0s, then use .loc to find the rows where Class contains 'PE5->' using df['Class'].str.contains('PE5->')
, and then set the corresponding values in 'Final' to 1
df['Final'] = 0
df.loc[df['Class'].str.contains('PE5->'), 'Final'] = 1
Output:
ID1 ID2 Yr-Month Class Final
1 p1 Feb-19 PE5->SC 1
1 p2 Feb-19 SC 0
1 p3 Feb-19 EA->PE5 0
1 p4 Feb-19 EA->PE5->SC 1
1 p5 Feb-19 PC->PE5->SC 1
EDIT: Having seen OP's comment on Jezrael's answer, it appears there was a missing case from the original question. The code to cover the updated case would be:
df['Final'] = 0
df.loc[df['Class'].str.contains('PE5->.*SC', regex=True), 'Final'] = 1
Output:
ID1 ID2 Yr-Month Class Final
1 p1 Feb-19 PE5->SC 1
1 p2 Feb-19 SC 0
1 p3 Feb-19 EA->PE5 0
1 p4 Feb-19 EA->PE5->SC 1
1 p5 Feb-19 PE5->PC->SC 1
CodePudding user response:
Another interesting way would be to use replace
df['Final'] = df['Class'].replace(['PE5.*SC', '.'], [1, 0], regex=True)
Output:
ID1 | ID2 | Yr-Month | Class | Final |
---|---|---|---|---|
1 | p1 | Feb-19 | PE5->SC | 1 |
1 | p2 | Feb-19 | SC | 0 |
1 | p3 | Feb-19 | EA->PE5 | 0 |
1 | p4 | Feb-19 | EA->PE5->SC | 1 |
1 | p5 | Feb-19 | PC->PE5->SC | 1 |