This is my dataset:
| Name | Dept | Project area/areas interested |
| -------- | -------- |-----------------------------------|
| Joe | Biotech | Cell culture//Bioinfo//Immunology |
| Ann | Biotech | Cell culture |
| Ben | Math | Trigonometry//Algebra |
| Keren | Biotech | Microbio |
| Alice | Physics | Optics |
This is how I want my result:
| Name | Dept |Cell culture|Bioinfo|Immunology|Trigonometry|Algebra|Microbio|Optics|
| -------- | -------- |------------|-------|----------|------------|-------|--------|------|
| Joe | Biotech | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| Ann | Biotech | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Ben | Math | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| Keren | Biotech | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Alice | Physics | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
Not only do I have to split the last column into separate columns based on its values, I also have to split the entries that are separated by "//". The values in the resulting dataframe have to be 1 or 0 (int). I've been stuck on this for a while now (-_-;)
CodePudding user response:
You can use pandas.concat in combination with Series.str.get_dummies like this:
pd.concat([df[["Name", "Dept"]], df["Project area/areas interested"].str.get_dummies(sep='//')], axis=1)
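For reference, here is a minimal end-to-end sketch of the same approach on the sample data from the question. The variable names (`col`, `dummies`, `order`, `result`) are made up for illustration. As far as I know, `str.get_dummies` returns its columns in sorted order, so the reordering step is an optional extra (not part of the one-liner above) that restores the order of first appearance shown in the desired output.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["Joe", "Ann", "Ben", "Keren", "Alice"],
    "Dept": ["Biotech", "Biotech", "Math", "Biotech", "Physics"],
    "Project area/areas interested": [
        "Cell culture//Bioinfo//Immunology",
        "Cell culture",
        "Trigonometry//Algebra",
        "Microbio",
        "Optics",
    ],
})

col = "Project area/areas interested"

# One 0/1 integer column per distinct area, splitting each cell on "//"
dummies = df[col].str.get_dummies(sep="//")

# Optional: reorder the indicator columns by first appearance in the data
order = df[col].str.split("//").explode().drop_duplicates().tolist()
dummies = dummies[order]

# Put the identifying columns back in front of the indicator columns
result = pd.concat([df[["Name", "Dept"]], dummies], axis=1)
print(result)
```

If the column order does not matter, the two lines under the "Optional" comment can be dropped and the result is the same as the one-liner above.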