Basically, I want to move the second row of my data frame to be the first elements of a new set of columns.
I have a data frame:

| Topics | co-authors |
| --- | --- |
| Object Detection; Deep Learning; IOU | Bandala, Argel A. |
| Character Recognition; Tesseract; Number | Vicerra, Ryan Rhay P. |
| Robot; End Effectors; Malus | Concepcion, Ronnie |
| Crops; Plant Diseases and Disorders; Beriberi | Sybingco, E. |
| Swarm Robotics; Swarm; Social Insects | Billones, Robert Kerwin C. |

and I want a new data frame with the following columns:

| Topic_1 | Topic_2 | Topic_3 | Topic_4 | Topic_5 | Coauthor_1 | Coauthor_2 | Coauthor_3 | Coauthor_4 | Coauthor_5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |

How do I do that?
Thank you in advance.
CodePudding user response:
The question is ambiguous, but assuming you want to one-hot encode the two columns:

    # Split each cell on '; ' and create one indicator column per distinct value
    out = (df['Topics'].str.get_dummies(sep='; ')
           .join(df['co-authors'].str.get_dummies(sep='; '))
          )

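Output:

| | Beriberi | Character Recognition | Crops | Deep Learning | End Effectors | IOU | Malus | Number | Object Detection | Plant Diseases and Disorders | Robot | Social Insects | Swarm | Swarm Robotics | Tesseract | Bandala, Argel A. | Billones, Robert Kerwin C. | Concepcion, Ronnie | Sybingco, E. | Vicerra, Ryan Rhay P. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 3 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
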
CodePudding user response:
Sorry for the ambiguity. What I have is a number of files, each containing 2 columns, topics and co-authors, and I want the 2 topics and 2 co-authors from each file to form one row in a new data frame. Let's say I pick up a file that looks like this:

| Topics | Co-authors |
| --- | --- |
| A | Mark |
| B | James |

and another file that looks like this:

| Topics | Co-authors |
| --- | --- |
| Aa | Zu |
| Bb | Ken |

and I want to create a new data frame that looks like this:

| Topics1 | Topics2 | Co-authors1 | Co-authors2 |
| --- | --- | --- | --- |
| A | B | Mark | James |
| Aa | Bb | Zu | Ken |
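
One possible way to get that layout, as a rough sketch rather than a definitive answer: flatten each file's frame into a single dict keyed by column name plus row position, then build the new data frame from the list of dicts. The helper name `file_to_row` and the two in-memory frames `file_a` and `file_b` are made up for illustration; they stand in for whatever you actually load from your files.

```python
import pandas as pd


def file_to_row(df: pd.DataFrame) -> dict:
    """Flatten one frame into a single row: Topics1..n, then Co-authors1..n.

    Hypothetical helper for illustration; not part of pandas.
    """
    row = {}
    for col in df.columns:
        # Number the values of each column starting at 1
        for i, value in enumerate(df[col], start=1):
            row[f"{col}{i}"] = value
    return row


# Stand-ins for the frames read from two files (assumed data from the example)
file_a = pd.DataFrame({"Topics": ["A", "B"], "Co-authors": ["Mark", "James"]})
file_b = pd.DataFrame({"Topics": ["Aa", "Bb"], "Co-authors": ["Zu", "Ken"]})

# One dict per file becomes one row in the result
result = pd.DataFrame([file_to_row(f) for f in (file_a, file_b)])
print(result)
```

With real files you would presumably replace `file_a` and `file_b` with frames loaded via something like `pd.read_csv`. If the files have different numbers of rows, the shorter ones should end up with `NaN` in the columns they cannot fill.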