I am trying to turn multiple columns into a single column, dropping the null values and keeping the identifier of each row.
Identifier | Column1 | Column2 | Column3 | Column4 |
---|---|---|---|---|
1 | Dog | Cow | Sheep | Dinosaur |
2 | Dog | Pig | | |
3 | Bull | Cow | Elephant | |
I want a new two-column dataframe like the one below. The original dataframe might have many columns: 20, 30, maybe more.
Identifier | Var |
---|---|
1 | Dog |
1 | Cow |
1 | Sheep |
1 | Dinosaur |
2 | Dog |
2 | Pig |
3 | Bull |
3 | Cow |
3 | Elephant |
CodePudding user response:
You can use melt() for this:
df.melt(id_vars = 'Identifier')[['Identifier', 'value']].dropna().sort_values('Identifier')
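For reference, here is a minimal runnable sketch of the same melt() approach on the sample data from the question. It assumes the blank cells are NaN, and the names `long_df` and `Var` are only illustrative. A stable sort (`kind="mergesort"`) is used so that values within each identifier stay in their original column order, which the default sort does not guarantee.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "Identifier": [1, 2, 3],
    "Column1": ["Dog", "Dog", "Bull"],
    "Column2": ["Cow", "Pig", "Cow"],
    "Column3": ["Sheep", np.nan, "Elephant"],
    "Column4": ["Dinosaur", np.nan, np.nan],
})

long_df = (
    # Every column other than Identifier is unpivoted, however many there are.
    df.melt(id_vars="Identifier", value_name="Var")
      [["Identifier", "Var"]]           # drop the helper 'variable' column
      .dropna(subset=["Var"])           # remove the empty cells
      .sort_values("Identifier", kind="mergesort")  # stable: keeps column order per row
      .reset_index(drop=True)
)

print(long_df)
```

If the blanks are empty strings rather than NaN, converting them first with `df.replace("", np.nan)` should let dropna() remove them in the same way.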