Is there a way to combine four columns into one column based on several conditions in Python?

Time:06-15

The problem is that I cannot create a single column from four columns based on my conditions. The programming language is Python.

# for demonstration
import pandas as pd

example = {
    "color ID": [1, 2, 3, 4, 5],
    "blue colored": ["blue", "not blue", "blue", "not blue", "not blue"],
    "red colored": ["not red", "red", "not red", "not red", "not red"],
    "green colored": ["not green", "not green", "not green", "green", "green"],
}

# load into df:
example = pd.DataFrame(example)

print(example)

What I want to do is create a new column (color) that looks like the example below and shows which color each ID has.

# for demonstration
import pandas as pd

expected_result = {
    "color ID": [1, 2, 3, 4, 5],
    "blue colored": ["blue", "not blue", "blue", "not blue", "not blue"],
    "red colored": ["not red", "red", "not red", "not red", "not red"],
    "green colored": ["not green", "not green", "not green", "green", "green"],
    "color": ["blue", "red", "blue", "green", "green"],
}

# load into df:
expected_result = pd.DataFrame(expected_result)
print(expected_result)

I wonder if there is a way to do this. Thank you very much!

CodePudding user response:

import pandas as pd

example = {
    "color ID": [1, 2, 3, 4, 5],
    "blue colored": ["blue", "not blue", "blue", "not blue", "not blue"],
    "red colored": ["not red", "red", "not red", "not red", "not red"],
    "green colored": ["not green", "not green", "not green", "green", "green"],
}

# load into df:
example = pd.DataFrame(example)

# set 'color' on the rows where each condition holds;
# rows that match no condition are left as NaN
example.loc[example['blue colored'] == "blue", 'color'] = 'blue'
example.loc[example['red colored'] == "red", 'color'] = 'red'
example.loc[example['green colored'] == "green", 'color'] = 'green'
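
As a possible alternative to the three separate assignments, the same conditions can be passed to numpy.select in one call. This is only a sketch, not part of the answer above: the names conditions and choices and the "unknown" default are assumptions made for illustration. One difference worth noting is that numpy.select keeps the first matching condition for a row, whereas with sequential .loc assignments the last match wins.

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({
    "color ID": [1, 2, 3, 4, 5],
    "blue colored": ["blue", "not blue", "blue", "not blue", "not blue"],
    "red colored": ["not red", "red", "not red", "not red", "not red"],
    "green colored": ["not green", "not green", "not green", "green", "green"],
})

# one boolean mask per color column (illustrative names)
conditions = [
    example["blue colored"] == "blue",
    example["red colored"] == "red",
    example["green colored"] == "green",
]
# value to use where the mask at the same position is True
choices = ["blue", "red", "green"]

# rows matching no condition get the assumed placeholder "unknown"
example["color"] = np.select(conditions, choices, default="unknown")
print(example)
```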

CodePudding user response:

You can use a stack and filter:

example['color'] = (example
 .filter(like='colored').stack()
 .loc[lambda x: ~x.str.startswith('not ')]
 .droplevel(1) # if a row can match several colors, use .groupby(level=0).first() instead
)

output:

   color ID blue colored red colored green colored  color
0         1         blue     not red     not green   blue
1         2     not blue         red     not green    red
2         3         blue     not red     not green   blue
3         4     not blue     not red         green  green
4         5     not blue     not red         green  green
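
To make the chained expression easier to follow, here is a sketch that runs the same steps one at a time. The intermediate names (colored, stacked, matches) are made up for illustration, and it uses the groupby(level=0).first() variant mentioned in the comment so that it also copes with a row matching more than one color.

```python
import pandas as pd

example = pd.DataFrame({
    "color ID": [1, 2, 3, 4, 5],
    "blue colored": ["blue", "not blue", "blue", "not blue", "not blue"],
    "red colored": ["not red", "red", "not red", "not red", "not red"],
    "green colored": ["not green", "not green", "not green", "green", "green"],
})

# 1. keep only the columns whose name contains "colored"
colored = example.filter(like="colored")

# 2. reshape to one long Series indexed by (row label, column name)
stacked = colored.stack()

# 3. drop every value that starts with "not "
matches = stacked[~stacked.str.startswith("not ")]

# 4. collapse back to one value per original row (first match wins)
example["color"] = matches.groupby(level=0).first()
print(example)
```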

CodePudding user response:

colors = ['red', 'green', 'blue']
for color in colors:
    # build the column name, e.g. 'red colored', and mark the matching rows
    example.loc[example[color + ' colored'] == color, 'color'] = color
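
If the list of colors should not be typed by hand, it could be derived from the column names. The sketch below assumes every relevant column is named "<color> colored" and holds either "<color>" or "not <color>", as in the question. The suffix variable is a name introduced here for illustration.

```python
import pandas as pd

example = pd.DataFrame({
    "color ID": [1, 2, 3, 4, 5],
    "blue colored": ["blue", "not blue", "blue", "not blue", "not blue"],
    "red colored": ["not red", "red", "not red", "not red", "not red"],
    "green colored": ["not green", "not green", "not green", "green", "green"],
})

suffix = " colored"
# strip the suffix from each matching column name to get the color names
colors = [col[: -len(suffix)] for col in example.columns if col.endswith(suffix)]

for color in colors:
    # mark the rows where the '<color> colored' column holds that color
    example.loc[example[color + suffix] == color, "color"] = color

print(example)
```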