Home > OS >  Modifying pandas dataframe colums based on conditionals
Modifying pandas dataframe colums based on conditionals

Time:02-02

I am trying to modify the values in columns of a pandas DataFrame based on conditionals. This answer: https://stackoverflow.com/a/50779719/1112097 is close, but the conditionals used are too simple for my use case, which uses a dictionary of lists in the conditional

Consider a Dataframe of individuals and their location:

owners = pd.DataFrame([['John', 'North'], 
  ['Sara', 'South'], 
  ['Seth', 'East'], 
  ['June', 'West']],
  columns=['Who','Location'])
owners

output:

Who Location
0 John North
1 Sara South
2 Seth East
3 June West

The dictionary contains lists of locations where a type of pet can go:

pets = {
  'Cats': ['North', 'South'],
  'Dogs': ['East', 'North'],
  'Birds': ['South', 'East']}
pets

output: {'Cats': ['North', 'South'], 'Dogs': ['East', 'North'], 'Birds': ['South', 'East']}

I need to add a column in the owners DateFrame for each pet type that says yes or no based on the presence of the location in the dictionary lists

In this example, the final table should look like this:

Who Location Cats Dogs Birds
0 John North Yes Yes No
1 Sara South Yes No Yes
2 Seth East No Yes Yes
3 June West No No No

This fails

for pet in pets:
  owners[pet] = 'Yes' if owners['Location'] in pets[pet] else 'No'

With the following error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I understand that the error comes from the fact that owners['Location'] is a series not an individual value in a row, but I don't know the proper way to apply this kind of conditional across the rows of a DataFrame.

CodePudding user response:

Lets do isin

for k, v in pets.items():
    owners[k] = owners['Location'].isin(v)

    Who Location   Cats   Dogs  Birds
0  John    North   True   True  False
1  Sara    South   True  False   True
2  Seth     East  False   True   True
3  June     West  False  False  False

CodePudding user response:

You can also use .isin() and .map()

for pet in pets:
    owners[pet] = owners["Location"].isin(pets[pet]).map({True: "Yes", False: "No"})
print(owners)

    Who Location Cats Dogs Birds
0  John    North  Yes  Yes    No
1  Sara    South  Yes   No   Yes
2  Seth     East   No  Yes   Yes
3  June     West   No   No    No

CodePudding user response:

you need to iterate through both the key and value of pets, see below.

for k,v in pets.values():
    owners[k] = owners['Location'].apply(lambda x: 'Yes' if x in v else 'No')

will output:

    Who Location North East South
0  John    North    No  Yes    No
1  Sara    South   Yes   No    No
2  Seth     East    No   No   Yes
3  June     West    No   No    No

CodePudding user response:

You can use apply:

for pet, locs in pets.items():
  owners[pet] = owners['Location'].apply(lambda l: 'Yes' if l in locs else 'No') 

CodePudding user response:

Some of these other answers are probably faster, but here is a way by switching the dictionary keys and values.

d = {i:[] for i in set([j for i in list(pets.values()) for j in i])}

for k,v in pets.items():
    for i in v:
        d.get(i).append(k)
        
owners.join(owners['Location'].map(d).str.join('|').str.get_dummies().replace({1:'Yes',0:'No'}))

Output:

    Who Location Birds Cats Dogs
0  John    North    No  Yes  Yes
1  Sara    South   Yes  Yes   No
2  Seth     East   Yes   No  Yes
3  June     West    No   No   No
  • Related