How do I create a list of items (attributes) based on the dummy variable value?

Let's say I have a dataframe in Python with a range of animals and a range of attributes, with dummy variables indicating whether each animal has each attribute. I'd like to create lists from it both vertically and horizontally, based on the dummy variable value. For example, I'd like to:

a) create a list of animals that have hair
b) create a list of all the attributes that a dog has.

Could anyone please assist with how I would do this in Python? Thanks very much!

Name	Hair	Eyes
Dog	1	1
Fish	0	1

CodePudding user response:

You could use a dictionary to store the values for each animal, keyed by animal name. The first item in each list holds the 0 or 1 that denotes whether the animal has hair, and the second does the same for eyes.

animals = { "Dog": [ 1, 1 ], "Fish": [ 0, 1 ] }
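As a rough sketch of how both lists could be pulled out of that dictionary: the `attributes` list below is an assumption (it is not in the answer above) and simply records which position in each value list belongs to which attribute.

```python
# Assumed helper list: position 0 is Hair, position 1 is Eyes
attributes = ["Hair", "Eyes"]
animals = {"Dog": [1, 1], "Fish": [0, 1]}

# (a) animals whose Hair flag is 1
hair_pos = attributes.index("Hair")
animals_with_hair = [name for name, flags in animals.items() if flags[hair_pos] == 1]

# (b) attributes whose flag is 1 for Dog
dog_attributes = [attr for attr, flag in zip(attributes, animals["Dog"]) if flag == 1]

print(animals_with_hair)  # ['Dog']
print(dog_attributes)     # ['Hair', 'Eyes']
```

This keeps working if more attributes are added, as long as `attributes` and the value lists stay in the same order.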

CodePudding user response:

(a)

Filter the rows with a boolean mask and take the 'Name' column. Either form works:

df[ df['Hair'] == 1 ]['Name'].to_list()

df.loc[ df['Hair'] == 1, 'Name'].to_list()

(b)

You may need to transpose the dataframe (to convert rows into columns) and set the column names.

After that, you can use similar code:

df[ df['Dog'] == 1 ].index.to_list()
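If you would rather not transpose, one possible alternative (a sketch, not part of the answer above) is to make 'Name' the index and filter a single row. It assumes animal names are unique; the variable names `by_name` and `dog_row` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Dog", "Fish"], "Hair": [1, 0], "Eyes": [1, 1]})

# Use the animal names as row labels so a row can be selected by name
by_name = df.set_index("Name")

# The Dog row as a Series whose index holds the attribute names
dog_row = by_name.loc["Dog"]

# Keep only the attributes flagged 1 and return their labels
dog_attributes = dog_row[dog_row == 1].index.to_list()
print(dog_attributes)  # ['Hair', 'Eyes']
```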

Minimal working code

text = '''Name,Hair,Eyes
Dog,1,1
Fish,0,1'''

import pandas as pd
import io

df = pd.read_csv(io.StringIO(text))
print(df)
print('---')

print('Hair 1:', df[ df['Hair'] == 1 ]['Name'].to_list())
print('hair 2:', df.loc[ df['Hair'] == 1, 'Name'].to_list())

print('---')
# transpose: rows become columns
# new_df = df.transpose()  # equivalent long form
new_df = df.T              # shorter alias, an attribute so no `()`

# use the 'Name' row as the column names, then drop that row
new_df.columns = new_df.loc['Name']
new_df = new_df[1:]

print(new_df)
print('---')

print('Dog :', new_df[ new_df['Dog'] == 1 ].index.to_list())
print('Fish:', new_df[ new_df['Fish'] == 1 ].index.to_list())

Result:

   Name  Hair  Eyes
0   Dog     1     1
1  Fish     0     1
---
Hair 1: ['Dog']
hair 2: ['Dog']
---
Name Dog Fish
Hair   1    0
Eyes   1    1
---
Dog : ['Hair', 'Eyes']
Fish: ['Eyes']
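If you need the lists for every attribute and every animal rather than one at a time, the same filtering can be wrapped in dictionary comprehensions. This is only a sketch under the assumption that names are unique; `flags`, `animals_by_attribute` and `attributes_by_animal` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Dog", "Fish"], "Hair": [1, 0], "Eyes": [1, 1]})
flags = df.set_index("Name")

# attribute -> list of animals that have it (vertical direction)
animals_by_attribute = {
    col: flags[flags[col] == 1].index.to_list() for col in flags.columns
}

# animal -> list of attributes it has (horizontal direction)
attributes_by_animal = {
    name: row[row == 1].index.to_list() for name, row in flags.iterrows()
}

print(animals_by_attribute)  # {'Hair': ['Dog'], 'Eyes': ['Dog', 'Fish']}
print(attributes_by_animal)  # {'Dog': ['Hair', 'Eyes'], 'Fish': ['Eyes']}
```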