How to convert the cells of a DataFrame row to a dictionary using a loop in Python (pandas)?

Let's say I have the following df:

       0       0       1                1     2        2      3         3       4     4     5        5
0  Fondo  Oceano  Cuerpo  Cuerpo cangrejo  Ojos  Antenas  Color  Amarillo  Pinzas  None  Puas     None
1  Fondo  Oceano  Cuerpo  Cuerpo cangrejo  Ojos  Antenas  Color  Amarillo  Pinzas  None  Puas    Arena
2  Fondo  Oceano  Cuerpo  Cuerpo cangrejo  Ojos  Antenas  Color  Amarillo  Pinzas  None  Puas   Marron
3  Fondo  Oceano  Cuerpo  Cuerpo cangrejo  Ojos  Antenas  Color  Amarillo  Pinzas  None  Puas  Purpura
4  Fondo  Oceano  Cuerpo  Cuerpo cangrejo  Ojos  Antenas  Color  Amarillo  Pinzas  None  Puas    Verde

I know I can use Series.items (called Series.iteritems before pandas 2.0, where it was removed) to iterate over a particular row of this df and print the content of each cell in that row (ignoring the index column):

row = 0  # desired row
# Series.items yields (label, value) pairs; the label is not needed here
for _, e in df.iloc[row].items():
    print(e)

Output:

Fondo
Oceano
Cuerpo
Cuerpo cangrejo
Ojos
Antenas
Color
Amarillo
Pinzas
None
Puas
None

But what I would like to learn now is how to improve the loop above so that it creates a dictionary with the even-positioned cells as keys and the odd-positioned cells as their values.

In other words, how could I get the following dictionary as output for row 0?

the_dic = { 'Fondo':'Oceano',
            'Cuerpo': 'Cuerpo cangrejo',
            'Ojos': 'Antenas',
            'Color': 'Amarillo',
            'Pinzas': 'None',
            'Puas': 'None'
          }

PS: The 'None' element in this case is a str value, not the None object.

Answer:

EDIT: This solution works when each column name appears exactly twice, as in the sample data:

print (df.columns)
Int64Index([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype='int64')

You can group the row's values by column label and use the first and second value of each group in a dict comprehension:

row = 0
d = {x.iat[0]: x.iat[1] for name, x in df.iloc[row].groupby(level=0)}
print (d)
{'Fondo': 'Oceano', 'Cuerpo': 'Cuerpo cangrejo', 'Ojos': 'Antenas', 'Color': 'Amarillo', 'Pinzas': 'None', 'Puas': 'None'}

Or select the first and the last occurrence of each duplicated label and combine them with zip and dict:

row = 0
s = df.iloc[row]

d = dict(zip(s[~s.index.duplicated()], s[~s.index.duplicated(keep='last')]))
print (d)
{'Fondo': 'Oceano', 'Cuerpo': 'Cuerpo cangrejo', 'Ojos': 'Antenas', 'Color': 'Amarillo', 'Pinzas': 'None', 'Puas': 'None'}
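
Since the question asks specifically for a loop, here is a minimal sketch that pairs the cells purely by position, so it does not depend on the duplicated column labels at all. The Series s stands in for df.iloc[row], and the names values and the_dic are only illustrative; it assumes the row holds an even number of cells laid out as key, value, key, value, and so on.

```python
import pandas as pd

# Stand-in for df.iloc[row]: one row with duplicated column labels
s = pd.Series(
    ['Fondo', 'Oceano', 'Cuerpo', 'Cuerpo cangrejo', 'Ojos', 'Antenas',
     'Color', 'Amarillo', 'Pinzas', 'None', 'Puas', 'None'],
    index=[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
)

values = s.tolist()  # plain Python list of the cell values, in order
the_dic = {}
# Step through the cells two at a time: even position = key, odd position = value
for i in range(0, len(values) - 1, 2):
    the_dic[values[i]] = values[i + 1]

print(the_dic)
```

If two keys in the same row were equal, the later pair would overwrite the earlier one, as with any dict.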

For testing:

s = pd.Series(['Fondo', 'Oceano', 'Cuerpo', 'Cuerpo cangrejo', 'Ojos', 
               'Antenas', 'Color', 'Amarillo', 'Pinzas', 'None', 'Puas', 'None'],
              index=[0,0,1,1,2,2,3,3,4,4,5,5])
print (s)
0              Fondo
0             Oceano
1             Cuerpo
1    Cuerpo cangrejo
2               Ojos
2            Antenas
3              Color
3           Amarillo
4             Pinzas
4               None
5               Puas
5               None
dtype: object

d = dict(zip(s[~s.index.duplicated()], s[~s.index.duplicated(keep='last')]))
print (d)
{'Fondo': 'Oceano', 'Cuerpo': 'Cuerpo cangrejo', 'Ojos': 'Antenas', 'Color': 'Amarillo', 'Pinzas': 'None', 'Puas': 'None'}
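
To get one dictionary per row rather than for a single row, the same key/value pairing can be applied to every row of the frame. This is only a sketch under the assumption that every row follows the key, value, key, value layout of the sample data; the helper row_to_dict and the variable names are hypothetical, and the frame is rebuilt here from the values shown in the question.

```python
import pandas as pd

# Rebuild the sample frame; only the last cell differs between rows
cols = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
base = ['Fondo', 'Oceano', 'Cuerpo', 'Cuerpo cangrejo', 'Ojos', 'Antenas',
        'Color', 'Amarillo', 'Pinzas', 'None', 'Puas']
rows = [base + [last] for last in ['None', 'Arena', 'Marron', 'Purpura', 'Verde']]
df = pd.DataFrame(rows, columns=cols)


def row_to_dict(values):
    """Pair even-positioned cells (keys) with odd-positioned cells (values)."""
    return dict(zip(values[::2], values[1::2]))


# Convert each row to a plain list first, so duplicated labels do not matter
dicts = [row_to_dict(r) for r in df.to_numpy().tolist()]

print(dicts[0])
print(dicts[1]['Puas'])  # Arena
```

Working on plain lists keeps the pairing positional, which is why it behaves the same whether or not the column labels are duplicated.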