Looking for an efficient way to convert a dataframe (or matrix) to a triplet

Time:12-05

I have a sample dataframe as follows,

import pandas as pd
import datetime

start = datetime.datetime.now()
print('Starting time,', str(start))
dict1 = {'id': ['person1', 'person2', 'person3'],
         'person1': ['A', '', ''],
         'person2': ['B', 'E', ''],
         'person3': ['C', 'F', 'G']}
demo = pd.DataFrame(dict1)
demo
demo.set_index(["id"], inplace=True)
demo


Out:
         
        person1 person2 person3 
person1    A       B       C
person2            E       F
person3                    G

My target format is as follows,

    target_1    target_2    target
0   person1     person1       A
1   person1     person2       B
2   person1     person3       C
3   person2     person2       E
4   person2     person3       F
5   person3     person3       G

I want to convert the dataframe into triplets like the ones above. I could loop over every element, but I suspect that would be inefficient, especially for large data sets. I am new to Python, so I am looking for a more efficient way.

Any help is appreciated.

CodePudding user response:

Don't use a loop. Mask the empty strings and use stack to reshape the dataframe:

demo[demo != ''].stack().rename_axis(['target_1', 'target_2']).reset_index(name='target')

Result

  target_1 target_2 target
0  person1  person1      A
1  person1  person2      B
2  person1  person3      C
3  person2  person2      E
4  person2  person3      F
5  person3  person3      G
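
For anyone who wants to run this end to end, here is a self-contained sketch of the same stack approach. It rebuilds the sample frame from the question and adds an explicit dropna() call, on the assumption that some pandas versions may keep NaN entries when stacking. The name triplets is only illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question, indexed by "id".
demo = pd.DataFrame(
    {
        "id": ["person1", "person2", "person3"],
        "person1": ["A", "", ""],
        "person2": ["B", "E", ""],
        "person3": ["C", "F", "G"],
    }
).set_index("id")

triplets = (
    demo.where(demo != "")   # keep non-empty cells, turn empty strings into NaN
    .stack()                 # wide -> long: (row label, column label) -> value
    .dropna()                # assumption: be explicit in case stack() keeps NaN
    .rename_axis(["target_1", "target_2"])
    .reset_index(name="target")
)
print(triplets)
```

The whole chain is vectorized, so it should scale much better than visiting cells one at a time.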

CodePudding user response:

for i in demo.index:
    for j in demo.columns:
        # one cell per iteration, empty strings included
        print(demo.loc[i, j])
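
The loop above only prints each cell, including the empty ones. If a loop is preferred anyway, a minimal sketch that collects the non-empty cells into the target format could look like this. The names rows and loop_triplets are illustrative, and this is likely to be noticeably slower than stack on large frames.

```python
import pandas as pd

# Rebuild the sample frame from the question, indexed by "id".
demo = pd.DataFrame(
    {
        "id": ["person1", "person2", "person3"],
        "person1": ["A", "", ""],
        "person2": ["B", "E", ""],
        "person3": ["C", "F", "G"],
    }
).set_index("id")

rows = []
for i in demo.index:
    for j in demo.columns:
        value = demo.at[i, j]      # scalar lookup by row and column label
        if value != "":            # skip the empty cells
            rows.append((i, j, value))

loop_triplets = pd.DataFrame(rows, columns=["target_1", "target_2", "target"])
print(loop_triplets)
```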