I have a sample DataFrame as follows:
import pandas as pd
import datetime
start = datetime.datetime.now()
print('Starting time,', str(start))
dict1 = {'id': ['person1', 'person2', 'person3'],
         'person1': ['A', '', ''],
         'person2': ['B', 'E', ''],
         'person3': ['C', 'F', 'G']}
demo = pd.DataFrame(dict1)
demo
demo.set_index(["id"], inplace=True)
demo
Out:
        person1 person2 person3
id
person1       A       B       C
person2               E       F
person3                       G
My target format is as follows:
target_1 target_2 target
0 person1 person1 A
1 person1 person2 B
2 person1 person3 C
3 person2 person1 E
4 person2 person2 F
5 person3 person3 G
I want to convert the DataFrame into triples of (row label, column label, value). I could loop over every element, but I suspect that would be inefficient, especially for large datasets. I am new to Python, so I am looking for a more efficient way.
Any help is appreciated.
CodePudding user response:
Don't use a loop. Let's use stack to reshape the DataFrame:
demo[demo != ''].stack().rename_axis(['target_1', 'target_2']).reset_index(name='target')
Result
target_1 target_2 target
0 person1 person1 A
1 person1 person2 B
2 person1 person3 C
3 person2 person2 E
4 person2 person3 F
5 person3 person3 G
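For reference, here is a self-contained sketch of the same idea that can be run on its own. It assumes the blanks are empty strings, and the name `triples` is only illustrative. Whether stack drops missing values by default has varied between pandas versions, so this sketch calls dropna() explicitly rather than relying on that default.

```python
import pandas as pd

# Rebuild the sample frame from the question, indexed by 'id'
demo = pd.DataFrame(
    {
        'id': ['person1', 'person2', 'person3'],
        'person1': ['A', '', ''],
        'person2': ['B', 'E', ''],
        'person3': ['C', 'F', 'G'],
    }
).set_index('id')

triples = (
    demo.where(demo != '')                      # empty strings become NaN
        .stack()                                # long format: (row label, column label) -> value
        .dropna()                               # drop blanks explicitly, whatever stack's default is
        .rename_axis(['target_1', 'target_2'])  # name the two index levels
        .reset_index(name='target')             # back to a flat DataFrame
)
print(triples)
```

If the blanks in the real data are already NaN rather than empty strings, the where() step should not be needed.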
CodePudding user response:
# Visit every cell by row label (i) and column label (j)
for i in demo.index:
    for j in demo.columns:
        print(demo.loc[i, j])
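If you want the loop to produce the triples rather than just print each cell, a rough sketch could look like the one below. The names `rows` and `result` are only illustrative, and it assumes blanks are empty strings. On a large DataFrame this cell-by-cell access will likely be much slower than a vectorized reshape.

```python
import pandas as pd

# Rebuild the sample frame from the question, indexed by 'id'
demo = pd.DataFrame(
    {
        'id': ['person1', 'person2', 'person3'],
        'person1': ['A', '', ''],
        'person2': ['B', 'E', ''],
        'person3': ['C', 'F', 'G'],
    }
).set_index('id')

rows = []
for i in demo.index:
    for j in demo.columns:
        value = demo.loc[i, j]
        if value != '':              # skip blank cells
            rows.append((i, j, value))

# Assemble the collected triples into the target layout
result = pd.DataFrame(rows, columns=['target_1', 'target_2', 'target'])
print(result)
```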