I have the following list of lists of tuples: data = [[('abc', 'type1'), ('def', 'type2'), ('ghi', 'type3')], [], [('jkl', 'type4')], [('mno', 'type1'), ('pqr', 'type3')]]
Is it possible to put this into a DataFrame, using the second item of each tuple as the column header? Desired output:
  type1 type2 type3 type4
0   abc   def   ghi   NaN
1   NaN   NaN   NaN   NaN
2   NaN   NaN   NaN   jkl
3   mno   NaN   pqr   NaN
CodePudding user response:
This should work:
import numpy as np
import pandas as pd

columns = ['type1', 'type2', 'type3', 'type4']
df = pd.DataFrame(data=None, columns=columns)
data = [[('abc', 'type1'), ('def', 'type2'), ('ghi', 'type3')], [], [('jkl', 'type4')], [('mno', 'type1'), ('pqr', 'type3')]]
for n in data:
    # Start each row as all-NaN, then fill in the values that are present.
    datain = [np.nan, np.nan, np.nan, np.nan]
    for a, b in n:
        if b == 'type1':
            datain[0] = a
        elif b == 'type2':
            datain[1] = a
        elif b == 'type3':
            datain[2] = a
        elif b == 'type4':
            datain[3] = a
    df2 = pd.DataFrame(data=[datain], columns=columns)
    # ignore_index=True numbers the rows 0..3 instead of repeating index 0.
    df = pd.concat([df, df2], ignore_index=True)
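A shorter alternative, offered as a sketch rather than the only way to do it: since pandas accepts a list of dicts and fills missing keys with NaN, each inner list can be turned into a {column: value} dict first. The name rows is just an illustrative choice, and the sort_index call assumes you want the columns in alphabetical order.

```python
import pandas as pd

data = [[('abc', 'type1'), ('def', 'type2'), ('ghi', 'type3')], [], [('jkl', 'type4')], [('mno', 'type1'), ('pqr', 'type3')]]

# Turn each inner list into a {column: value} dict; an empty list becomes {}.
rows = [{col: val for val, col in row} for row in data]

# A list of dicts gives one row per dict; keys missing from a dict become NaN.
# sort_index(axis=1) orders the columns alphabetically (type1 .. type4).
df = pd.DataFrame(rows).sort_index(axis=1)
print(df)
```

This avoids the if/elif chain and the repeated concat, so it should also keep working if more type names show up in the data.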