Creating a dataframe from different lists

Time:11-26

I am new to Python, so this question may be trivial.

I have a pair of lists containing solid names and their associated counts. Here is a sample:

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']

counts1= [3911, 1479, 553, 368, 342]

and a second pair of lists

volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece','wafer']

counts2= [291, 469, 73, 28, 32]

Notice that not all the volumes are present in each list, and their position can be different.

What I would like to obtain is a dataframe where the first column contains all the volumes in volumes1 and volumes2, the second column holds the corresponding values from counts1, and the third column holds the corresponding values from counts2.

If a volume in the first column is not present in volumes1, the corresponding value in the second column should be set to 0; likewise, if a volume is not present in volumes2, the corresponding value in the third column should be set to 0. The final output for the values I provided would be:

| volumes   | counts1 | counts2 |
|-----------|---------|---------|
| Shield    | 3911    | 291     |
| Side      | 1479    | 0       |
| expHall   | 553     | 0       |
| Funnel    | 368     | 73      |
| gridpiece | 342     | 28      |
| leg       | 0       | 469     |
| wafer     | 0       | 32      |

I am not very experienced in Python and have been struggling a lot with no results. Is there a quick and elegant way to obtain what I want?

Thanks

CodePudding user response:

I guess this is not optimal, but here is one solution:

import pandas as pd

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']
counts1= [3911, 1479, 553, 368, 342]
volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece','wafer']
counts2= [291, 469, 73, 28, 32]

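# Union of both name lists. dict.fromkeys drops duplicates while keeping
# first-seen order (a plain set would give an arbitrary row order).
volumes12 = list(dict.fromkeys(volumes1 + volumes2))
counts1R = [0] * len(volumes12)
counts2R = [0] * len(volumes12)

# Copy each count into the slot of its volume; volumes missing from a list stay 0.
for x in range(len(volumes1)):
    p = volumes12.index(volumes1[x])
    counts1R[p] = counts1[x]
for x in range(len(volumes2)):
    p = volumes12.index(volumes2[x])
    counts2R[p] = counts2[x]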

d={   'volumes':volumes12,
      'counts1':counts1R,
      'counts2':counts2R
    }
df = pd.DataFrame(data=d)
print(df)
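
The index-based loops can also be expressed with dictionary lookups. This is only a sketch under the assumption that each volume name appears at most once per list; the names lookup1, lookup2 and all_volumes are made up for illustration.

```python
import pandas as pd

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']
counts1 = [3911, 1479, 553, 368, 342]
volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece', 'wafer']
counts2 = [291, 469, 73, 28, 32]

# Map each volume name to its count (assumes names are unique per list).
lookup1 = dict(zip(volumes1, counts1))
lookup2 = dict(zip(volumes2, counts2))

# All volume names without duplicates, in first-seen order.
all_volumes = list(dict.fromkeys(volumes1 + volumes2))

# dict.get(name, 0) supplies 0 for volumes missing from a list.
df = pd.DataFrame({
    'volumes': all_volumes,
    'counts1': [lookup1.get(v, 0) for v in all_volumes],
    'counts2': [lookup2.get(v, 0) for v in all_volumes],
})
print(df)
```

The dictionary lookup avoids the repeated list.index() searches, which may matter if the lists grow long.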


See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html
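
A more pandas-native alternative might be to let pandas align the names itself. The sketch below assumes each volume name is unique within its list; s1 and s2 are illustrative names. It builds one Series per list, indexed by volume name, and joins them column-wise. The row order produced by the alignment may depend on the pandas version, so sort or reindex afterwards if a specific order matters.

```python
import pandas as pd

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']
counts1 = [3911, 1479, 553, 368, 342]
volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece', 'wafer']
counts2 = [291, 469, 73, 28, 32]

# One Series per list, indexed by volume name (assumes unique names per list).
s1 = pd.Series(counts1, index=volumes1, name='counts1')
s2 = pd.Series(counts2, index=volumes2, name='counts2')

df = (
    pd.concat([s1, s2], axis=1)   # outer join on the volume names
    .fillna(0)                    # volumes missing from one list become 0
    .astype(int)                  # NaN forced floats; convert back to integers
    .rename_axis('volumes')       # name the index so it becomes a column
    .reset_index()
)
print(df)
```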
