I am trying to convert a dict to a pandas DataFrame as follows:
dff = pd.DataFrame(
{
'CEO': 'ucMMe Mhll',
'address': 'vs5dlt3 B Se1kC eve0nre',
'address2': '-',
'city': 'a CSatanral',
'companyName': 'Agilent Technologies Inc.',
'country': 'nUatei tdetSs',
'description': "tns oo el' yty",
'employees': 17124,
'exc': 'gdgdgd',
'industry': 'sgeiTeotiroaLbtans r',
'issueType': 'abc',
'phone': '14087832319',
'primarySicCode': 4008,
'sector': ',atnSii Scilcofe,nnse TecisaPliinafs cedorhv cre',
'securityName': 'elooIne.nen htc iisTcgAgl',
'state': 'ailairofnC',
'symbol': 'A',
'tags': ['nllh he', 'gth', 'acsl', 'isiad', 'nr aitT'],
'website': 'win.gcm.',
'zip': '0752501-19'} )
And when I print out the DataFrame, I see the following output:
print(dff)
I expect to see only 1 row in the DataFrame, but it gives 5, and I cannot understand why. What am I doing wrong here?
CodePudding user response:
You're not doing anything wrong. Since tags is a list, pandas broadcasts all the other (scalar) fields to the same length as tags and builds a DataFrame with one row per list element. You can do:
pd.Series(your_dict).to_frame().T
Or wrap your dict in [] to indicate that it is a single row (records orient):
pd.DataFrame([your_dict])
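To make the broadcasting and both fixes concrete, here is a minimal sketch. The dict named record is a hypothetical, shortened stand-in for the one in the question, not the original data.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the dict in the question.
record = {"symbol": "A", "employees": 17124, "tags": ["x", "y", "z"]}

# Scalars are repeated once per element of the list-valued field.
broadcast = pd.DataFrame(record)

# A Series holds the list as a single element; transposing gives one row.
as_series = pd.Series(record).to_frame().T

# A list of dicts is read as a list of rows, so one dict gives one row.
as_records = pd.DataFrame([record])

print(broadcast.shape)   # 3 rows, one per tag
print(as_series.shape)   # 1 row
print(as_records.shape)  # 1 row
```

One difference worth knowing: the Series route goes through a single object-dtype Series, so every resulting column is object dtype, while the list-of-dicts route infers a dtype per column (employees stays an integer).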
CodePudding user response:
This is because your tags entry is a list of 5 elements, so pandas creates 5 rows and repeats the other values to fill them. To fix this, put a second layer of brackets around the list, so it is treated as a single cell in one row, not as 5 rows.
dff = pd.DataFrame(
{
'CEO': 'ucMMe Mhll',
'address': 'vs5dlt3 B Se1kC eve0nre',
'address2': '-',
'city': 'a CSatanral',
'companyName': 'Agilent Technologies Inc.',
'country': 'nUatei tdetSs',
'description': "tns oo el' yty",
'employees': 17124,
'exc': 'gdgdgd',
'industry': 'sgeiTeotiroaLbtans r',
'issueType': 'abc',
'phone': '14087832319',
'primarySicCode': 4008,
'sector': ',atnSii Scilcofe,nnse TecisaPliinafs cedorhv cre',
'securityName': 'elooIne.nen htc iisTcgAgl',
'state': 'ailairofnC',
'symbol': 'A',
'tags': [['nllh he', 'gth', 'acsl', 'isiad', 'nr aitT']], # Double brackets to indicate 1 cell
'website': 'win.gcm.',
'zip': '0752501-19'} )
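A quick way to check the effect of the extra brackets, sketched on a hypothetical two-field dict rather than the full one above: the outer list has length 1, so there is one row, and the inner list ends up intact in a single cell.

```python
import pandas as pd

# Hypothetical two-field example; the outer list has one element (the inner list).
row = pd.DataFrame({"symbol": "A", "tags": [["x", "y", "z"]]})

print(len(row))            # number of rows
print(row.at[0, "tags"])   # the whole inner list, stored in one cell
```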
CodePudding user response:
You could wrap each dictionary value in a list (dct here is the dict from the question):
dff = pd.DataFrame({k: [v] for k, v in dct.items()})
>>> dff
CEO address ... website zip
0 ucMMe Mhll vs5dlt3 B Se1kC eve0nre ... win.gcm. 0752501-19
[1 rows x 20 columns]
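For a self-contained version of the same idea, here is a sketch in which dct is a hypothetical, shortened dict standing in for the one in the question. Wrapping every value in a one-element list means every column has length 1, so nothing is broadcast.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the dict in the question.
dct = {"symbol": "A", "employees": 17124, "tags": ["x", "y", "z"]}

# Each value becomes a one-element column; the tags list stays in one cell.
dff = pd.DataFrame({k: [v] for k, v in dct.items()})

print(dff.shape)
print(dff.at[0, "tags"])
```

Unlike adding brackets by hand around tags only, this does not require knowing in advance which values are lists.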