ValueError: Expected object or value when reading json.gzip to DataFrame

Time:02-20

I want to read the Electronics json.gz file from the list of available Amazon question/answer datasets: http://jmcauley.ucsd.edu/data/amazon/qa/

JSON sample:

{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Jun 27, 2014', 'unixTime': 1403852400, 'question': 'I have a 9 year old Badger 1 that needs replacing, will this Badger 1 install just like the original one?', 'answerType': '?', 'answer': 'I replaced my old one with this without a hitch.'}
{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Apr 28, 2014', 'unixTime': 1398668400, 'question': 'model number', 'answer': 'This may help InSinkErator Model BADGER-1: Badger 1 1/3 HP Garbage Disposal PRODUCT DETAILS - Bellacor Number:309641 / UPC:050375000419 Brand SKU:500181'}
{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Aug 25, 2014', 'unixTime': 1408950000, 'question': 'can I replace Badger 1 1/3 with a Badger 5 1/2 - with same connections?', 'answerType': '?', 'answer': 'Plumbing connections will vary with different models. Usually the larger higher amp draw wil not affect the wiring, the disposals are designed to a basic standard setup common to all brands. They want you to buy their brand or version or model. As long as the disposal is UL listed, United Laboratories, they will setup and bolt up the same.'}
{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Nov 3, 2014', 'unixTime': 1415001600, 'question': 'Does this come with power cord and dishwasher hook up?', 'answerType': '?', 'answer': 'It does not come with a power cord. It does come with the dishwasher hookup.'}
{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Jun 21, 2014', 'unixTime': 1403334000, 'question': 'loud noise inside when turned on. sounds like blades are loose', 'answer': 'Check if you dropped something inside.Usually my wife put lemons inside make a lot of noise and I will have to get them out using my hands or mechanical fingers .'}
{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Jul 13, 2013', 'unixTime': 1373698800, 'question': 'where is the reset button located', 'answer': 'on the bottom'}

My current code calls pd.read_json with the lines and orient parameters set, but no combination of values I have tried works.

import pandas as pd

electronics_url = 'http://jmcauley.ucsd.edu/data/amazon/qa/qa_Electronics.json.gz'
electronics_df = pd.read_json(electronics_url, orient='split', lines=True, compression='gzip')

This raises ValueError: Expected object or value. I tried every possible value of the orient parameter, but none of them helps. I also tried downloading the file and reading it from a local buffer, with the same result.

What is the problem?

Answer:

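The content of the archive is not valid JSON. Each line of the file is a Python dict literal (note the single-quoted keys and strings), so a JSON parser rejects it no matter which orient you pass. You can parse each line with ast.literal_eval instead: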

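import ast
import gzip
import urllib.request  # a bare "import urllib" does not reliably expose urllib.request

import pandas as pd

data = []
url = 'http://jmcauley.ucsd.edu/data/amazon/qa/icdm/QA_Baby.json.gz'

# Stream the archive and parse each line as a Python literal
with urllib.request.urlopen(url) as r:
    with gzip.open(r) as f:
        for qa in f:
            data.append(ast.literal_eval(qa.decode('utf-8')))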

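After that, use pd.json_normalize to flatten the list of dicts. In this file each record nests its answers under 'questions' and then 'answers', which is the record path passed here: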

answers = pd.json_normalize(data, ['questions', 'answers'])
print(answers)

# Output
                                              answerText      answererID          answerTime helpful answerType answerScore
0      Yes, the locks will keep adults out too.  My h...  A2WQX54BDMJTKY    November 6, 2013  [1, 1]        NaN         NaN
1      Yes if you install it correctly.  a lot of fol...  A3VRA4069D8C7L    November 6, 2013  [0, 0]        NaN         NaN
2      It probably will...  it's pretty good and much...   A3JEFPEUXUS0I    November 6, 2013  [0, 0]        NaN         NaN
3      The size of the locking mechanism. I bought th...  A1OCJ9L2PQJBUD    January 12, 2015  [0, 0]        NaN         NaN
4        The locking mechanism unlocks with the magnet .  A2KGWT9ZN4M1PO    January 14, 2015  [0, 0]        NaN         NaN
...                                                  ...             ...                 ...     ...        ...         ...
82029  I feel it would work fine for the 4 year old. ...  A2BIFRN88PPMGT  September 17, 2014  [1, 1]          Y      0.9828
82030  In my opinion, the pillow was slightly bigger ...   AHM5QX41VSV6B  September 17, 2014  [0, 0]          ?      0.9411
82031  Our 2yo is a belly sleeper too. At first she w...   AKW750RUMWK17     August 28, 2014  [1, 1]        NaN         NaN
82032  Hi. Yes, the pillow will settle with use for s...  A1XQAY39M2KOL0     August 27, 2014  [0, 0]        NaN         NaN
82033  I would recommend contacting the company to se...  A1ZCGIRS68DM9J     August 28, 2014  [0, 0]        NaN         NaN

[82034 rows x 6 columns]
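
For the flat, one-answer-per-line format shown in the question, the same idea can be sketched without a network call. The snippet below is only an illustration under a few assumptions: the two sample lines are copied from the question, the in-memory archive stands in for the downloaded file, and parse_lines is a hypothetical helper name, not part of pandas or the dataset's tooling.

```python
import ast
import gzip
import io
import json

# Two lines copied verbatim from the sample in the question.
SAMPLE_LINES = [
    "{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Jul 13, 2013', 'unixTime': 1373698800, 'question': 'where is the reset button located', 'answer': 'on the bottom'}",
    "{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Nov 3, 2014', 'unixTime': 1415001600, 'question': 'Does this come with power cord and dishwasher hook up?', 'answerType': '?', 'answer': 'It does not come with a power cord. It does come with the dishwasher hookup.'}",
]


def parse_lines(fileobj):
    """Hypothetical helper: read a gzip stream with one Python dict literal per line."""
    rows = []
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines
                # literal_eval only evaluates literals; it does not run arbitrary code
                rows.append(ast.literal_eval(line))
    return rows


# Stand-in for the downloaded file: a small gzip archive held in memory.
raw = gzip.compress("\n".join(SAMPLE_LINES).encode("utf-8"))

# A JSON parser rejects the single-quoted line.
try:
    json.loads(SAMPLE_LINES[0])
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

rows = parse_lines(io.BytesIO(raw))
```

Because these rows are flat, the resulting list can be passed straight to pd.DataFrame(rows); a record path is only needed for nested files such as the QA_Baby one used earlier in this answer.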