JSON pandas dataframe ValueError: Expected object or value

Time:11-30

I am trying to read a JSON file using pandas. The JSON file is in this format:

{
    "category": "CRIME", 
    "headline": "There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV", 
    "authors": "Melissa Jeltsen", 
    "link": "https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89", "short_description": "She left her husband. He killed their children. Just another day in America.", 
    "date": "2018-05-26"
}
{
    "category": "ENTERTAINMENT", 
    "headline": "Will Smith Joins Diplo And Nicky Jam For The 2018 World Cup's Official Song", 
    "authors": "Andy McDonald", 
    "link": "https://www.huffingtonpost.com/entry/will-smith-joins-diplo-and-nicky-jam-for-the-official-2018-world-cup-song_us_5b09726fe4b0fdb2aa541201", 
    "short_description": "Of course, it has a song.", 
    "date": "2018-05-26"
}

However, I get the following error, and I don't understand why:

ValueError                                Traceback (most recent call last)
/var/folders/j6/rj901v4j40368zfdw64pbf700000gn/T/ipykernel_11792/4234726591.py in <module>
----> 1 df = pd.read_json('db.json', lines=True)
      2 df.head()

~/opt/anaconda3/lib/python3.9/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    205                 else:
    206                     kwargs[new_arg_name] = new_arg_value
--> 207             return func(*args, **kwargs)
    208 
    209         return cast(F, wrapper)

~/opt/anaconda3/lib/python3.9/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

~/opt/anaconda3/lib/python3.9/site-packages/pandas/io/json/_json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, encoding_errors, lines, chunksize, compression, nrows, storage_options)
    610 
    611     with json_reader:
--> 612         return json_reader.read()
    613 
    614 

~/opt/anaconda3/lib/python3.9/site-packages/pandas/io/json/_json.py in read(self)
    742                 data = ensure_str(self.data)
    743                 data_lines = data.split("\n")
--> 744                 obj = self._get_object_parser(self._combine_lines(data_lines))
    745         else:
    746             obj = self._get_object_parser(self.data)

~/opt/anaconda3/lib/python3.9/site-packages/pandas/io/json/_json.py in _get_object_parser(self, json)
    766         obj = None
    767         if typ == "frame":
--> 768             obj = FrameParser(json, **kwargs).parse()
    769 
    770         if typ == "series" or obj is None:

~/opt/anaconda3/lib/python3.9/site-packages/pandas/io/json/_json.py in parse(self)
    878             self._parse_numpy()
    879         else:
--> 880             self._parse_no_numpy()
    881 
    882         if self.obj is None:

~/opt/anaconda3/lib/python3.9/site-packages/pandas/io/json/_json.py in _parse_no_numpy(self)
   1131         if orient == "columns":
   1132             self.obj = DataFrame(
-> 1133                 loads(json, precise_float=self.precise_float), dtype=None
   1134             )
   1135         elif orient == "split":

ValueError: Expected object or value

My code is written as follows:

import pandas as pd

df = pd.read_json('db.json', lines=True)
df.head()

I tried restructuring the JSON file as suggested elsewhere, but it doesn't work: I still get the same error shown above. Is there any other way I can solve this issue?

CodePudding user response:

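The file is not valid JSON as a whole, because it contains several objects back to back with no separator. It is not valid input for lines=True either, since that option expects each object to sit on a single line and here every object spans several lines. You can wrap the objects in square brackets [] and add a comma between them to get a valid JSON array: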

[{
    "category": "CRIME",
    "headline": "There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV",
    "authors": "Melissa Jeltsen",
    "link": "https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89", "short_description": "She left her husband. He killed their children. Just another day in America.",
    "date": "2018-05-26"
},
{
    "category": "ENTERTAINMENT",
    "headline": "Will Smith Joins Diplo And Nicky Jam For The 2018 World Cup's Official Song",
    "authors": "Andy McDonald",
    "link": "https://www.huffingtonpost.com/entry/will-smith-joins-diplo-and-nicky-jam-for-the-official-2018-world-cup-song_us_5b09726fe4b0fdb2aa541201",
    "short_description": "Of course, it has a song.",
    "date": "2018-05-26"
}]

Read the file:

import pandas as pd


df = pd.read_json("/path/to/file/db.json")
print(df)

Output:

        category                                                                     headline          authors                                                                                                 link                                                             short_description       date
0          CRIME             There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV  Melissa Jeltsen  https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89  She left her husband. He killed their children. Just another day in America. 2018-05-26
1  ENTERTAINMENT  Will Smith Joins Diplo And Nicky Jam For The 2018 World Cup's Official Song    Andy McDonald  https://www.huffingtonpost.com/entry/will-smith-joins-diplo-and-nicky-jam-for-the-official-2018-...                                                     Of course, it has a song. 2018-05-26
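
If the file is too large to fix by hand, the same repair can be done in code. The following is only a sketch using the standard library's json.JSONDecoder.raw_decode, which parses one value and reports where it ended, so the objects can be read one after another. The helper name parse_concatenated_json and the shortened sample string are made up for illustration; in practice you would pass in the text read from db.json.

```python
import json


def parse_concatenated_json(text):
    """Return a list of the top-level JSON values found back to back in text."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Shortened, hypothetical stand-in for the contents of db.json:
# two multi-line objects with no separator between them.
sample = """
{
    "category": "CRIME",
    "authors": "Melissa Jeltsen",
    "date": "2018-05-26"
}
{
    "category": "ENTERTAINMENT",
    "authors": "Andy McDonald",
    "date": "2018-05-26"
}
"""

records = parse_concatenated_json(sample)

# One compact object per line is the layout that lines=True expects.
json_lines = "\n".join(json.dumps(record) for record in records)
```

From here, pd.DataFrame(records) builds the frame directly from the list of dictionaries. Alternatively, write json_lines to a file and read it with pd.read_json(path, lines=True).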