I have a JSON file that looks like the one below, and I want to convert it to a CSV file. When I run my code, it raises a ValueError. Can someone please tell me how to convert the JSON below to CSV?
[{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-05 00:00:00.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 350.1, \"/TRUE_WIND_SPEED\":22.9}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:59:30.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 351.4, \"/TRUE_WIND_SPEED\" : 25.85}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:59:00.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 354.5, \"/TRUE_WIND_SPEED\" : 24.3}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:58:30.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 351.9, \"/TRUE_WIND_SPEED\" : 23.1}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:58:00.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 354.1, \"/TRUE_WIND_SPEED\" : 24.9}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:57:30.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 4.7, \"/TRUE_WIND_SPEED\" : 21.4}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:57:00.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 3.4, \"/TRUE_WIND_SPEED\" : 22.4}"},{"VESSEL_NO":"99999","SYSTEM_NAME":"ANEMOMETER","SYSTEM_INSTANCE":"1","EVENT_TIME":"2022-03-04 23:56:30.000","DATA":"{\"/TRUE_WIND_DIRECTION\" : 358.1, \"/TRUE_WIND_SPEED\" : 25.3}"}]
import pandas as pd
df = pd.read_json('data.json')
ValueError: Expected object or value
CodePudding user response:
read_json() is used to read JSON files:
import pandas as pd
df = pd.read_json("data.json")
df.to_csv("data.csv", index=False)
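If read_json() still raises "ValueError: Expected object or value", the file may be missing from the working directory, empty, or not valid JSON. Here is a minimal sketch for narrowing that down with the standard library only. The helper name load_json_checked is hypothetical, and the utf-8-sig encoding is an assumption that the file is UTF-8, possibly with a byte order mark.

```python
import json
from pathlib import Path


def load_json_checked(path):
    """Parse a JSON file, raising a descriptive error if it is missing, empty or invalid."""
    p = Path(path)
    if not p.is_file():
        # A wrong relative path is easy to miss, so report the absolute one
        raise FileNotFoundError(f"No such file: {p.resolve()}")
    # utf-8-sig also accepts files that start with a byte order mark
    text = p.read_text(encoding="utf-8-sig")
    if not text.strip():
        raise ValueError(f"{p} is empty")
    # json.loads raises json.JSONDecodeError (a ValueError) with line and column info
    return json.loads(text)


# Usage (file name taken from the question):
# records = load_json_checked("data.json")
```

If this call succeeds, the file itself is fine and the problem is more likely in how it is passed to pandas.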
CodePudding user response:
You can use json_normalize():
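import ast
import json
import pandas as pd

with open('your_json_file_path') as json_file:
    records = json.load(json_file)

# Put each record dict in a single column 'a', then expand the dicts into columns
df = pd.DataFrame(data={'a': records})
df = df.join(df['a'].apply(pd.Series)).drop(['a'], axis=1).drop_duplicates()
# DATA holds a string; turn it into a dict before normalizing
df['DATA'] = df['DATA'].apply(ast.literal_eval)
df = df.join(pd.json_normalize(df.pop('DATA')))
df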
   VESSEL_NO  SYSTEM_NAME  SYSTEM_INSTANCE               EVENT_TIME  /TRUE_WIND_DIRECTION  /TRUE_WIND_SPEED
0      99999   ANEMOMETER                1  2022-03-05 00:00:00.000                 350.1              22.9
1      99999   ANEMOMETER                1  2022-03-04 23:59:30.000                 351.4             25.85
2      99999   ANEMOMETER                1  2022-03-04 23:59:00.000                 354.5              24.3
3      99999   ANEMOMETER                1  2022-03-04 23:58:30.000                 351.9              23.1
4      99999   ANEMOMETER                1  2022-03-04 23:58:00.000                 354.1              24.9
5      99999   ANEMOMETER                1  2022-03-04 23:57:30.000                   4.7              21.4
6      99999   ANEMOMETER                1  2022-03-04 23:57:00.000                   3.4              22.4
7      99999   ANEMOMETER                1  2022-03-04 23:56:30.000                 358.1              25.3
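One caveat on ast.literal_eval: it parses Python literals, not JSON, so it works for the sample only because these DATA strings happen to be valid in both syntaxes. A small sketch of the difference follows. The with_null string is a made-up example, not taken from the question's file.

```python
import ast
import json

# DATA string copied from the first record in the question
sample = '{"/TRUE_WIND_DIRECTION" : 350.1, "/TRUE_WIND_SPEED":22.9}'
# Hypothetical DATA string containing a JSON null (not in the original data)
with_null = '{"/TRUE_WIND_DIRECTION" : null, "/TRUE_WIND_SPEED":22.9}'

# For the sample, both parsers give the same dict
parsed = json.loads(sample)
same_result = ast.literal_eval(sample) == parsed

# json.loads maps null to None, while literal_eval rejects it
parsed_null = json.loads(with_null)
try:
    ast.literal_eval(with_null)
    literal_eval_ok = True
except ValueError:
    literal_eval_ok = False

print(same_result, parsed_null["/TRUE_WIND_DIRECTION"], literal_eval_ok)
```

If the real file can contain null, true or false inside DATA, json.loads is likely the safer choice for that step.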
CodePudding user response:
For me, applying json_normalize twice works:
import json
import pandas as pd

with open('data.json') as file:
    data = json.load(file)

# Flatten the top-level records, then parse and flatten the nested DATA strings
df = pd.json_normalize(data)
df = df.join(pd.json_normalize(df.pop('DATA').apply(json.loads)))
print(df)
   VESSEL_NO  SYSTEM_NAME  SYSTEM_INSTANCE               EVENT_TIME  \
0      99999   ANEMOMETER                1  2022-03-05 00:00:00.000
1      99999   ANEMOMETER                1  2022-03-04 23:59:30.000
2      99999   ANEMOMETER                1  2022-03-04 23:59:00.000
3      99999   ANEMOMETER                1  2022-03-04 23:58:30.000
4      99999   ANEMOMETER                1  2022-03-04 23:58:00.000
5      99999   ANEMOMETER                1  2022-03-04 23:57:30.000
6      99999   ANEMOMETER                1  2022-03-04 23:57:00.000
7      99999   ANEMOMETER                1  2022-03-04 23:56:30.000

   /TRUE_WIND_DIRECTION  /TRUE_WIND_SPEED
0                 350.1             22.90
1                 351.4             25.85
2                 354.5             24.30
3                 351.9             23.10
4                 354.1             24.90
5                   4.7             21.40
6                   3.4             22.40
7                 358.1             25.30
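Since the question asks for a CSV, the last step is a to_csv() call on the flattened frame. A minimal self-contained sketch, using two records copied from the question instead of reading data.json. The function name records_to_csv is hypothetical.

```python
import json

import pandas as pd

# First two records from the question's sample data
records = [
    {"VESSEL_NO": "99999", "SYSTEM_NAME": "ANEMOMETER", "SYSTEM_INSTANCE": "1",
     "EVENT_TIME": "2022-03-05 00:00:00.000",
     "DATA": '{"/TRUE_WIND_DIRECTION" : 350.1, "/TRUE_WIND_SPEED":22.9}'},
    {"VESSEL_NO": "99999", "SYSTEM_NAME": "ANEMOMETER", "SYSTEM_INSTANCE": "1",
     "EVENT_TIME": "2022-03-04 23:59:30.000",
     "DATA": '{"/TRUE_WIND_DIRECTION" : 351.4, "/TRUE_WIND_SPEED" : 25.85}'},
]


def records_to_csv(records, csv_path=None):
    """Flatten records with a JSON-string DATA field and write them as CSV.

    With csv_path=None, to_csv returns the CSV text instead of writing a file.
    """
    df = pd.json_normalize(records)
    # Parse each DATA string into a dict, then expand the dicts into columns
    nested = pd.json_normalize(df.pop("DATA").apply(json.loads).tolist())
    df = df.join(nested)
    return df.to_csv(csv_path, index=False)


csv_text = records_to_csv(records)
print(csv_text)
```

Passing a path such as "data.csv" as the second argument writes the file instead of returning a string.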