I am looking at the following website: https://data.gov.sg/dataset/bunker-sales-monthly?resource_id=44da3191-6c57-4d4a-8268-8e2c418d4b43 and they have the following example for extracting data using their API:
import requests
result = []
headers = {'User-Agent': 'Mozilla/5.0'}
url = "https://data.gov.sg/api/action/datastore_search?resource_id=44da3191-6c57-4d4a-8268-8e2c418d4b43"
r = requests.get(url, headers=headers)
data = r.json()
print(data)
This produces the following, out of which I want to extract only the 'records' bit out of the output, and into a more readable format. Ideally, I want this data in a pandas data frame:
{'records': [{'bunker_type': 'Marine Gas Oil', 'bunker_sales': '135.4', '_id': 1, 'month': '1995-01'}, {'bunker_type': 'Marine Diesel Oil', 'bunker_sales': '67.9', '_id': 2, 'month': '1995-01'}, {'bunker_type': 'Marine Fuel Oil 180 cst', 'bunker_sales': '412.9', '_id': 3, 'month': '1995-01'}, {'bunker_type': 'Marine Fuel Oil 380 cst', 'bunker_sales': '820.3', '_id': 4, 'month': '1995-01'}, {'bunker_type': 'Marine Fuel Oil 500 cst ', 'bunker_sales': '0', '_id': 5, 'month': '1995-01'}...
In the format as below:
df= pd.DataFrame(columns=['month', 'bunker_type', 'bunker_sales'])
How could I go about extracting this data?
CodePudding user response:
Try pd.json_normalize
:
df = pd.json_normalize(data['result'], 'records')
print(df)
# Output:
bunker_type bunker_sales _id month
0 Marine Gas Oil 135.4 1 1995-01
1 Marine Diesel Oil 67.9 2 1995-01
2 Marine Fuel Oil 180 cst 412.9 3 1995-01
3 Marine Fuel Oil 380 cst 820.3 4 1995-01
4 Marine Fuel Oil 500 cst 0 5 1995-01
.. ... ... ... ...
95 Ultra Low-Sulfur Fuel Oil 0 95 1995-08
96 Others 102.2 96 1995-08
97 Marine Gas Oil 101.7 97 1995-09
98 Marine Diesel Oil 63 98 1995-09
99 Marine Fuel Oil 180 cst 395 99 1995-09
[100 rows x 4 columns]