I work for a property company and am trying to create a frequently updated excel sheet with all of company's data on units/tenants/properties/etc. We currently make use of a property management site which contains all of our data. The site has an api which I should be able to use to access the data.
I have tried importing the data from the site's api, and have been successful in importing it. With that being said, the imported data is only in the format of a json file and I have been having trouble converting it to a csv.
Here is my current program.
'''all imports needed'''
import requests
import pandas as pd
import csv
import json
#get data from api
url ='https://*apiusername:apisecretkey@companyname*.appfolio.com/api/v1/reports/rent_roll.json?columns=Property,PropertyName,PropertyGroupId,PropertyAddress,Unit,Tenant,BdBa,SquareFt,MarketRent,Rent,MoveIn,MoveOut'
try:
response = requests.get(url).json()
print("successfully imported json data from appfolio api.")
except IOError:
print("I/O Error")
#flatten json dictionary just in case
def flattenjson(b, delim):
print("attempting to flatten json dictionary.")
val = {}
for i in b.keys():
if isinstance(b[i], dict):
get = flattenjson(b[i], delim)
for j in get.keys():
val[i delim j]= get[j]
else:
val[i] = b[i]
return val
test = flattenjson(response, ',')
#print output for test
print(test)
#confirm that test variable is in dictionary format
if isinstance(test, dict):
print("your imported file is a dictionary.")
#convert dictionary to a csv with pandas
try:
df = pd.DataFrame.from_dict(test)
print("converted dictionary to pandas dataframe.\n\n")
except:
print("Error")
try:
df.to_csv("data.csv")
print("successfully converted dataframe to csv file. attempting to read back data.\n\n")
df = pd.read_csv("data.csv", index_col=0)
print(df)
except:
print("Error")
If I print the imported json dictionary before converting it to a csv, this is a sample of what the current structure of it looks like.
{'results': [{'Property': '1020p - 1024 N. roadname Pkwy. Cityname, XX 12345', 'PropertyName': '1020p', 'PropertyGroupId': '418024, 418031, 418057, 418068, 418069, 418073, 418077', 'PropertyAddress': '1020 N. roadname Pkwy. Cityname, XX 12345', 'Unit': 'Commercial- Loop Lofts Unit B', 'Tenant': None, 'BdBa': '--/--', 'SquareFt': '4,888', 'MarketRent': '4,000.00', 'Rent': None, 'MoveIn': None, 'MoveOut': None}, {'Property': '1020p - 1024 N. roadname Pkwy. Cityname, XX 12345, 'PropertyName': '1020p', 'PropertyGroupId': '418024, 418031, 418057, 418068, 418069, 418073, 418077', 'PropertyAddress': '1020 N. roadname Pkwy. Cityname, XX 12345', 'Unit': '100', 'Tenant': 'John Roberts', 'BdBa': '1/1.00', 'SquareFt': '930', 'MarketRent': '1,075.00', 'Rent': '1,075.00', 'MoveIn': '10/17/2021', 'MoveOut': None}], 'next_page_url': None}
I believe that due to how the API works, it is making a nested dictionary with the first key being labelled results, and the final key being labelled next_page_url.
Due to this, I believe that when converting the dictionary to a csv with pandas, it is putting all my keys that have to deal with property data in their own column. And this is something that I would like to change. My current converted csv output looks like this.
results next_page_url
0 {'Property': '1020p - 1024 N. roadname Pkwy. St... NaN
1 {'Property': '1020p - 1024 N. roadname Pkwy. St... NaN
2 {'Property': '1020p - 1024 N. roadname Pkwy. St... NaN
3 {'Property': '1020p - 1024 N. roadname Pkwy. St... NaN
4 {'Property': '1020p - 1024 N. roadname Pkwy. St... NaN
.. ... ...
639 {'Property': 'putinvest - 4240 something Ave.... NaN
640 {'Property': 'putmgr - 4240 something Ave. St... NaN
641 {'Property': 'z4184p - 4184 Something Ave. Jo... NaN
642 {'Property': 'z4400p - 4400 Boardwalk Name St. ... NaN
643 {'Property': 'z4514 - 4514 something Ave. St. Lo... NaN
[644 rows x 2 columns]
Ideally here, each column from the api such as PropertyName, Tenant, PropertyAddress, etc. would all have their own columns in the csv. Could anyone tell me how to go about formatting it that way?
Thank you!
CodePudding user response:
Do you need next_page_url
in the final CSV?
Did you try doing this ?
df = pd.DataFrame(test["results"])
CodePudding user response:
I had trouble with the formatting used in that .json example, but you actually can solve this an easier way using the json library you have imported.
I usually go about it by saving the request response:
response = requests.request("GET", new_url, headers=headers, data=payload)
Then I use the .loads() function which loads it into a python dict, which then you can index.
results = json.loads(response.text)['results'][0]
will give you
[{'Property': '1020p - 1024 N. roadname Pkwy. Cityname, XX 12345', 'PropertyName': '1020p', 'PropertyGroupId': '418024, 418031, 418057, 418068, 418069, 418073, 418077', 'PropertyAddress': '1020 N. roadname Pkwy. Cityname, XX 12345', 'Unit': 'Commercial- Loop Lofts Unit B', 'Tenant': None, 'BdBa': '--/--', 'SquareFt': '4,888', 'MarketRent': '4,000.00', 'Rent': None, 'MoveIn': None, 'MoveOut': None}, {'Property': '1020p - 1024 N. roadname Pkwy. Cityname, XX 12345, 'PropertyName': '1020p', 'PropertyGroupId': '418024, 418031, 418057, 418068, 418069, 418073, 418077', 'PropertyAddress': '1020 N. roadname Pkwy. Cityname, XX 12345', 'Unit': '100', 'Tenant': 'John Roberts', 'BdBa': '1/1.00', 'SquareFt': '930', 'MarketRent': '1,075.00', 'Rent': '1,075.00', 'MoveIn': '10/17/2021', 'MoveOut': None}]
Then, since the columns have the same name, its as easy as using pandas to create a df, then convert it to a csv.
pd.DataFrame(results).to_csv('~/filename.csv')