My for loop does not seem to read all the values in range(2004, 2012), which is odd. When I try something simple in the loop, such as return a, I can see that it reads every value in the range. However, when I use pd.read_json, it does not behave the same way: I convert the data into a DataFrame, but only one year shows up in it. Am I missing something in my for loop?
import pandas as pd

test = range(2004, 2012)
testlist = list(test)
for i in testlist:
    a = f"https://api.census.gov/data/{i}/cps/basic/jun?get=GTCBSA,PEMNTVTY&for=state:*"
    b = pd.read_json(a)
    c = pd.DataFrame(b.iloc[1:,]).set_axis(b.iloc[0,], axis="columns", inplace=False)
    c['year'] = i
CodePudding user response:
You're currently overwriting c in each pass of the loop, so only the last year survives. Instead, you need to concat the new data to the end of it:
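import pandas as pd

test = range(2004, 2012)
testlist = list(test)
c = pd.DataFrame()  # accumulator that survives across iterations
for i in testlist:
    a = f"https://api.census.gov/data/{i}/cps/basic/jun?get=GTCBSA,PEMNTVTY&for=state:*"
    b = pd.read_json(a)
    # The first row holds the column names; the remaining rows are the data
    b = pd.DataFrame(b.iloc[1:,]).set_axis(b.iloc[0,], axis="columns", inplace=False)
    b['year'] = i
    c = pd.concat([c, b])  # append this year's rows instead of replacing c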
Output:
0      GTCBSA PEMNTVTY state  year
1           0      316     2  2004
2           0       57     2  2004
3           0       57     2  2004
4           0       57     2  2004
5       22900       57     5  2004
...       ...      ...   ...   ...
133679      0      120    56  2011
133680      0       57    56  2011
133681      0       57    56  2011
133682      0       57    56  2011
133683      0       57    56  2011
[1087063 rows x 4 columns]
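One thing the output above suggests: the last row is labelled 133683 although there are 1087063 rows, so each year's frame appears to keep its own row labels and the combined index repeats. If unique row labels matter, pd.concat accepts ignore_index=True. A minimal sketch with two tiny made-up frames (the values and the names y2004, y2005, stacked and renumbered are stand-ins, not real Census data):

```python
import pandas as pd

# Two tiny stand-in frames (made-up values) whose index both start at 1,
# mimicking the per-year frames built in the loop.
y2004 = pd.DataFrame({"state": ["2", "5"], "year": 2004}, index=[1, 2])
y2005 = pd.DataFrame({"state": ["2", "5"], "year": 2005}, index=[1, 2])

# Default concat keeps the original labels, so 1 and 2 each appear twice.
stacked = pd.concat([y2004, y2005])

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([y2004, y2005], ignore_index=True)

print(stacked.index.is_unique)   # False
print(list(renumbered.index))    # [0, 1, 2, 3]
```

Duplicate labels are not an error in themselves, but label-based lookups such as c.loc[1] would then return one row per year rather than a single row.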
Note that you don't need to convert a range to a list to iterate over it. You can simply do
for i in range(2004, 2012):
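A common variation on the approach above is to collect each year's frame in a list and call pd.concat once after the loop, which avoids re-copying the growing frame on every pass. The sketch below runs offline: sample_responses is a hypothetical stand-in for the API (made-up values), and rows_to_frame is an illustrative helper, not part of pandas. It assumes each response is a list of rows with the header first, which is what the iloc[0] / iloc[1:] handling in the code above implies. The inplace keyword of set_axis was removed in pandas 2.0, so this sketch builds the columns directly instead.

```python
import pandas as pd


def rows_to_frame(rows, year):
    """Turn a header-first list of rows into a DataFrame tagged with `year`."""
    header, *records = rows
    frame = pd.DataFrame(records, columns=header)
    frame["year"] = year
    return frame


# Hypothetical stand-in for the per-year API responses (made-up values).
sample_responses = {
    2004: [["GTCBSA", "PEMNTVTY", "state"], ["0", "316", "2"], ["0", "57", "2"]],
    2005: [["GTCBSA", "PEMNTVTY", "state"], ["22900", "57", "5"]],
}

frames = []
for year in range(2004, 2006):  # iterate the range directly, no list() needed
    frames.append(rows_to_frame(sample_responses[year], year))

# One concat at the end; ignore_index=True gives a fresh 0..n-1 index.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

With the real endpoint, the rows would come from the parsed JSON for each year's URL rather than from sample_responses.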