For loop doesn't read all the values in the list - dataframe

Time:04-30

My for loop does not seem to read all the values in range(2004, 2012), which is odd. When I try something simple in the loop, such as return a, I can see that it reads every value in the range. With pd.read_json, however, it does not behave the same way: I convert the data into a dataframe, but only one year shows up in it. Am I missing something in my for loop?

test = range(2004, 2012)
testlist = list(test)

for i in testlist:
     a = f"https://api.census.gov/data/{i}/cps/basic/jun?get=GTCBSA,PEMNTVTY&for=state:*"
     b = pd.read_json(a) 
     c= pd.DataFrame(b.iloc[1:,]).set_axis(b.iloc[0,], axis="columns", inplace=False)
     c['year'] = i

CodePudding user response:

You're currently overwriting c on each pass of the loop, so only the last year's data survives. Instead, you need to concatenate each year's data onto the end of it:

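import pandas as pd

test = range(2004, 2012)
testlist = list(test)

# Start with an empty frame and grow it by one year per iteration
c = pd.DataFrame()
for i in testlist:
     a = f"https://api.census.gov/data/{i}/cps/basic/jun?get=GTCBSA,PEMNTVTY&for=state:*"
     b = pd.read_json(a)
     # The first row holds the column names: promote it to the header.
     # set_axis returns a new frame (its inplace argument was removed in pandas 2.0).
     b = pd.DataFrame(b.iloc[1:,]).set_axis(b.iloc[0,], axis="columns")
     b['year'] = i
     # Append this year's rows instead of overwriting the previous ones
     c = pd.concat([c, b])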

Output:

0      GTCBSA PEMNTVTY state  year
1           0      316     2  2004
2           0       57     2  2004
3           0       57     2  2004
4           0       57     2  2004
5       22900       57     5  2004
...       ...      ...   ...   ...
133679      0      120    56  2011
133680      0       57    56  2011
133681      0       57    56  2011
133682      0       57    56  2011
133683      0       57    56  2011

[1087063 rows x 4 columns]
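
As a variation on the loop above, a common pattern is to collect each year's frame in a list and call pd.concat once at the end, which avoids re-copying the accumulated data on every pass. The sketch below is only an illustration: fetch_year is a hypothetical stand-in for pd.read_json(a) that returns a few made-up rows in the same shape (header in the first row), so it runs without network access. It also passes ignore_index=True, so unlike the output above the row labels are renumbered from 0.

```python
import pandas as pd


def fetch_year(year):
    # Hypothetical stand-in for pd.read_json(url): the values are made up,
    # but the shape matches the API response (first row = column names).
    raw = [
        ["GTCBSA", "PEMNTVTY", "state"],
        ["0", "57", "02"],
        ["22900", "57", "05"],
    ]
    return pd.DataFrame(raw)


frames = []
for year in range(2004, 2012):
    b = fetch_year(year)
    # Promote the first row to the header and keep the remaining rows
    b = b.iloc[1:].set_axis(b.iloc[0], axis="columns")
    b["year"] = year
    frames.append(b)

# One concat at the end; ignore_index=True renumbers the rows from 0
c = pd.concat(frames, ignore_index=True)
print(c["year"].unique())
```

With the real API, replacing the body of fetch_year with pd.read_json on the URL built for that year should give the same rows as the loop above, only with a continuous index.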

Note that you don't need to convert a range to a list to iterate over it. You can simply write:

for i in range(2004, 2012):