pandas df of api query

Time:08-26

I have API data in JSON form that looks like the example below, which shows a single IP address. A typical query returns 50 devices.

[{'ip': '11.22.33.44',
  'services': [{'port': 80,
    'service_name': 'HTTP',
    'transport_protocol': 'TCP'},
   {'port': 1911, 'service_name': 'FOX', 'transport_protocol': 'TCP'},
   {'port': 3011, 'service_name': 'HTTP', 'transport_protocol': 'TCP'},
   {'port': 5011,
    'service_name': 'HTTP',
    'certificate': 'thisisgarbledrandomtextofdigitsandletters',
    'transport_protocol': 'TCP'},
   {'port': 47808, 'service_name': 'BACNET', 'transport_protocol': 'UDP'}],
  'location': {'continent': 'North America',
   'country': 'United States',
   'country_code': 'US',
   'city': 'Denver',
   'postal_code': '80908',
   'timezone': 'America/Denver',
   'province': 'Colorado',
   'coordinates': {'latitude': 39.0234, 'longitude': -104.6926},
   'registered_country': 'United States',
   'registered_country_code': 'US'},
  'autonomous_system': {'asn': 7922,
   'description': 'CHARTER-7922',
   'bgp_prefix': '11.22.33.44/55',
   'name': 'CHARTER-7922',
   'country_code': 'US'},
  'operating_system': {'part': 'o', 'source': 'OSI_APPLICATION_LAYER'},
  'last_updated_at': '2022-08-23T12:51:39.427Z',
  'dns': {'reverse_dns': {'names': ['somebusiness.net']}}}]

If I try with Pandas:

import pandas as pd

df = pd.DataFrame(data)

print(df.columns)

this prints:

Index(['ip', 'services', 'location', 'autonomous_system', 'operating_system',
       'last_updated_at', 'dns'],
      dtype='object')

Is it possible in Pandas to return the coordinates for each IP in the query? In the example above, that is 'coordinates': {'latitude': 39.0234, 'longitude': -104.6926}.

I think I can do it without Pandas with something like:

loc_data = []
for device in page:
    # Some devices may lack location data, so skip those instead of failing.
    try:
        loc_data.append(device['location']['coordinates'])
    except (KeyError, TypeError):
        continue

print(loc_data)

With page holding just the device above, this prints one entry per device:

[{'latitude': 39.0234, 'longitude': -104.6926}]

(My first attempt also looped over device.keys() inside the device loop, which appended the same coordinates once per key and produced duplicate entries.)

But I am hoping to use just Pandas, if that's possible. Hopefully this makes sense!

Thanks

CodePudding user response:

Try this:

df = pd.DataFrame(data)
df["coordinates"] = df["location"].apply(lambda x: x["coordinates"])

This adds a coordinates column after the DataFrame has been built. If you meant creating a DataFrame that has the coordinate columns from the start, using only pandas, this is not that solution.
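One way to get the coordinate columns from the start is pd.json_normalize, which flattens nested dictionaries into dotted column names. The following is only a minimal sketch under some assumptions: the data list is a trimmed-down stand-in for the API response, and the second device (with no coordinates) is hypothetical, added to show how missing values come out as NaN.

```python
import pandas as pd

# Trimmed-down stand-in for the API response. The second device is
# hypothetical and has no 'coordinates' key, to show the NaN behaviour.
data = [
    {
        "ip": "11.22.33.44",
        "location": {
            "city": "Denver",
            "coordinates": {"latitude": 39.0234, "longitude": -104.6926},
        },
    },
    {"ip": "55.66.77.88", "location": {"city": "Unknown"}},
]

# Nested dicts become columns such as 'location.coordinates.latitude'.
flat = pd.json_normalize(data)

# Keep one row per IP with just the coordinate columns, under shorter names.
coords = flat[
    ["ip", "location.coordinates.latitude", "location.coordinates.longitude"]
].rename(
    columns={
        "location.coordinates.latitude": "latitude",
        "location.coordinates.longitude": "longitude",
    }
)

print(coords)
```

This selection assumes at least one device in the response carries coordinates; if none did, the dotted columns would not exist and the column lookup would raise a KeyError.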
