I have API data in JSON that looks like the sample below, which shows a single IP
address. A typical query returns about 50 IP
devices.
[{'ip': '11.22.33.44',
'services': [{'port': 80,
'service_name': 'HTTP',
'transport_protocol': 'TCP'},
{'port': 1911, 'service_name': 'FOX', 'transport_protocol': 'TCP'},
{'port': 3011, 'service_name': 'HTTP', 'transport_protocol': 'TCP'},
{'port': 5011,
'service_name': 'HTTP',
'certificate': 'thisisgarbledrandomtextofdigitsandletters',
'transport_protocol': 'TCP'},
{'port': 47808, 'service_name': 'BACNET', 'transport_protocol': 'UDP'}],
'location': {'continent': 'North America',
'country': 'United States',
'country_code': 'US',
'city': 'Denver',
'postal_code': '80908',
'timezone': 'America/Denver',
'province': 'Colorado',
'coordinates': {'latitude': 39.0234, 'longitude': -104.6926},
'registered_country': 'United States',
'registered_country_code': 'US'},
'autonomous_system': {'asn': 7922,
'description': 'CHARTER-7922',
'bgp_prefix': '11.22.33.44/55',
'name': 'CHARTER-7922',
'country_code': 'US'},
'operating_system': {'part': 'o', 'source': 'OSI_APPLICATION_LAYER'},
'last_updated_at': '2022-08-23T12:51:39.427Z',
'dns': {'reverse_dns': {'names': ['somebusiness.net']}}}]
If I try with Pandas:
df = pd.DataFrame(data)
print(df.columns)
this prints:
Index(['ip', 'services', 'location', 'autonomous_system', 'operating_system',
'last_updated_at', 'dns'],
dtype='object')
Is it possible in Pandas to return the coordinates for each IP
in the query? In the example above, that is 'coordinates': {'latitude': 39.0234, 'longitude': -104.6926}
I think I can do it without Pandas with something like:
loc_data = []
for device in page:
    for info in device.keys():
        try:
            coords = device['location']['coordinates']
            loc_data.append(coords)
        except:
            continue
Printing loc_data then gives:
[{'latitude': 39.0234, 'longitude': -104.6926},
{'latitude': 39.0234, 'longitude': -104.6926},
{'latitude': 39.0234, 'longitude': -104.6926},
{'latitude': 39.0234, 'longitude': -104.6926},
{'latitude': 39.0234, 'longitude': -104.6926},
{'latitude': 39.0234, 'longitude': -104.6926}]
But I am hoping to use just Pandas, if that's possible. Hopefully this makes sense!
Thanks
CodePudding user response:
Try this:
df = pd.DataFrame(data)
df["coordinates"] = df["location"].apply(lambda x: x["coordinates"])
If you meant using only pandas
to create a DataFrame that already has the coordinates
columns from the start, this is not that solution.
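If the goal is to have latitude and longitude as their own columns from the start, one option that may fit is pd.json_normalize, which flattens nested dicts into dot-separated column names. The sketch below uses a trimmed-down stand-in for the API response. The second record and the short column names are hypothetical, added only to show that a device with no location ends up with NaN instead of raising an error.

```python
import pandas as pd

# Trimmed-down stand-in for the API response shown in the question.
data = [
    {
        "ip": "11.22.33.44",
        "location": {
            "city": "Denver",
            "coordinates": {"latitude": 39.0234, "longitude": -104.6926},
        },
    },
    # Hypothetical device with no 'location' block at all.
    {"ip": "55.66.77.88"},
]

# Nested dicts become columns such as 'location.coordinates.latitude'.
flat = pd.json_normalize(data)

# Keep only the IP and its coordinates, with shorter (illustrative) names.
coords = flat[
    ["ip", "location.coordinates.latitude", "location.coordinates.longitude"]
].rename(
    columns={
        "location.coordinates.latitude": "latitude",
        "location.coordinates.longitude": "longitude",
    }
)

print(coords)
```

This gives one row per device, so no explicit loop or try/except is needed. As a side note, the repeated entries in the question's loc_data most likely come from the inner loop over device.keys(), which appends the same coordinates once per key.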