problem with grouping the earthquake data

Time:10-26

Please help me implement a function that will group the earthquake data by the following criteria:

  • Light (4.5-4.9)

  • Moderate (5-5.9)

  • Major (6-6.9)

  • Strong (>=7)

I'm trying to add this to my simple program that collects data about earthquakes from a webpage.

import json
import urllib.request
handle = urllib.request.urlopen("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/4.5_month.geojson")
rawdata = handle.read()
rawdata[:400]

data = json.loads(rawdata)
data

def get_earthquake_magnitude(json_data):
    results = []
    for eq in json_data['features']:
        mag = eq['properties']['mag']
        results.append(mag)
    return results

get_earthquake_magnitude(data)

CodePudding user response:

A library like pandas is probably the better tool for this, but if you want to stick to plain Python, here is one way:

import json
import urllib.request
from collections import defaultdict

handle = urllib.request.urlopen("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/4.5_month.geojson")
rawdata = handle.read()

data = json.loads(rawdata)

ranges = [
    ("Light", 5),
    ("Moderate", 6),
    ("Major", 7)]

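def get_earthquake_magnitude(json_data):
    groups = defaultdict(list)
    for eq in json_data['features']:
        mag = eq['properties']['mag']
        # ranges holds (label, exclusive upper bound) pairs in ascending order
        for label, upper in ranges:
            if mag < upper:
                groups[label].append(eq)
                break
        else:
            # the loop ended without a break, so mag is 7 or higher
            groups["Strong"].append(eq)
    return groups

r = get_earthquake_magnitude(data)

for label, group in r.items():
    print(f"{label}: {len(group)}")

Result (the feed is live, so the counts change over time):

Light: 531
Moderate: 118
Major: 8

There is no Strong line here because the feed held no earthquake of magnitude 7 or higher at the time, and a defaultdict only creates a key once something is appended to it.

This gives you a dictionary whose keys are the magnitude labels that occur in the data and whose values are lists holding the complete data record of each earthquake in that category.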
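
To check the grouping logic without calling the USGS feed, it can be run against a small hand-made payload. The sketch below is self-contained; `group_by_magnitude`, `RANGES`, and the `sample` records are names and data made up for illustration, and they only imitate the `features` / `properties` / `mag` layout used above.

```python
from collections import defaultdict

# (label, exclusive upper bound) pairs in ascending order
RANGES = [("Light", 5), ("Moderate", 6), ("Major", 7)]

def group_by_magnitude(json_data):
    """Group GeoJSON-style earthquake records by magnitude label."""
    groups = defaultdict(list)
    for eq in json_data["features"]:
        mag = eq["properties"]["mag"]
        for label, upper in RANGES:
            if mag < upper:
                groups[label].append(eq)
                break
        else:
            # reached only when the loop did not break: mag is 7 or higher
            groups["Strong"].append(eq)
    return groups

# Hypothetical records that imitate the structure of the feed
sample = {
    "features": [
        {"properties": {"mag": m}} for m in (4.6, 5.0, 5.9, 6.3, 7.1, 4.9)
    ]
}

grouped = group_by_magnitude(sample)
counts = {label: len(items) for label, items in grouped.items()}
print(counts)
# {'Light': 2, 'Moderate': 2, 'Major': 1, 'Strong': 1}
```

The boundary values are the interesting cases: 5.0 lands in Moderate rather than Light because the comparison is a strict `<`.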

CodePudding user response:

To complement the implementation above, here is a minimal example. I hope it helps.

from collections import defaultdict

# Light (4.5-4.9)
# Moderate(5-5.9)
# Major (6-6.9)
# Strong (>=7)

# vals stands in for the earthquake magnitudes
vals = [4.79, 4.58, 5.8]
# each label maps to a predicate; lower bounds are inclusive, upper bounds exclusive
mydt = {
    "Light": lambda x: 4.5 <= x < 5,
    "Moderate": lambda x: 5 <= x < 6,
    "Major": lambda x: 6 <= x < 7,
    "Strong": lambda x: x >= 7,
}
dt = defaultdict(list)
for val in vals:
    for k, v in mydt.items():
        if v(val):
            dt[k].append(val)
print(f"dt = {dt}")

# Output:
# dt = defaultdict(<class 'list'>, {'Light': [4.79, 4.58], 'Moderate': [5.8]})
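
An alternative to writing one predicate per label is to keep the category boundaries in a sorted list and look the label up with the standard library's `bisect` module. This is a different technique from the lambda dictionary above and is only a sketch; `EDGES`, `LABELS`, and `label_for` are names chosen for the illustration, and it assumes every magnitude is at least 4.5, as in the feed from the question (anything lower would also be labelled Light).

```python
from bisect import bisect_right
from collections import defaultdict

EDGES = [5, 6, 7]  # boundaries between adjacent categories
LABELS = ["Light", "Moderate", "Major", "Strong"]

def label_for(mag):
    # bisect_right counts how many edges are <= mag,
    # which is exactly the index of the matching label
    return LABELS[bisect_right(EDGES, mag)]

# Made-up magnitudes, including values that sit exactly on an edge
vals = [4.79, 4.58, 5.8, 5.0, 6.9, 7.0]

dt = defaultdict(list)
for val in vals:
    dt[label_for(val)].append(val)

print(dict(dt))
# {'Light': [4.79, 4.58], 'Moderate': [5.8, 5.0], 'Major': [6.9], 'Strong': [7.0]}
```

With this layout, adding or moving a category means editing two lists instead of rewriting comparison expressions.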

CodePudding user response:

The pandas library is great for something like this. Try as follows:

  • Use json_normalize to create a DataFrame.
  • Use pd.cut to bin the values from column properties.mag into discrete intervals.

import pandas as pd

df = pd.json_normalize(data['features'])

bins = [0, 5, 6, 7, 100]
labels = ['Light','Moderate','Major','Strong']

df['bins'] = pd.cut(df['properties.mag'], bins=bins, 
                    labels=labels, right=False)
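
To see how `right=False` treats values that fall exactly on a bin edge, the same `bins` and `labels` can be applied to a few hand-picked numbers. This is a small sketch that needs only pandas; the `mags` values are made up for illustration and do not come from the feed.

```python
import pandas as pd

bins = [0, 5, 6, 7, 100]
labels = ['Light', 'Moderate', 'Major', 'Strong']

# Made-up magnitudes that sit on or next to the bin edges
mags = pd.Series([4.5, 4.9, 5.0, 5.9, 6.0, 7.0])

# right=False makes each interval closed on the left: [0, 5), [5, 6), [6, 7), [7, 100)
binned = pd.cut(mags, bins=bins, labels=labels, right=False)
result = binned.tolist()
print(result)
# ['Light', 'Light', 'Moderate', 'Moderate', 'Major', 'Strong']
```

With the default `right=True`, a magnitude of exactly 5.0 would fall into the first interval, (0, 5], and be labelled Light, which is not what the categories in the question ask for.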

Let's check how many entries we have for the bins, using df.groupby and applying count to column properties.mag:

counts = df.groupby('bins')['properties.mag'].count()
print(counts)

Result:

bins
Light       531
Moderate    118
Major         8
Strong        0
Name: properties.mag, dtype: int64

Use df.describe() and df.info() to get a statistical summary and some basic information about your DataFrame, respectively.
