I'm having trouble creating a dictionary due to duplicated years - Python/Hurricane Project

Time:04-08

I'm doing the hurricane project from Codecademy.

Below are the variables and values of the exercise. It is a sample of 34 hurricanes. Be aware that some years had more than one hurricane. For example, 1932 had both the 'Bahamas' and 'Cuba II' hurricanes.

names of hurricanes

names = ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael']

months of hurricanes

months = ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October']

years of hurricanes

years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018]

maximum sustained winds (mph) of hurricanes

 max_sustained_winds = [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160]

areas affected by each hurricane

areas_affected = [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']]

damages (USD($)) of hurricanes

damages = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B']

deaths for each hurricane

deaths = [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74]

The first task is to write a function that builds a hurricane dictionary with the name as the key:

I created the following function which worked great.

def hurricane_dict(names, month, year, sustained_winds, areas_affected, damage, death):
    hurricane = {}
    for i in range(len(names)):
        hurricane[names[i]] = {"Name": names[i], 
                               "Month": month[i],
                               "Year" : year[i],
                                "Max Sustained Wind": sustained_winds[i],
                                "Areas Affected": areas_affected[i],
                                "Damage": damage[i],
                                "Deaths": death[i]}
    return hurricane
    

hurricane = hurricane_dict(names, months, years, max_sustained_winds, areas_affected, damages, deaths)

hurricane['Cuba I']
Output: {'Name': 'Cuba I',
 'Month': 'October',
 'Year': 1924,
 'Max Sustained Wind': 165,
 'Areas Affected': ['Central America',
  'Mexico',
  'Cuba',
  'Florida',
  'The Bahamas'],
 'Damage': 'Damages not recorded',
 'Deaths': 90}

The second task is to write another hurricane dictionary function, this time using the year as the key:

I could have followed the same logic as before to construct the dictionary; however, I am trying to use the existing dictionary (hurricane) as a parameter to build the new one. See the code below:

def hurricane_by_year(dictionary):
    for name in names:
        for year in years:
            if year == hurricane[name]['Year']:
                hurricanes_by_year_v2[year] = hurricane[name]
    return  hurricanes_by_year_v2

hurricanes_by_year_v2[1924]
Output: {'Name': 'Cuba I',
 'Month': 'October',
 'Year': 1924,
 'Max Sustained Wind': 165,
 'Areas Affected': ['Central America',
  'Mexico',
  'Cuba',
  'Florida',
  'The Bahamas'],
 'Damage': 'Damages not recorded',
 'Deaths': 90}

At first glance, the function and dictionary looked OK; however, it is not recording all the samples of the data. It records only one hurricane per year, and if another hurricane happened in the same year, it does not appear. The full sample has 34 hurricanes, but the created dictionary only has 26 entries.

print(range(len(hurricanes_by_year_v2)))
range(0, 26)
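
The missing entries can be reproduced on a tiny sample: assigning to a dictionary key that already exists replaces the stored value, so each year keeps only the hurricane written last. A minimal sketch, using a hypothetical three-item sample taken from the lists above:

```python
# Hypothetical mini-sample: one hurricane in 1924, two in 1932.
sample = [("Cuba I", 1924), ("Bahamas", 1932), ("Cuba II", 1932)]

by_year = {}
for name, year in sample:
    by_year[year] = name  # a second write to the same year replaces the first

print(by_year)       # {1924: 'Cuba I', 1932: 'Cuba II'}
print(len(by_year))  # 2, although the sample has 3 hurricanes
```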

I would highly appreciate it if someone could help me correct the function and create a complete dictionary with years as the keys, using the prior dictionary as a parameter.

Thanks in advance, Mijail

CodePudding user response:

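A year may have several hurricanes, so each key should map to a list that holds all hurricanes from that year. Otherwise, every assignment to the same year overwrites the previous value.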

def hurricane_by_year(hurricanes):
    results = {}
    
    for name, data in hurricanes.items():
        year = data['Year']
        
        if year not in results:
            results[year] = []      # create list for all values
            
        results[year].append(data)  # add to list
        
    return results

hurricanes_by_year_v2 = hurricane_by_year(hurricane)
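
The same grouping can also be written with `collections.defaultdict` from the standard library, which creates the empty list the first time a year is seen. This is only a sketch of an alternative; the function name `group_by_year` and the three-record sample are made up for illustration:

```python
from collections import defaultdict

def group_by_year(hurricanes):
    """Map each year to the list of hurricane records from that year."""
    results = defaultdict(list)          # missing keys start as an empty list
    for data in hurricanes.values():
        results[data["Year"]].append(data)
    return dict(results)                 # plain dict, so unknown years raise KeyError

# Hypothetical sample in the same shape as the `hurricane` dictionary
sample = {
    "Bahamas": {"Name": "Bahamas", "Year": 1932},
    "Cuba II": {"Name": "Cuba II", "Year": 1932},
    "Tampico": {"Name": "Tampico", "Year": 1933},
}

grouped = group_by_year(sample)
print([h["Name"] for h in grouped[1932]])  # ['Bahamas', 'Cuba II']
```

Converting back to a plain `dict` before returning is a design choice: it avoids silently creating empty lists when someone looks up a year that has no hurricanes.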

Full working code:

names = ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael']
months = ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October']
years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018]
max_sustained_winds = [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160]
areas_affected = [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']]
damages = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B']
deaths = [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74]

# ----------------------------------------

def hurricane_dict(names, month, year, sustained_winds, areas_affected, damage, death):
    results = {}
    
    for data in zip(names, month, year, sustained_winds, areas_affected, damage, death):
        results[data[0]] = {
            "Name" : data[0], 
            "Month": data[1],
            "Year" : data[2],
            "Max Sustained Wind": data[3],
            "Areas Affected"    : data[4],
            "Damage": data[5],
            "Deaths": data[6]
        }

    return results

hurricane = hurricane_dict(names, months, years, max_sustained_winds, areas_affected, damages, deaths)

#print(hurricane['Cuba II'])

def hurricane_by_year(hurricanes):
    results = {}
    
    for name, data in hurricanes.items():  # use the parameter, not the global variable
        year = data['Year']
        
        if year not in results:
            results[year] = []
            
        results[year].append(data)
        
    return results

hurricanes_by_year_v2 = hurricane_by_year(hurricane)

print('\n--- year 1932 ---\n')

for item in hurricanes_by_year_v2[1932]:
    print(item)
    print('---')

Result:

--- year 1932 ---

{'Name': 'Bahamas', 'Month': 'September', 'Year': 1932, 'Max Sustained Wind': 160, 'Areas Affected': ['The Bahamas', 'Northeastern United States'], 'Damage': 'Damages not recorded', 'Deaths': 16}
---
{'Name': 'Cuba II', 'Month': 'November', 'Year': 1932, 'Max Sustained Wind': 175, 'Areas Affected': ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], 'Damage': '40M', 'Deaths': 3103}
---    
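
To confirm that nothing is lost, the lengths of the per-year lists can be added up and compared with the number of hurricanes. A small sketch using only the `years` list from above; it groups list positions instead of full records to stay short, and uses `dict.setdefault` in place of the explicit `if year not in results` check:

```python
years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961,
         1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004,
         2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018]

by_year = {}
for index, year in enumerate(years):
    by_year.setdefault(year, []).append(index)  # one list of positions per year

print(len(by_year))                           # 26 distinct years
print(sum(len(v) for v in by_year.values()))  # 34 hurricanes in total
print(len(by_year[2005]))                     # 4 hurricanes in 2005
```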

EDIT:

I think it would be simpler to use a pandas DataFrame:

  • it can select rows by year, month, or name.
  • it can filter by ranges with < and >.
  • it can compute aggregates such as sum and average/mean.
  • it can plot the data.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael'],
    'Month': ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October'],
    'Year': [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018],
    'Max sustained winds': [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160],
    'Areas affected': [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']],
    'Damages': ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B'],
    'Deaths': [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74],
})

#groups = df.groupby('Year')

print('\n--- Year 1932 ---\n')

selected = df[ df['Year'] == 1932 ]
print( selected )

print('\n--- Name contains Cuba ---\n')

selected = df[ df['Name'].str.contains('Cuba') ]
print( selected )

print('\n--- Month August ---\n')

selected = df[ df['Month'] == 'August' ]
print( selected )

print('\n--- Deaths < 20 ---\n')

selected = df[ df['Deaths'] < 20 ].sort_values('Deaths')
print( selected[ ['Deaths', 'Year'] ] )

print('\n--- sum Deaths ---\n')

result = df['Deaths'].sum()
print( result )
print( f'{result:_}' )  # display with `_` as thousands separator to make it more readable

print('\n--- Area Mexico ---\n')

selected = df[ df['Areas affected'].apply(lambda item: 'Mexico' in item)  ]
print( selected[ ['Year', 'Areas affected'] ].to_string() )  # `to_string()` to display without `...`

# ---

import matplotlib.pyplot as plt

df.plot(x='Year', y='Deaths')
plt.show()

Result:

--- Year 1932 ---

      Name      Month  ...               Damages  Deaths
2  Bahamas  September  ...  Damages not recorded      16
3  Cuba II   November  ...                   40M    3103

[2 rows x 7 columns]

--- Name contains Cuba ---

              Name     Month  ...               Damages  Deaths
0           Cuba I   October  ...  Damages not recorded      90
3          Cuba II  November  ...                   40M    3103
4  CubaBrownsville    August  ...                 27.9M     179

[3 rows x 7 columns]

--- Month August ---

               Name   Month  ...  Damages  Deaths
4   CubaBrownsville  August  ...    27.9M     179
13          Camille  August  ...    1.42B     259
16            David  August  ...    1.54B    2068
17            Allen  August  ...    1.24B     269
20           Andrew  August  ...    26.5B      65
25          Katrina  August  ...     125B    1836
28             Dean  August  ...    1.76B      45

[7 rows x 7 columns]

--- Deaths < 20 ---

    Deaths  Year
8        5  1953
15      11  1977
2       16  1932
24      17  2005

--- sum Deaths ---

39489
39_489

--- Area Mexico ---

    Year                                                      Areas affected
0   1924               [Central America, Mexico, Cuba, Florida, The Bahamas]
12  1967                                      [The Caribbean, Mexico, Texas]
14  1971  [The Caribbean, Central America, Mexico, United States Gulf Coast]
15  1977                                                            [Mexico]
17  1980              [The Caribbean, Yucatn Peninsula, Mexico, South Texas]
18  1988           [Jamaica, Venezuela, Central America, Hispaniola, Mexico]
24  2005                          [Windward Islands, Jamaica, Mexico, Texas]

[Image: line plot of Deaths by Year]
