I have been trying for hours to make this work. I am trying to read from a file, organize all the data into a dictionary, and then combine all the dictionaries into a list. I have gotten as far as getting the data into dictionaries, but whenever I try to append them to a list on each iteration, it either totally breaks or only the last line of the file gets added.
`def readParkFile(fileName="national_parks.csv"):
    f = open(fileName, "r")
    parkList = []
    parkDict = {}
    header = f.readline().split(",")
    keys = list(header)
    for line in f:
        tmp = list(line.split(","))
        quote = tmp[7:]
        fullQuote = ','.join(quote)
        del tmp[7:]
        tmp.append(fullQuote)
        for value in tmp:
            parkDict[keys[tmp.index(value)]] = value
        parkList.append(parkDict)
        print(parkDict)

readParkFile()`
Code,Name,State,Acres,Latitude,Longitude,Date,Description ACAD,Acadia National Park,ME,47390,44.35,-68.21,1919-02-26,"Covering most of Mount Desert Island and other coastal islands, Acadia features the tallest mountain on the Atlantic coast of the United States, granite peaks, ocean shoreline, woodlands, and lakes. There are freshwater, estuary, forest, and intertidal habitats." ARCH,Arches National Park,UT,76519,38.68,-109.57,1971-11-12,"This site features more than 2,000 natural sandstone arches, with some of the most popular arches in the park being Delicate Arch, Landscape Arch and Double Arch. Millions of years of erosion have created these structures in a desert climate where the arid ground has life-sustaining biological soil crusts and potholes that serve as natural water-collecting basins. Other geologic formations include stone pinnacles, fins, and balancing rocks." BADL,Badlands National Park,SD,242756,43.75,-102.5,1978-11-10,"The Badlands are a collection of buttes, pinnacles, spires, and mixed-grass prairies. The White River Badlands contain the largest assemblage of known late Eocene and Oligocene mammal fossils. The wildlife includes bison, bighorn sheep, black-footed ferrets, and prairie dogs." BIBE,Big Bend National Park,TX,801163,29.25,-103.25,1944-06-12,"Named for the prominent bend in the Rio Grande along the U.S.-Mexico border, this park encompasses a large and remote part of the Chihuahuan Desert. Its main attraction is backcountry recreation in the arid Chisos Mountains and in canyons along the river. A wide variety of Cretaceous and Tertiary fossils as well as cultural artifacts of Native Americans also exist within its borders." BISC,Biscayne National Park,FL,172924,25.65,-80.08,1980-06-28,"The central part of Biscayne Bay, this mostly underwater park at the north end of the Florida Keys has four interrelated marine ecosystems: mangrove forest, the Bay, the Keys, and coral reefs. 
Threatened animals include the West Indian manatee, American crocodile, various sea turtles, and peregrine falcon." BLCA,Black Canyon of the Gunnison National Park,CO,32950,38.57,-107.72,1999-10-21,"The park protects a quarter of the Gunnison River, which slices sheer canyon walls from dark Precambrian-era rock. The canyon features some of the steepest cliffs and oldest rock in North America, and is a popular site for river rafting and rock climbing. The deep, narrow canyon is composed of gneiss and schist, which appears black when in shadow." BRCA,Bryce Canyon National Park,UT,35835,37.57,-112.18,1928-02-25,"Bryce Canyon is a geological amphitheater on the Paunsaugunt Plateau with hundreds of tall, multicolored sandstone hoodoos formed by erosion. The region was originally settled by Native Americans and later by Mormon pioneers." CANY,Canyonlands National Park,UT,337598,38.2,-109.93,1964-09-12,"This landscape was eroded into a maze of canyons, buttes, and mesas by the combined efforts of the Colorado River, Green River, and their tributaries, which divide the park into three districts. The park also contains rock pinnacles and arches, as well as artifacts from Ancient Pueblo peoples." CARE,Capitol Reef National Park,UT,241904,38.2,-111.17,1971-12-18,"The park's Waterpocket Fold is a 100-mile (160 km) monocline that exhibits the earth's diverse geologic layers. Other natural features include monoliths, cliffs, and sandstone domes shaped like the United States Capitol." CAVE,Carlsbad Caverns National Park,NM,46766,32.17,-104.44,1930-05-14,"Carlsbad Caverns has 117 caves, the longest of which is over 120 miles (190 km) long. The Big Room is almost 4,000 feet (1,200 m) long, and the caves are home to over 400,000 Mexican free-tailed bats and sixteen other species. Above ground are the Chihuahuan Desert and Rattlesnake Springs." 
CHIS,Channel Islands National Park,CA,249561,34.01,-119.42,1980-03-05,"Five of the eight Channel Islands are protected, with half of the park's area underwater. The islands have a unique Mediterranean ecosystem originally settled by the Chumash people. They are home to over 2,000 species of land plants and animals, 145 endemic to them, including the island fox. Ferry services offer transportation to the islands from the mainland." CONG,Congaree National Park,SC,26546,33.78,-80.78,2003-11-10,"On the Congaree River, this park is the largest portion of old-growth floodplain forest left in North America. Some of the trees are the tallest in the eastern United States. An elevated walkway called the Boardwalk Loop guides visitors through the swamp." CRLA,Crater Lake National Park,OR,183224,42.94,-122.1,1902-05-22,"Crater Lake lies in the caldera of an ancient volcano called Mount Mazama that collapsed 7,700 years ago. The lake is the deepest in the United States and is noted for its vivid blue color and water clarity. Wizard Island and the Phantom Ship are more recent volcanic formations within the caldera. As the lake has no inlets or outlets, it is replenished only by precipitation." CUVA,Cuyahoga Valley National Park,OH,32950,41.24,-81.55,2000-10-11,"This park along the Cuyahoga River has waterfalls, hills, trails, and exhibits on early rural living. The Ohio and Erie Canal Towpath Trail follows the Ohio and Erie Canal, where mules towed canal boats. The park has numerous historic homes, bridges, and structures, and also offers a scenic train ride."
Expected output:
[{"code":, "name": …}, {"code":, "name": …}, {"code":, "name": …}]
CodePudding user response:
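I copied your input, added a newline before each row in the CSV, and moved the initialization of the dictionary into the loop. In your version a single dictionary is created before the loop, so every call to append() stores a reference to that same object, and each iteration overwrites its values; the list ends up holding the last row repeated. Run this code: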
from pprint import pprint

def readParkFile(fileName="temp.csv"):
    f = open(fileName, "r")
    parkList = []
    header = f.readline().split(",")
    keys = list(header)
    for line in f:
        parkDict = {}  # This is the fix: create a new dictionary for every line.
        tmp = list(line.split(","))
        # Everything from the 8th field on belongs to the quoted description.
        quote = tmp[7:]
        fullQuote = ','.join(quote)
        del tmp[7:]
        tmp.append(fullQuote)
        for value in tmp:
            parkDict[keys[tmp.index(value)]] = value
        # pprint(parkDict)
        parkList.append(parkDict)
    f.close()
    pprint(parkList)

if __name__ == '__main__':
    readParkFile()
And you'll get this output:
[{'Acres': '47390',
'Code': 'ACAD',
'Date': '1919-02-26',
'Description \n': '"Covering most of Mount Desert Island and other coastal '
'islands, Acadia features the tallest mountain on the '
'Atlantic coast of the United States, granite peaks, ocean '
'shoreline, woodlands, and lakes. There are freshwater, '
'estuary, forest, and intertidal habitats." \n',
'Latitude': '44.35',
'Longitude': '-68.21',
'Name': 'Acadia National Park',
'State': 'ME'},
{'Acres': '76519',
'Code': 'ARCH',
'Date': '1971-11-12',
'Description \n': '"This site features more than 2\\,000 natural sandstone '
'arches, with some of the most popular arches in the park '
'being Delicate Arch, Landscape Arch and Double Arch. '
'Millions of years of erosion have created these '
'structures in a desert climate where the arid ground has '
'life-sustaining biological soil crusts and potholes that '
'serve as natural water-collecting basins. Other geologic '
'formations include stone pinnacles, fins, and balancing '
'rocks." \n',
'Latitude': '38.68',
'Longitude': '-109.57',
'Name': 'Arches National Park',
'State': 'UT'},
{'Acres': '242756',
'Code': 'BADL',
'Date': '1978-11-10',
'Description \n': '"The Badlands are a collection of buttes, pinnacles, '
'spires, and mixed-grass prairies. The White River '
'Badlands contain the largest assemblage of known late '
'Eocene and Oligocene mammal fossils. The wildlife '
'includes bison, bighorn sheep, black-footed ferrets, and '
'prairie dogs."\n',
'Latitude': '43.75',
'Longitude': '-102.5',
'Name': 'Badlands National Park',
'State': 'SD'},
{'Acres': '801163',
'Code': 'BIBE',
'Date': '1944-06-12',
'Description \n': '"Named for the prominent bend in the Rio Grande along the '
'U.S.-Mexico border, this park encompasses a large and '
'remote part of the Chihuahuan Desert. Its main attraction '
'is backcountry recreation in the arid Chisos Mountains '
'and in canyons along the river. A wide variety of '
'Cretaceous and Tertiary fossils as well as cultural '
'artifacts of Native Americans also exist within its '
'borders." \n',
'Latitude': '29.25',
'Longitude': '-103.25',
'Name': 'Big Bend National Park',
'State': 'TX'},
{'Acres': '172924',
'Code': 'BISC',
'Date': '1980-06-28',
'Description \n': '"The central part of Biscayne Bay, this mostly underwater '
'park at the north end of the Florida Keys has four '
'interrelated marine ecosystems: mangrove forest, the Bay, '
'the Keys, and coral reefs. Threatened animals include the '
'West Indian manatee, American crocodile, various sea '
'turtles, and peregrine falcon." \n',
'Latitude': '25.65',
'Longitude': '-80.08',
'Name': 'Biscayne National Park',
'State': 'FL'},
{'Acres': '32950',
'Code': 'BLCA',
'Date': '1999-10-21',
'Description \n': '"The park protects a quarter of the Gunnison River, which '
'slices sheer canyon walls from dark Precambrian-era rock. '
'The canyon features some of the steepest cliffs and '
'oldest rock in North America, and is a popular site for '
'river rafting and rock climbing. The deep, narrow canyon '
'is composed of gneiss and schist, which appears black '
'when in shadow." \n',
'Latitude': '38.57',
'Longitude': '-107.72',
'Name': 'Black Canyon of the Gunnison National Park',
'State': 'CO'},
{'Acres': '35835',
'Code': 'BRCA',
'Date': '1928-02-25',
'Description \n': '"Bryce Canyon is a geological amphitheater on the '
'Paunsaugunt Plateau with hundreds of tall, multicolored '
'sandstone hoodoos formed by erosion. The region was '
'originally settled by Native Americans and later by '
'Mormon pioneers." \n',
'Latitude': '37.57',
'Longitude': '-112.18',
'Name': 'Bryce Canyon National Park',
'State': 'UT'},
{'Acres': '337598',
'Code': 'CANY',
'Date': '1964-09-12',
'Description \n': '"This landscape was eroded into a maze of canyons, '
'buttes, and mesas by the combined efforts of the Colorado '
'River, Green River, and their tributaries, which divide '
'the park into three districts. The park also contains '
'rock pinnacles and arches, as well as artifacts from '
'Ancient Pueblo peoples." \n',
'Latitude': '38.2',
'Longitude': '-109.93',
'Name': 'Canyonlands National Park',
'State': 'UT'},
{'Acres': '241904',
'Code': 'CARE',
'Date': '1971-12-18',
'Description \n': '"The park\'s Waterpocket Fold is a 100-mile (160 km) '
"monocline that exhibits the earth's diverse geologic "
'layers. Other natural features include monoliths, cliffs, '
'and sandstone domes shaped like the United States '
'Capitol." \n',
'Latitude': '38.2',
'Longitude': '-111.17',
'Name': 'Capitol Reef National Park',
'State': 'UT'},
{'Acres': '46766',
'Code': 'CAVE',
'Date': '1930-05-14',
'Description \n': '"Carlsbad Caverns has 117 caves, the longest of which is '
'over 120 miles (190 km) long. The Big Room is almost '
'4,000 feet (1,200 m) long, and the caves are home to over '
'400,000 Mexican free-tailed bats and sixteen other '
'species. Above ground are the Chihuahuan Desert and '
'Rattlesnake Springs." \n',
'Latitude': '32.17',
'Longitude': '-104.44',
'Name': 'Carlsbad Caverns National Park',
'State': 'NM'},
{'Acres': '249561',
'Code': 'CHIS',
'Date': '1980-03-05',
'Description \n': '"Five of the eight Channel Islands are protected, with '
"half of the park's area underwater. The islands have a "
'unique Mediterranean ecosystem originally settled by the '
'Chumash people. They are home to over 2,000 species of '
'land plants and animals, 145 endemic to them, including '
'the island fox. Ferry services offer transportation to '
'the islands from the mainland." \n',
'Latitude': '34.01',
'Longitude': '-119.42',
'Name': 'Channel Islands National Park',
'State': 'CA'},
{'Acres': '26546',
'Code': 'CONG',
'Date': '2003-11-10',
'Description \n': '"On the Congaree River, this park is the largest portion '
'of old-growth floodplain forest left in North America. '
'Some of the trees are the tallest in the eastern United '
'States. An elevated walkway called the Boardwalk Loop '
'guides visitors through the swamp." \n',
'Latitude': '33.78',
'Longitude': '-80.78',
'Name': 'Congaree National Park',
'State': 'SC'},
{'Acres': '183224',
'Code': 'CRLA',
'Date': '1902-05-22',
'Description \n': '"Crater Lake lies in the caldera of an ancient volcano '
'called Mount Mazama that collapsed 7,700 years ago. The '
'lake is the deepest in the United States and is noted for '
'its vivid blue color and water clarity. Wizard Island and '
'the Phantom Ship are more recent volcanic formations '
'within the caldera. As the lake has no inlets or outlets, '
'it is replenished only by precipitation." \n',
'Latitude': '42.94',
'Longitude': '-122.1',
'Name': 'Crater Lake National Park',
'State': 'OR'},
{'Acres': '32950',
'Code': 'CUVA',
'Date': '2000-10-11',
'Description \n': '"This park along the Cuyahoga River has waterfalls, '
'hills, trails, and exhibits on early rural living. The '
'Ohio and Erie Canal Towpath Trail follows the Ohio and '
'Erie Canal, where mules towed canal boats. The park has '
'numerous historic homes, bridges, and structures, and '
'also offers a scenic train ride."\n',
'Latitude': '41.24',
'Longitude': '-81.55',
'Name': 'Cuyahoga Valley National Park',
'State': 'OH'}]
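Why moving the initialization matters can be seen in a small standalone sketch. The two-row `rows` list and the `keys` list below are hypothetical stand-ins for the parsed CSV lines, not the real file; the names `shared`, `buggy`, `fixed`, and `record` are made up for illustration.

```python
# Hypothetical stand-in data: two already-split rows and their column names.
rows = [["ACAD", "ME"], ["ARCH", "UT"]]
keys = ["Code", "State"]

# Pattern from the question: one dict created before the loop and reused.
shared = {}
buggy = []
for row in rows:
    for key, value in zip(keys, row):
        shared[key] = value
    buggy.append(shared)  # appends a reference to the same dict every time

# Pattern from this answer: a fresh dict for every row.
fixed = []
for row in rows:
    record = {}
    for key, value in zip(keys, row):
        record[key] = value
    fixed.append(record)

print(buggy)  # every entry shows the values of the last row
print(fixed)  # one distinct dict per row
```

In the first loop the list holds the same object twice, so later assignments overwrite what the earlier entries appear to contain.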
CodePudding user response:
The input you provided is hard to work with: we can't copy and paste it into a .csv file. When providing input, try to make it easy to copy and paste.
Anyway, I made a small .csv for this, and I think this code does what you want:
finalList = []
with open('smth.csv', 'r') as file:
    keys = file.readline().strip().split(',')
    line = file.readline()
    while line:  # readline() returns an empty string at end of file
        values = line.strip().split(',')
        finalList.append({keys[0]: values[0],
                          keys[1]: values[1],
                          keys[2]: values[2]
                          })
        line = file.readline()
You can also make this shorter by using file.readlines(), which returns all the lines at once as a list.
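A minimal sketch of the readlines() variant might look like the following. The function name `rows_from_file`, the `n_columns` parameter, and the in-memory sample are assumptions made for illustration; an `io.StringIO` object stands in for the opened file so the snippet runs without smth.csv on disk. Like the loop above, it splits on every comma, so it only suits columns that contain no quoted commas.

```python
import io

def rows_from_file(file, n_columns=3):
    """Build a list of dicts from the first n_columns of a simple CSV."""
    lines = file.readlines()  # every line of the file, as a list of strings
    keys = lines[0].strip().split(",")
    result = []
    for line in lines[1:]:
        if not line.strip():  # skip blank lines
            continue
        values = line.strip().split(",")
        result.append(dict(zip(keys[:n_columns], values[:n_columns])))
    return result

# Hypothetical three-column sample standing in for the small .csv file.
sample = io.StringIO(
    "Code,Name,State\n"
    "ACAD,Acadia National Park,ME\n"
    "ARCH,Arches National Park,UT\n"
)
finalList = rows_from_file(sample)
print(finalList)
```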
CodePudding user response:
(NOTE: You may want to look at using pandas for this. Please see the second half of this answer for an example.)
Direct answer to your question: If you want actual typed values (int for acres, float for latitude/longitude), here's a way to do it using regular expressions:
import re

def readParkFile(fileName="national_parks.csv"):
    parkList = []
    f = open(fileName, "r")
    keys = f.readline().strip("\n").split(",")
    for line in f:
        v = re.search('(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*)', line).groups()
        Code, Name, State, Acres, Latitude, Longitude, Date, Description = v[0], v[1], v[2], int(v[3]), float(v[4]), float(v[5]), v[6], v[7][1:-1]
        parkDict = dict(zip(keys, [Code, Name, State, Acres, Latitude, Longitude, Date, Description]))
        parkList.append(parkDict)
    return parkList

parkList = readParkFile()
Inside the loop, re.search() uses non-greedy quantifiers in every comma-separated group except the last one (Description), so each of the first seven groups stops at the next comma while the last group captures the rest of the line, commas included. The code then converts the numeric fields to numbers and strips the surrounding quotes from Description. These typed values are combined with the keys using zip() and converted to a dict, which in turn is appended to the result, parkList. After looping over all the lines in the csv file, the function returns parkList.
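To see how the groups split a line, here is a small sketch run on a single row. The shortened description text is made up for illustration and is not taken from the real file.

```python
import re

# One sample row; the description is a shortened, hypothetical stand-in.
line = ('ACAD,Acadia National Park,ME,47390,44.35,-68.21,1919-02-26,'
        '"Islands, peaks, and lakes."\n')

# Seven non-greedy groups each stop at the next comma; the final greedy
# group takes everything left on the line, commas included. The dot does
# not match the trailing newline, so it is left out of the last group.
pattern = r'(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*)'
groups = re.search(pattern, line).groups()

description = groups[7][1:-1]  # drop the surrounding double quotes
print(groups[:7])
print(description)
```

Note that the slice `[1:-1]` assumes the quoted text is the very last thing on the line; a trailing space after the closing quote would leave the quote in place.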
The first dictionary element of the resulting list of dicts looks like this:
{'Code': 'ACAD', 'Name': 'Acadia National Park', 'State': 'ME', 'Acres': 47390, 'Latitude': 44.35, 'Longitude': -68.21, 'Date': '1919-02-26', 'Description': 'Covering most of Mount Desert Island and other coastal islands, Acadia features the tallest mountain on the Atlantic coast of the United States, granite peaks, ocean shoreline, woodlands, and lakes. There are freshwater, estuary, forest, and intertidal habitats.'}
An alternative approach using pandas: In pandas, you could do this:
import pandas as pd
df = pd.read_csv("national_parks.csv")
print(df)
print(df.dtypes)
parkList = df.to_dict('records')
print(parkList[0])
It will give the following output:
Code Name State Acres Latitude Longitude Date Description
0 ACAD Acadia National Park ME 47390 44.35 -68.21 1919-02-26 Covering most of Mount Desert Island and other...
1 ARCH Arches National Park UT 76519 38.68 -109.57 1971-11-12 This site features more than 2,000 natural san...
2 BADL Badlands National Park SD 242756 43.75 -102.50 1978-11-10 The Badlands are a collection of buttes, pinna...
3 BIBE Big Bend National Park TX 801163 29.25 -103.25 1944-06-12 Named for the prominent bend in the Rio Grande...
4 BISC Biscayne National Park FL 172924 25.65 -80.08 1980-06-28 The central part of Biscayne Bay, this mostly ...
5 BLCA Black Canyon of the Gunnison National Park CO 32950 38.57 -107.72 1999-10-21 The park protects a quarter of the Gunnison Ri...
6 BRCA Bryce Canyon National Park UT 35835 37.57 -112.18 1928-02-25 Bryce Canyon is a geological amphitheater on t...
7 CANY Canyonlands National Park UT 337598 38.20 -109.93 1964-09-12 This landscape was eroded into a maze of canyo...
8 CARE Capitol Reef National Park UT 241904 38.20 -111.17 1971-12-18 The park's Waterpocket Fold is a 100-mile (160...
9 CAVE Carlsbad Caverns National Park NM 46766 32.17 -104.44 1930-05-14 Carlsbad Caverns has 117 caves, the longest of...
10 CHIS Channel Islands National Park CA 249561 34.01 -119.42 1980-03-05 Five of the eight Channel Islands are protecte...
11 CONG Congaree National Park SC 26546 33.78 -80.78 2003-11-10 On the Congaree River, this park is the larges...
12 CRLA Crater Lake National Park OR 183224 42.94 -122.10 1902-05-22 Crater Lake lies in the caldera of an ancient ...
13 CUVA Cuyahoga Valley National Park OH 32950 41.24 -81.55 2000-10-11 This park along the Cuyahoga River has waterfa...
Code object
Name object
State object
Acres int64
Latitude float64
Longitude float64
Date object
Description object
dtype: object
{'Code': 'ACAD', 'Name': 'Acadia National Park', 'State': 'ME', 'Acres': 47390, 'Latitude': 44.35, 'Longitude': -68.21, 'Date': '1919-02-26', 'Description': 'Covering most of Mount Desert Island and other coastal islands, Acadia features the tallest mountain on the Atlantic coast of the United States, granite peaks, ocean shoreline, woodlands, and lakes. There are freshwater, estuary, forest, and intertidal habitats.'}
As you can see, with a single call to read_csv(), pandas parses the CSV file, infers the data type of each column, and assembles everything into a DataFrame object. You can then obtain the list of dicts you're after by calling the to_dict() method of the DataFrame with the argument 'records'.
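If installing pandas is not an option, a similar list of dicts can probably be built with the standard library's csv module, which also handles the quoted commas in Description. This is only a sketch: the function name `read_parks` and the one-row sample with a shortened description are assumptions for illustration, and the numeric conversions are spelled out by hand rather than inferred as pandas does.

```python
import csv
import io

def read_parks(file):
    """Return a list of dicts, converting the numeric columns by hand."""
    parks = []
    for row in csv.DictReader(file):  # the header line supplies the keys
        row["Acres"] = int(row["Acres"])
        row["Latitude"] = float(row["Latitude"])
        row["Longitude"] = float(row["Longitude"])
        parks.append(row)
    return parks

# Hypothetical one-row sample; with a real file, pass
# open("national_parks.csv", newline="") instead.
sample = io.StringIO(
    'Code,Name,State,Acres,Latitude,Longitude,Date,Description\n'
    'ACAD,Acadia National Park,ME,47390,44.35,-68.21,1919-02-26,'
    '"Islands, peaks, and lakes."\n'
)
parkList = read_parks(sample)
print(parkList[0])
```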