Reading specific information from a file to a dictionary

I have a text file with data about the composition of the lunar atmosphere. It looks like this:

Estimated Composition (night, particles per cubic cm): Helium 4 - 40,000 ; Neon 20 - 40,000 ; Hydrogen - 35,000 Argon 40 - 30,000 ; Neon 22 - 5,000 ; Argon 36 - 2,000 Methane - 1000 ; Ammonia - 1000 ; Carbon Dioxide - 1000

I am supposed to write a function that reads such a file and returns a dictionary with the names of the elements as keys and the particle densities as values. So far I have written this:

def read_file(filename):
    infile = open(filename, "r")
    for line in infile:
        words = line.split()
        if words[0] == "Helium":
             data = {words[0]:words[3]}
    print(data)
    return
read_file("atm_moon.txt")

which prints {'Helium': '40,000'}. I'm sure there's a way to do this for every key and value with a loop, but I don't know how.

CodePudding user response：

You can try the following:

text = '''Estimated Composition (night, particles per cubic cm): Helium 4 - 40,000 ; Neon 20 - 40,000 ; Hydrogen - 35,000 ; Argon 40 - 30,000 ; Neon 22 - 5,000 ; Argon 36 - 2,000 ; Methane - 1000 ; Ammonia - 1000 ; Carbon Dioxide - 1000'''
parts = text[text.find(':') + 2:].split(' ; ')
data = {}
for p in parts:
  k,v = p.split(' - ')
  data[k] = v

print(data)

Output:

{'Helium 4': '40,000', 'Neon 20': '40,000', 'Hydrogen': '35,000', 'Argon 40': '30,000', 'Neon 22': '5,000', 'Argon 36': '2,000', 'Methane': '1000', 'Ammonia': '1000', 'Carbon Dioxide': '1000'}
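
The same splitting idea can be wrapped in a function that reads from a file, which is what the question asks for. This is only a minimal sketch, assuming entries are separated by ';' or by line breaks and that the densities should end up as integers rather than strings. The names parse_composition and read_composition are hypothetical and not part of the answer above.

```python
def parse_composition(text):
    """Parse 'Name - 1,000 ; Name - 2,000' text into {name: int}."""
    # Drop the header up to the first colon, if there is one.
    _, sep, body = text.partition(":")
    if not sep:
        body = text
    data = {}
    # Assumption: a line break separates entries just like ';' does.
    for entry in body.replace("\n", ";").split(";"):
        name, dash, value = entry.partition(" - ")
        if not dash:
            continue  # skip empty pieces that have no "name - value" pair
        # Remove the thousands separators before converting to int.
        data[name.strip()] = int(value.strip().replace(",", ""))
    return data


def read_composition(filename):
    # The with-block closes the file even if parsing raises an error.
    with open(filename, "r") as infile:
        return parse_composition(infile.read())
```

Calling read_composition("atm_moon.txt") would then return the dictionary instead of only printing it. Converting to int is a design choice; keep the strings if the exact formatting matters.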

CodePudding user response：

Keeping the structure of your function, you could make a list of every element name and check each entry in the file against it. It would look like this:

def read_file(filename):
    element_list = ["Helium 4", "Neon 20", "Hydrogen", "Argon 40", "Neon 22", "Argon 36", "Methane", "Ammonia", "Carbon Dioxide"]
    data = {}
    with open(filename, "r") as infile:
        for line in infile:
            # Drop any header before the colon; entries are assumed to be separated by ";"
            for entry in line.split(":")[-1].split(";"):
                name, _, value = entry.partition(" - ")
                # Keep only the entries whose name is a known element
                if name.strip() in element_list:
                    data[name.strip()] = value.strip()
    print(data)
    return data
read_file("atm_moon.txt")

CodePudding user response：

You can solve it using the following regular expression: ((?:[A-Z][a-z]*\s*)*\d*)\s-\s([\d,]+).

Code:

import re

with open(r"path/to/file") as f:
    res = dict(re.findall(r"((?:[A-Z][a-z]*\s*)*\d*)\s-\s([\d,]+)", f.read()))

Result:

{
    'Helium 4': '40,000',
    'Neon 20': '40,000',
    'Hydrogen': '35,000',
    'Argon 40': '30,000',
    'Neon 22': '5,000',
    'Argon 36': '2,000',
    'Methane': '1000',
    'Ammonia': '1000',
    'Carbon Dioxide': '1000'
}
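
To make the pattern easier to read, it can be written with re.VERBOSE so that each part carries a comment. The sketch below uses the same expression and additionally converts the values to integers, which is an assumption about what "particle density" should look like in the result. The names ENTRY and parse_densities are hypothetical.

```python
import re

# Same pattern as in the answer, spread over several lines via re.VERBOSE.
ENTRY = re.compile(
    r"""
    ((?:[A-Z][a-z]*\s*)*\d*)   # name: capitalized words, optional mass number
    \s-\s                      # a dash with whitespace on both sides
    ([\d,]+)                   # value: digits with optional thousands commas
    """,
    re.VERBOSE,
)


def parse_densities(text):
    # findall returns (name, value) tuples, one per matched entry.
    return {
        name: int(value.replace(",", ""))
        for name, value in ENTRY.findall(text)
    }
```

Because the pattern looks for each "name - value" pair directly, it does not depend on the ';' separators or on how the entries are spread across lines.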