How to handle regex when it doesn't find anything (Python)

Time:12-16

I am searching several documents for values in order to create a separate database for each parameter. groups["BRICK"] contains all the documents, in text format.

a_dict = ['RHO','CE','LAMBDA','THETA_POR','THETA_EFF','THETA_CAP','THETA_80','AW','MEW','KLEFF']

Brick_par = []

for bricks in groups["BRICK"]:
    for par in a_dict:
        file = open(bricks, 'r', encoding='latin-1')
        file_txt = file.read()  # read the file
        regex = '((' + (par) + ')+)\s+=\s+([0-9]+.?[0-9]+)'
        searched = re.search(regex, file_txt)  # find the line to modify
        Brick_par.append(searched[3])
Brick_par = pd.DataFrame({str(par):Brick_par})

If, instead of looping, I search for just a few parameters individually (e.g. CE), the script works. The loop fails because some documents do not contain certain parameters.

I would like to know if there is a way to "ignore" all the values for which regex does not find anything in the document. That way I can probably solve it.

Also, my goal would be to create a single dataframe with all the parameters found. But that's a later step.

The error I get is:

TypeError: 'NoneType' object is not subscriptable
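
The error arises because re.search returns None when the pattern does not match, and None cannot be indexed. Below is a minimal sketch of the check, assuming a made-up sample string; the find_value helper, the sample text, and the slightly stricter pattern (escaped dot, optional whitespace around "=") are illustrative assumptions, not taken from the real documents.

```python
import re

def find_value(par, text):
    """Return the number assigned to `par` in `text`, or None if absent."""
    # re.escape guards against parameter names containing regex metacharacters
    pattern = r'\b' + re.escape(par) + r'\s*=\s*([0-9]+(?:\.[0-9]+)?)'
    match = re.search(pattern, text)
    # re.search returns None on failure, so test before indexing
    return match[1] if match else None

sample = "RHO = 1800\nCE = 0.85"  # hypothetical file content
print(find_value("CE", sample))   # 0.85
print(find_value("AW", sample))   # None
```

Returning None for a missing parameter lets the caller decide whether to skip it or record a gap.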

As suggested by diggusbickus:

a_dict = ['RHO','CE','LAMBDA','THETA_POR','THETA_EFF','THETA_CAP','THETA_80','AW','MEW','KLEFF']

Brick_par = []

for bricks in groups["BRICK"]:
    for par in a_dict:
        file = open(bricks, 'r', encoding='latin-1')
        file_txt = file.read()  # read the file
        regex = '((' + (par) + ')+)\s+=\s+([0-9]+.?[0-9]+)'
        searched = re.search(regex, file_txt)
        if not searched: continue
        Brick_par.append(searched[3])
        file.close()

Brick_par = pd.DataFrame({str(par):Brick_par})

My goal would be to create a dataframe with all the results for each parameter. Thank you for your availability.

CodePudding user response:

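You should make brick_par a dict in the first place, because that is what you want to pass to pandas. Appending None when a parameter is missing keeps every column the same length, which pd.DataFrame requires when it is given a dict of lists.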

import re
import pandas as pd

a_dict = ['RHO','CE','LAMBDA','THETA_POR','THETA_EFF','THETA_CAP',
        'THETA_80','AW','MEW','KLEFF']

# one list per parameter; each list gets exactly one entry per file
brick_par = {k: [] for k in a_dict}
for bricks in groups["BRICK"]:
    # read each file once, not once per parameter
    with open(bricks, 'r', encoding='latin-1') as f:
        file_txt = f.read()
    for par in a_dict:
        regex = r'((' + par + r')+)\s+=\s+([0-9]+.?[0-9]+)'
        searched = re.search(regex, file_txt)
        if not searched:
            brick_par[par].append(None)  # parameter missing in this file
        else:
            brick_par[par].append(searched[3])

brick_par = pd.DataFrame(brick_par)
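
The same idea can be sketched without touching the file system, which makes it easier to test. The collect function, the shortened PARAMS list, and the sample strings below are hypothetical; the pattern also assumes values are plain non-negative decimals, which may not hold for the real files. Values are converted to float here so they can be used numerically later.

```python
import re

PARAMS = ['RHO', 'CE', 'LAMBDA']  # shortened list for the example

def collect(texts, params):
    """Build {parameter: [value per text]}, with None where a parameter is missing."""
    table = {p: [] for p in params}
    for text in texts:
        for p in params:
            m = re.search(r'\b' + re.escape(p) + r'\s*=\s*([0-9]+(?:\.[0-9]+)?)', text)
            # one entry per text, so every list ends up the same length
            table[p].append(float(m[1]) if m else None)
    return table

docs = ["RHO = 1800\nCE = 0.85", "RHO = 2000\nLAMBDA = 0.6"]  # hypothetical contents
print(collect(docs, PARAMS))
# {'RHO': [1800.0, 2000.0], 'CE': [0.85, None], 'LAMBDA': [None, 0.6]}
```

The returned dict has one equal-length list per parameter, so it can be passed straight to pd.DataFrame, with one row per document.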