Python: reading the first entry of each parenthesis in a series of parentheses

Time:04-17

I have thousands of lines like the following sample:

"[('entryA', 'typeA'), ('entryB', 'typeB'), ('entryC', 'typeC'), ('entryD', 'typeD')]"

How can I extract the first entry of each parenthesized pair and put it in the following format?

"entries" : ["entryA", "entryB", "entryC", "entryD"]

My code:

import re

s = "[('entryA', 'typeA'), ('entryB', 'typeB'), ('entryC', 'typeC'), ('entryD', 'typeD')]"
# Matches from "('" up to the first comma, so the delimiters end up in the result
result = re.findall(r"\('.*?,", s)

print("\"entries\":", result)

Current output:

"entries": ["('entryA',", "('entryB',", "('entryC',", "('entryD',"]

CodePudding user response:

Use a lookbehind and a lookahead so that the surrounding characters are checked but not included in the match:

import re

s = "[('entryA', 'typeA'), ('entryB', 'typeB'), ('entryC', 'typeC'), ('entryD', 'typeD')]"
# Match the text preceded by "('" and followed by "',"
result = re.findall(r"(?<=\(').*?(?=',)", s)

print("\"entries\":", result)

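Lookahead: (?=EXPR) asserts that EXPR matches immediately after the current position, without including it in the match.
Lookbehind: (?<=EXPR) asserts that EXPR matches immediately before the current position, without including it in the match.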
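To illustrate that lookarounds only check context, here is a small sketch comparing them with a plain capture group, which re.findall also returns without the delimiters. The shortened sample string and the variable names are made up for illustration.

```python
import re

# Shortened sample in the same shape as the question's data
s = "[('entryA', 'typeA'), ('entryB', 'typeB')]"

# Lookarounds check the context but leave it out of the match
with_lookarounds = re.findall(r"(?<=\(').*?(?=',)", s)

# With a single capture group, findall returns only the group's text
with_group = re.findall(r"\('(.*?)',", s)

print(with_lookarounds)
print(with_group)
```

Both calls should return the same list. The capture-group form may be easier to read, and it is not subject to the fixed-width restriction that Python places on lookbehind patterns.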

CodePudding user response:

Here's a more robust way: parse the string as a Python literal instead of using a regex.

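import ast

# s is the string from the question
pairs = ast.literal_eval(s)  # parse the text into a list of tuples
entries = [pair[0] for pair in pairs]  # first element of each tuple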

CodePudding user response:

You don't need re; use ast.literal_eval:

>>> s = "[('entryA', 'typeA'), ('entryB', 'typeB'), ('entryC', 'typeC'), ('entryD', 'typeD')]"
>>> from ast import literal_eval
>>> literal_eval(s)
[('entryA', 'typeA'), ('entryB', 'typeB'), ('entryC', 'typeC'), ('entryD', 'typeD')]
>>> out = [i[0] for i in literal_eval(s)]
>>> out
['entryA', 'entryB', 'entryC', 'entryD']