Home > Blockchain >  How to store string in quotation that contains two words?
How to store string in quotation that contains two words?

Time:07-20

I wrote the search code and I want to store what is between " " as one place in the list, how I may do that? In this case, I have 3 lists but the second one should is not as I want.

import re

message='read read read'

others = ' '.join(re.split('\(.*\)', message))
others_split = others.split()

to_compile = re.compile('.*\((.*)\).*')
to_match = to_compile.match(message)
ors_string = to_match.group(1)

should = ors_string.split(' ')

must = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w ))', message) if term and not term.startswith('-')]

must_not = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w ))', message) if term and term.startswith('-')]
must_not = [s.replace("-", "") for s in must_not]

print(f'must: {must}')
print(f'should: {should}')
print(f'must_not: {must_not}')

Output:

must: ['read', '"find find"', 'within', '"plane"']
should: ['"exactly', 'needed"', 'empty']
must_not: ['russia', '"destination good"']

Wanted result:

must: ['read', '"find find"', 'within', '"plane"']
should: ['"exactly needed"', 'empty'] <---
must_not: ['russia', '"destination good"']

Error when edited the message, how to handle it?

Traceback (most recent call last):
    ors_string = to_match.group(1)
AttributeError: 'NoneType' object has no attribute 'group'

CodePudding user response:

Your should list splits on whitespace: should = ors_string.split(' '), this is why the word is split in the list. The following code gives you the output you requested but I'm not sure that is solves your problem for future inputs.

import re

message = 'read "find find":within("exactly needed" OR empty) "plane" -russia -"destination good"'

others = ' '.join(re.split('\(.*\)', message))
others_split = others.split()

to_compile = re.compile('.*\((.*)\).*')
to_match = to_compile.match(message)
ors_string = to_match.group(1)

# Split on OR instead of whitespace.
should = ors_string.split('OR')
to_remove_or = "OR"
while to_remove_or in should:
    should.remove(to_remove_or)

# Remove trailing whitespace that is left after the split.
should = [word.strip() for word in should]

must = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w ))', message) if term and not term.startswith('-')]

must_not = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w ))', message) if term and term.startswith('-')]
must_not = [s.replace("-", "") for s in must_not]

print(f'must: {must}')
print(f'should: {should}')
print(f'must_not: {must_not}')

  • Related