I need to extract the words inside the parentheses ( ) for a particular category.
For example, I need the words inside ( ) in
Technology (Pyrolysis, Gasification)
and likewise the words inside ( ) in
Application (Agriculture, Animal Feed, Health & Beauty Products)
This is my string:
string = ''' U.S. Biochar Market Size, Share & Trends Analysis Report By Technology (Pyrolysis, Gasification), By Application (Agriculture, Animal Feed, Health & Beauty Products), By State, And Segment Forecasts'''
I've written a function, but it doesn't work in every case. For some inputs it also fetches
By State, And Segment Forecasts
which is not needed.
Expected output, for example:
Application = 'Agriculture, Animal Feed, Health & Beauty Products'
Technology = 'Pyrolysis, Gasification'
I need to do this in Python for a large data set of similar sentences.
def check_para(args):
    # For each label, take the text between "<label> (" and the next "),".
    s = args
    start = s.find('Technology (')
    end = s.find('),', start)
    techno = s[start:end][len('Technology ('):]
    s = args
    start = s.find('Material (')
    end = s.find('),', start)
    mat = s[start:end][len('Material ('):]
    s = args
    start = s.find('Product (')
    end = s.find('),', start)
    prod = s[start:end][len('Product ('):]
    s = args
    start = s.find('Service (')
    end = s.find('),', start)
    serv = s[start:end][len('Service ('):]
    s = args
    start = s.find('Type (')
    end = s.find('),', start)
    typ = s[start:end][len('Type ('):]
    s = args
    start = s.find('Form (')
    end = s.find('),', start)
    form = s[start:end][len('Form ('):]
    s = args
    start = s.find('Application (')
    end = s.find('),', start)
    appli = s[start:end][len('Application ('):]
    s = args
    start = s.find('End Use (')
    end = s.find('),', start)
    enduse = s[start:end][len('End Use ('):]
    s = args
    start = s.find('Derivative Grades (')
    end = s.find('),', start)
    deriv = s[start:end][len('Derivative Grades ('):]
    # Combine whichever labels were found (missing ones are empty strings).
    type1 = deriv + form + typ + serv + prod + techno + mat
    application = appli + enduse
    if len(application) > 0:
        application = application.replace(', ', '\n')
    else:
        application = 'Application I\nApplication II\nApplication III\n'
    if len(type1) > 0:
        type1 = type1.replace(', ', '\n')
    else:
        type1 = 'Type I\nType II\nType III\n'
    return application, type1
CodePudding user response:
Using re.findall with a non-greedy capture inside the parentheses, we can try:
import re

string = ''' U.S. Biochar Market Size, Share & Trends Analysis Report By Technology (Pyrolysis, Gasification), By Application (Agriculture, Animal Feed, Health & Beauty Products), By State, And Segment Forecasts'''
# Group 1: the word just before "(", group 2: everything up to the first ")"
matches = re.findall(r'(\w+) \((.*?)\)', string)
for match in matches:
    print(match[0] + ' = ' + "'" + match[1] + "'")
This prints:
Technology = 'Pyrolysis, Gasification'
Application = 'Agriculture, Animal Feed, Health & Beauty Products'
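Two caveats for a large data set. The `\w+` group only captures the single word before the parenthesis, so a label such as "End Use" would come back as "Use". Also, the original `find('),', start)` approach returns -1 when a group is not followed by "),", which makes the slice run almost to the end of the string; that is one way the trailing "By State, And Segment Forecasts" text can leak in. A minimal sketch that looks up the labels from the question's function by name is below. It assumes every group is written as "By <label> (...)", as in the sample string; `CATEGORIES`, `PATTERN` and `extract_categories` are names made up for this example.

```python
import re

# Labels copied from the question's function; extend the list as needed.
CATEGORIES = [
    "Technology", "Material", "Product", "Service", "Type",
    "Form", "Application", "End Use", "Derivative Grades",
]

# Assumption: each group looks like "By <label> (<contents>)".
# Anchoring on "By " keeps "Type" from matching inside a longer label,
# and [^)]* stops at the first closing parenthesis.
PATTERN = re.compile(
    r"\bBy (" + "|".join(re.escape(c) for c in CATEGORIES) + r") \(([^)]*)\)"
)


def extract_categories(title):
    """Return {label: contents} for every known 'By <label> (...)' group."""
    return {label: contents.strip() for label, contents in PATTERN.findall(title)}


title = ''' U.S. Biochar Market Size, Share & Trends Analysis Report By Technology (Pyrolysis, Gasification), By Application (Agriculture, Animal Feed, Health & Beauty Products), By State, And Segment Forecasts'''

result = extract_categories(title)
print(result["Technology"])   # Pyrolysis, Gasification
print(result["Application"])  # Agriculture, Animal Feed, Health & Beauty Products
```

Labels that do not occur in a title are simply absent from the returned dict, so `result.get("Material", "")` can supply a default the way the original function falls back to placeholder text.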