I am trying to extract only certain words from a function definition. My text is shown below.
# Example of estimation
## Example of estimation
### Example of estimation
"Some calculation"
""" Note :
The data here is artificial.
Idea is to show how code will look like after estimation.
More information www.google.com
"""
@iterate_jit(nopython=True)
def fun_min_ssc(min_wage, tax_rate,calc_min_profit):
    calc_min_profit = min_wage * tax_rate + min_wage - (min_wage*2)
    return calc_min_profit
Lines starting with #, ##, ###, ", """ or @ are not needed.
Now I want to extract only the function name and its arguments from the function definition:
- Name of the function: fun_min_ssc
- Arguments of the function: min_wage, tax_rate, calc_min_profit
I tried to solve this problem with the code below:
import os

# "w+" opens the file for both writing and reading
f = open("text.txt", "w+")
f.write('''# Example of estimation
## Example of estimation
### Example of estimation
"Some calculation"
""" Note :
The data here is artificial.
Idea is to show how code will look like after estimation.
More information www.google.com
"""
@iterate_jit(nopython=True)
def cal_min_ssc(min_wage, tax_rate,min_profit):
    min_profit = min_wage * tax_rate + min_wage - (min_wage*2)
    return min_profit
''')
# Rewind to the start of the file before reading back what was written
f.seek(0)
for line in f.readlines():
    print(line, end='')
f.close()
os.getcwd()
os.listdir()
os.chdir('C:/')  # <--- replace with your path
file_reader = open('C:/text.txt')  # <--- replace with your path
os.getcwd()
# Open the file in read mode
text = open("text.txt", "r")
# Create a dictionary to count word frequency
d = dict()
# Loop through each line of the file
for line in text:
    # Remove the leading spaces and newline character
    line = line.strip()
    # Convert the characters in line to
    # lowercase to avoid case mismatch
    line = line.lower()
    # Split the line into words
    # Note: each assignment overwrites the previous one,
    # so only the split on "*" takes effect
    words = line.split(" ")
    words = line.split(",")
    words = line.split("*")
    # Iterate over each word in line
    for word in words:
        # Check if the word is already in dictionary
        if word in d:
            # Increment count of word by 1
            d[word] = d[word] + 1
        else:
            # Add the word to dictionary with count 1
            d[word] = 1
# Print the contents of dictionary
for key in list(d.keys()):
    print(key, ":", d[key])
Can anybody help me solve this problem, or suggest another approach?
CodePudding user response:
This might get you on the right track. I used a regular expression as the search criterion to find the lines that start with "def" and end with ":".
x = re.search(r"^def.*:$", line)
Once I have the line in question, I split it on "def " and then on the opening parenthesis of the function, "(". This allows me to easily grab the function name.
values = x[0].split('def ')[1].split('(')
function_name = values[0]
I then grab the other section and remove its last two characters, i.e. "):", before splitting it into arguments:
arguments = values[1][:-2].split(', ')
Because the arguments are separated by commas, I can use ', ' as the split separator. Be aware that this only works if the arguments are separated consistently, i.e. always with a space after the comma (or always without one, if you split on ',' instead).
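If the spacing after the commas is not consistent (as in the question's fun_min_ssc line), one option is to split on the bare comma and strip whitespace from each piece. The following is only a minimal sketch of that idea; the names DEF_PATTERN and parse_def_line are hypothetical and not part of the original answer, and the pattern assumes a single-line def without default values that contain commas.

```python
import re

# Hypothetical helper, not part of the original answer.
# Matches "def name(args):" and captures the name and the raw argument text.
DEF_PATTERN = re.compile(r"^\s*def\s+(\w+)\s*\((.*)\)\s*:\s*$")


def parse_def_line(line):
    """Return (function_name, [arguments]) for a def line, or None otherwise."""
    match = DEF_PATTERN.match(line)
    if match is None:
        return None
    name, raw_args = match.groups()
    # Split on commas, then strip whitespace so "a, b" and "a,b" behave the same.
    arguments = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
    return name, arguments


print(parse_def_line("def fun_min_ssc(min_wage, tax_rate,calc_min_profit):"))
# ('fun_min_ssc', ['min_wage', 'tax_rate', 'calc_min_profit'])
```

Lines that are not function definitions (comments, decorators, strings) simply return None, so they can be skipped in a loop over the file.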
I have printed the desired output, but you can add these items to a list or whatever structure you prefer.
Here is my example code (without all the file input stuff):
import re
text = '''# Example of estimation
## Example of estimation
### Example of estimation
"Some calculation"
""" Note :
The data here is artificial.
Idea is to show how code will look like after estimation.
More information www.google.com
"""
@iterate_jit(nopython=True)
def cal_min_ssc(min_wage, tax_rate, min_profit):
    min_profit = min_wage * tax_rate + min_wage - (min_wage*2)
    return min_profit
'''
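lines = text.split('\n')
for line in lines:
    # Look for lines that start with "def" and end with ":"
    x = re.search(r"^def.*:$", line)
    if x is not None:
        # Split off "def " and then split on "(" to separate name and arguments
        values = x[0].split('def ')[1].split('(')
        function_name = values[0]
        # Drop the trailing "):" before splitting the arguments
        arguments = values[1][:-2].split(', ')
        print(f"Function Name: {function_name}")
        print(f"Arguments: {arguments}")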
OUTPUT:
Function Name: cal_min_ssc
Arguments: ['min_wage', 'tax_rate', 'min_profit']
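As another possible approach, if the text is valid Python source, the standard library ast module can parse it and report function names and arguments without any string splitting. This is a sketch under that assumption, not the method used above: it assumes a "+" was intended between tax_rate and min_wage (otherwise the body is a syntax error), and the functions dictionary is just an illustrative way to collect the results.

```python
import ast

# Shortened sample text from the question, assuming a "+" was intended
# between "tax_rate" and "min_wage" so that the body is valid Python.
text = '''# Example of estimation
"Some calculation"
""" Note :
The data here is artificial.
"""
@iterate_jit(nopython=True)
def cal_min_ssc(min_wage, tax_rate, min_profit):
    min_profit = min_wage * tax_rate + min_wage - (min_wage*2)
    return min_profit
'''

functions = {}
# ast.parse only parses the source; it never runs it, so iterate_jit
# does not need to be defined or importable.
for node in ast.walk(ast.parse(text)):
    if isinstance(node, ast.FunctionDef):
        # Collect positional argument names for each function definition.
        functions[node.name] = [arg.arg for arg in node.args.args]

print(functions)
# {'cal_min_ssc': ['min_wage', 'tax_rate', 'min_profit']}
```

Comments, bare strings and decorators are handled by the parser itself, so no manual filtering of lines starting with #, " or @ is needed. The trade-off is that ast.parse raises SyntaxError if any part of the text is not valid Python.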