I need to return the next invoice_index_end as long as it comes after the first start index, but I can't find a clean way to do this. At the moment, if an invoice ends BEFORE the start of the target invoice, the code finds that earlier end and returns its index position. Is there an if or while construct I could use to loop through the index locations until one is greater than the start index?
Added context for what my initial list is: lst_file is quite literally just a list filled with the lines of a file.
lst_file = []
def assign_file_lines_to_list():
    for invoices in file:
        lst_file.append(invoices)
    return lst_file
assign_file_lines_to_list()
file.close()
#Opens the requested error log
error_lst = []
file_name = input("Enter the full name of the error log. \n")
error_file = open(file_name, "r")
#Puts error log file into a list word by word.
for eachWord in error_file:
    error_lst.extend(eachWord.split())
#Converts all string values to integers
def str_to_int():
    for i in range(0, len(error_lst)):
        try:
            error_lst[i] = int(error_lst[i])
        except:
            continue
    return error_lst
#Everything that is converted to an integer is added to a new list
int_lst = []
for eachword in str_to_int():
    if type(eachword) == int and eachword > 999:
        int_lst.append(eachword)
#And then turned back into a string.
def int_to_str():
    for i in range(0, len(int_lst)):
        try:
            int_lst[i] = str(int_lst[i])
        except:
            print("Error converting integer to a string!")
int_to_str()
#Standardizes all invoice to 10 digits
str_lst = [str(item).zfill(10) for item in int_lst]
print(str_lst)
#Finds the index of the invoice start and invoice end
new_lst = []
invoice_index_start = 0
invoice_index_end = 0
constant = '</Invoice>\n'
for i in str_lst: #integer
    invoice_index_start = lst_file.index('<InvoiceNumber>' + i + '</InvoiceNumber>\n')
    while str_lst.index(constant) > invoice_index_start:
        invoice_index_end = lst_file.index(constant)
    #if invoice_index_end <= invoice_index_start:
    #Copies everything between start and end index into new list to be deleted from original list later
    new_lst = lst_file[(invoice_index_start - 1):(invoice_index_end + 1)]
CodePudding user response:
It appears that you're trying to slice an iterable from the first occurrence of some value, up to but not including the next instance of the same value?
For example, if the content of your file was something like:
<someValue>a</someValue>
<InvoiceNumber>1</InvoiceNumber>
<someValue>b</someValue>
<InvoiceNumber>2</InvoiceNumber>
<someValue>c</someValue>
<InvoiceNumber>1</InvoiceNumber>
<someValue>d</someValue>
<InvoiceNumber>3</InvoiceNumber>
<someValue>d</someValue>
Here, you would be looking to capture:
<InvoiceNumber>1</InvoiceNumber>
<someValue>b</someValue>
<InvoiceNumber>2</InvoiceNumber>
<someValue>c</someValue>
If that's not correct, you should provide some sample data in your question and make it really clear what exactly you are trying to select.
Simplifying the question, here's an example showing a possible solution:
data = [1, 2, 5, 3, 8, 2, 1, 2, 7, 3]
# select everything from the first `2`, up to but not including the next
print(data[data.index(2):data.index(2, data.index(2) + 1)])  # [2, 5, 3, 8]
# same for `3`
print(data[data.index(3):data.index(3, data.index(3) + 1)])  # [3, 8, 2, 1, 2, 7]
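The same pattern can be wrapped in a small helper so that the first index is only looked up once. This is just a sketch: the name `slice_to_next` is made up for illustration, and returning everything up to the end of the list when there is no second occurrence is an assumption about what you would want.

```python
def slice_to_next(seq, value):
    """Return seq from the first `value` up to, but not including, the next one."""
    start = seq.index(value)  # raises ValueError if value is absent
    try:
        # The second argument tells index() where to begin searching.
        end = seq.index(value, start + 1)
    except ValueError:
        # Assumption: with no second occurrence, keep everything to the end.
        end = len(seq)
    return seq[start:end]


data = [1, 2, 5, 3, 8, 2, 1, 2, 7, 3]
print(slice_to_next(data, 2))  # [2, 5, 3, 8]
print(slice_to_next(data, 7))  # [7, 3]
```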
In your code, it's likely something like:
result = lst_file[
    lst_file.index(constant):lst_file.index(constant, lst_file.index(constant) + 1)
]
However, that's assuming your lst_file is a sequence that supports slicing, such as a list. We can't tell, because you didn't include its definition in your example code.
If it isn't (but your use of .index() suggests it probably is), this works:
import itertools
result = itertools.islice(
    lst_file,
    lst_file.index(constant),
    lst_file.index(constant, lst_file.index(constant) + 1)
)
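Keep in mind that `itertools.islice` returns a lazy iterator rather than a list, so it has to be consumed (for example with `list()`) before you can print or index it, and it can only be consumed once. A small check using the simplified `data` list from earlier:

```python
import itertools

data = [1, 2, 5, 3, 8, 2, 1, 2, 7, 3]
first = data.index(2)
lazy = itertools.islice(data, first, data.index(2, first + 1))

result = list(lazy)  # materialize the iterator into a list
print(result)        # [2, 5, 3, 8]
print(list(lazy))    # [] because the iterator is already exhausted
```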
CodePudding user response:
I ended up racking my brain and tried a couple more things, and this is what I ended up with:
start_indexes = []
for i in str_lst:
    invoice_index_start = lst_file.index('<InvoiceNumber>' + i + '</InvoiceNumber>\n')
    start_indexes.append(invoice_index_start)
end_indexes = []
constant = '</Invoice>\n'
for i in range(0,len(start_indexes)):
    invoice_index_end = lst_file.index(constant, start_indexes[i])
    end_indexes.append(invoice_index_end + 1)
I'm still appalled this worked.
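The reason it works is that `list.index(value, start)` only searches from position `start` onward, so an `</Invoice>` line that closes an earlier invoice is never considered. For the remaining step (copying each block out and deleting it from the original list), something along these lines might do. It is only a sketch: the sample lines, the `<Invoice>` opening tag sitting one line above the invoice number, and the names `spans` and `removed` are assumptions for illustration, not taken from the real file.

```python
# Hypothetical sample data standing in for the real file contents.
lst_file = [
    "<Invoice>\n", "<InvoiceNumber>0000001111</InvoiceNumber>\n", "<Total>5</Total>\n", "</Invoice>\n",
    "<Invoice>\n", "<InvoiceNumber>0000002222</InvoiceNumber>\n", "<Total>7</Total>\n", "</Invoice>\n",
    "<Invoice>\n", "<InvoiceNumber>0000003333</InvoiceNumber>\n", "<Total>9</Total>\n", "</Invoice>\n",
]
str_lst = ["0000002222", "0000003333"]
constant = "</Invoice>\n"

# Pair each start with the first end marker found at or after it.
spans = []
for number in str_lst:
    start = lst_file.index("<InvoiceNumber>" + number + "</InvoiceNumber>\n")
    end = lst_file.index(constant, start)
    # Assumption: the opening <Invoice> tag is the line just before the number.
    spans.append((start - 1, end + 1))

# Copy the matching blocks into a new list.
removed = []
for begin, stop in spans:
    removed.extend(lst_file[begin:stop])

# Delete from the back so earlier indexes stay valid while deleting.
for begin, stop in sorted(spans, reverse=True):
    del lst_file[begin:stop]

print(len(removed))   # 8
print(len(lst_file))  # 4
```

Deleting in reverse order matters here: removing an earlier block first would shift every later index and the stored spans would point at the wrong lines.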