Hi, I have the data below, but it is not indented. Only the "Total" keyword lets me find the right nodes and build the tree structure. Input:
Current Assets
Cash
Checking 583961
Savings 224600
Petty Cash 89840
Total Cash 898402
Accounts Receivable 3593607
Work in Process 589791
Other Current Assets
Prepaid Rent 164593
Prepaid Liability Insurance 109728
Total Other Current Assets 274321
Total Current Assets 274321
I am looking for the output below:
{
    "Current Assets": {
        "Cash": {
            "Checking": 583961,
            "Savings": 224600,
            "Petty Cash": 89840,
            "Total Cash": 898402
        },
        "Accounts Receivable": 3593607,
        "Work in Process": 589791,
        "Other Current Assets": {
            "Prepaid Rent": 164593,
            "Prepaid Liability Insurance": 109728,
            "Total Other Current Assets": 274321
        },
        "Total Current Assets": 5356121
    }
}
I tried recursion and a node-based approach, but nothing worked. It would be great if someone could help me with this; I am trying to do it in Python.
Rules:
- A line that ends with a number is a leaf and has no children. For example, "Accounts Receivable" and "Work in Process" both end with a numeric value, so neither has children.
- "Work in Process" is therefore not a sub-item of "Accounts Receivable"; it is an item of "Current Assets" only.
- A line with no numeric value at the end, such as "Cash", has one or more children.
- Such a group ends at its total line: "Cash" ends at "Total Cash", which has a numeric value.
CodePudding user response:
You can do this with a recursive function or just use a stack to keep track of the nesting. The basic rule is:
- No number: increase nesting
- Starts with "Total": decrease nesting.
With a stack, it might look like:
import re

s = '''Current Assets
Cash
Checking 583961
Savings 224600
Petty Cash 89840
Total Cash 898402
Accounts Receivable 3593607
Work in Process 589791
Other Current Assets
Prepaid Rent 164593
Prepaid Liability Insurance 109728
Total Other Current Assets 274321
Total Current Assets 274321'''

def nest(items):
    res = {}
    stack = [res]  # stack[-1] is the dict currently being filled
    for item in items:
        # Split the line into a label and the number that follows it
        components = re.findall(r'(^.*?) (\d+)', item)
        if not components:  # no numbers: open a new nested dict
            cur = {}
            stack[-1][item.strip()] = cur
            stack.append(cur)
        else:
            label, nums = components[0]
            stack[-1][label.strip()] = int(nums)
            if label.startswith("Total"):  # end of subdict
                stack.pop()
    return res

nest(s.split('\n'))
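This will return the following (note that "Total Current Assets" is 274321 here because that is the value in the input as posted):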
{
    'Current Assets': {
        'Cash': {
            'Checking': 583961,
            'Savings': 224600,
            'Petty Cash': 89840,
            'Total Cash': 898402
        },
        'Accounts Receivable': 3593607,
        'Work in Process': 589791,
        'Other Current Assets': {
            'Prepaid Rent': 164593,
            'Prepaid Liability Insurance': 109728,
            'Total Other Current Assets': 274321
        },
        'Total Current Assets': 274321
    }
}
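For comparison, the recursive variant mentioned at the start might be sketched as below. This is only one possible way to write it, not a definitive implementation: the names `nest_recursive`, `parse_group` and `LINE_RE` are made up for illustration, and it assumes every value is a plain integer at the end of the line and that every group is closed by a line starting with "Total".

```python
import re

# Assumed line shape: a label, whitespace, then a trailing integer.
LINE_RE = re.compile(r'^(.*?)\s+(\d+)\s*$')

def nest_recursive(lines):
    """Build the nested dict by recursing once per group header."""
    it = iter(lines)  # shared iterator, so each call resumes where the last stopped

    def parse_group():
        group = {}
        for line in it:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            m = LINE_RE.match(line)
            if m is None:
                # No trailing number: this line opens a sub-group.
                group[line] = parse_group()
            else:
                label, value = m.group(1), int(m.group(2))
                group[label] = value
                if label.startswith("Total"):
                    # A "Total ..." line closes the current group.
                    return group
        return group  # input exhausted (top level)

    return parse_group()

sample = '''Assets
Cash
Checking 10
Total Cash 10
Receivable 5
Total Assets 15'''

tree = nest_recursive(sample.split('\n'))
print(tree)
```

The stack version and this one follow the same two rules; the recursion simply lets the call stack play the role of the explicit `stack` list.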