[{'id': 2, 'Registered Address': 'Line 1: 1 Any Street Line 2: Any locale City: Any City Region / State: Any Region Postcode / Zip code: BA2 2SA Country: GB Jurisdiction: Any Jurisdiction'}]
I have the above read into a dataframe and that is the output so far. The issue is I need to break out the individual elements - due to names of places etc the values may or may not have spaces in them - looking at the above my keys are Line 1, Line 2, City, Region / State, Postcode / Zip, Country, Jurisdiction.
Output required for the "Registered Address"-'key'is the keys and values
"Line 1": "1 Any Street"
"Line 2": "Any locale"
"City": "Any City"
"Region / State": "Any Region"
"Postcode / Zip code": "BA2 2SA"
"Country": "GB"
"Jurisdiction": "Any Jurisdiction"
Just struggling to find a way to get to the end result.I have tried to pop out and use urllib.prse but fell short - is anypone able to point me in the best direction please?
CodePudding user response:
Tried to write a code that generalizes your question, but there were some limitations, regarding your data format. Anyway I would do this:
def address_spliter(my_data, my_keys):
address_data = my_data[0]['Registered Address']
key_address = {}
for i,k in enumerate(keys):
print(k)
if k == 'Jurisdiction:':
key_address[k] = address_data.split('Jurisdiction:')[1].removeprefix(' ').removesuffix(' ')
else:
key_address[k] = address_data.split(k)[1].split(keys[i 1])[0].removeprefix(' ').removesuffix(' ')
return key_address
were you can call this function like this:
my_data = [{'id': 2, 'Registered Address': 'Line 1: 1 Any Street Line 2: Any locale City: Any City Region / State: Any Region Postcode / Zip code: BA2 2SA Country: GB Jurisdiction: Any Jurisdiction'}]
and
my_keys = ['Line 1:','Line 2:','City:', 'Region / State:', 'Postcode / Zip code:', 'Country:', 'Jurisdiction']
As you can see It'll work if only the sequence of keys is not changed. But anyway, you can work around this idea and change it base on your problem accordingly if it doesn't go as expected.