I have a list of addresses in a text document, and I want to add a comma after address line 1 and then save the result to a new text document.
For example, my list of addresses is:
Address 1404 756 48 Stockholm
Address 9 756 52 Stockholm
Address 53 B lgh 1001 619 34 Stockholm
Address 72 B lgh 1101 619 30 Stockholm
Address 52 A 619 33 Stockholm
This is the output I want:
Address 1404, 756 48 Stockholm
Address 9, 756 52 Stockholm
Address 53 B lgh 1001, 619 34 Stockholm
Address 72 B lgh 1101, 619 30 Stockholm
Address 52 A, 619 33 Stockholm
I can't figure out how to reliably place the comma in the right place (before the zip code), since the number of space-separated parts isn't the same for all addresses. The zip code consists of 5 digits, for instance 756 48.
CodePudding user response:
With a regular expression:
import re
re.sub(r'\s(\d{3}\s\d{2}\s.*)$', ', \\1', 'Address 53 B lgh 1001 619 34\xa0Stockholm')
# 'Address 53 B lgh 1001, 619 34\xa0Stockholm'
(The \xa0 is part of your strings. It is a non-breaking space and will print as a space.)
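To apply the substitution above to every address, it can be wrapped in a small helper. This is only a minimal sketch: the names POSTAL_TAIL and add_comma are made up for illustration, and it assumes every line ends with a postal code of the form NNN NN followed by the city.

```python
import re

# Hypothetical helper; assumes "<address line 1> NNN NN <city>" on each line.
POSTAL_TAIL = re.compile(r'\s(\d{3}\s\d{2}\s.*)$')

def add_comma(address):
    """Insert ', ' before the postal code; non-matching lines are returned unchanged."""
    return POSTAL_TAIL.sub(r', \1', address, count=1)

addresses = [
    'Address 1404 756 48 Stockholm',
    'Address 53 B lgh 1001 619 34\xa0Stockholm',  # \s also matches \xa0
]
for address in addresses:
    print(add_comma(address))
```

Lines without a recognizable postal code pass through untouched, so the helper should be safe to run over a file that also contains blank lines or headers.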
CodePudding user response:
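You can use str.rsplit() to separate the postal code and city from address line 1:
address = 'Address 72 B lgh 1101 619 30 Stockholm'
# Use maxsplit=3: the last three parts are the postal code (two parts) and the city
first_line, *second_line = address.rsplit(' ', 3)
# Double quotes outside, so the nested single quotes also work before Python 3.12
new_address = f"{first_line}, {' '.join(second_line)}"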
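The question also asks for the result to be saved to a new text document. The right-split approach above could be extended to do that roughly as follows. This is a sketch under assumptions: the function name add_commas_to_file and the file names are hypothetical, and the city is assumed to be a single word. Passing None as the separator makes rsplit split on any whitespace, which also covers non-breaking spaces.

```python
import os
import tempfile

def add_commas_to_file(src_path, dst_path):
    """Copy src_path to dst_path, adding a comma before the postal code."""
    with open(src_path, encoding='utf-8') as src, \
         open(dst_path, 'w', encoding='utf-8') as dst:
        for line in src:
            # sep=None splits on any run of whitespace, including \xa0
            parts = line.rsplit(None, 3)
            if len(parts) < 4:
                # Too few parts for street, postal code and city: copy as-is
                dst.write(line)
                continue
            first_line, *second_line = parts
            dst.write(f"{first_line}, {' '.join(second_line)}\n")

# Demo in a temporary directory so the sketch runs without existing files
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'addresses.txt')  # hypothetical file names
    dst_path = os.path.join(tmp, 'addresses_with_commas.txt')
    with open(src_path, 'w', encoding='utf-8') as f:
        f.write('Address 9 756 52 Stockholm\nAddress 52 A 619 33 Stockholm\n')
    add_commas_to_file(src_path, dst_path)
    with open(dst_path, encoding='utf-8') as f:
        result = f.read()
print(result)
```

Processing the file line by line keeps memory use flat for long lists, and the original file is never modified.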
CodePudding user response:
s='''Address 1404 756 48 Stockholm
Address 9 756 52 Stockholm
Address 53 B lgh 1001 619 34 Stockholm
Address 72 B lgh 1101 619 30 Stockholm
Address 52 A 619 33 Stockholm'''
lis = s.split('\n')
lis1 = [i.split() for i in lis]
for i in lis1:
    # The last three parts are the postal code (two parts) and the city,
    # so the part just before them is the end of address line 1.
    i[-4] += ','
lis1 = [' '.join(i) for i in lis1]
lis1 = '\n'.join(lis1)
print(lis1)
CodePudding user response:
If your postal code and city are limited to the format shown here (three digits, space, two digits, space, one word), then you can do this with a regular expression. It just needs to look for the postal code and city pattern, and capture both it and whatever comes before it, adding a comma in between:
>>> import re
>>> address_list = ['Address 1404 756 48 Stockholm', 'Address 9 756 52 Stockholm',
...     'Address 53 B lgh 1001 619 34 Stockholm', 'Address 72 B lgh 1101 619 30 Stockholm',
...     'Address 52 A 619 33 Stockholm']
>>> # Define regex pattern as space, 3 digits, space, 2 digits, space, word.
>>> # Parens capture both that pattern and everything before it (.*)
>>> p = re.compile(r"(.*)( \d{3} \d{2} \w+)")
>>> # Create a new list, replacing each item with group 1, comma, group 2
>>> new_addresses = [p.sub(r"\1,\2", a) for a in address_list]
>>> for a in new_addresses: print(a)
...
Address 1404, 756 48 Stockholm
Address 9, 756 52 Stockholm
Address 53 B lgh 1001, 619 34 Stockholm
Address 72 B lgh 1101, 619 30 Stockholm
Address 52 A, 619 33 Stockholm
If the patterns can differ from the examples you gave here (if a city can have two words, etc.), you can still do this; you would just need to tweak the regex to match those patterns too.
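As one possible tweak for multi-word cities, the city part could be matched as "everything after the postal code that contains no digits". This is a sketch, not a general address parser: the pattern, the add_comma name, and the sample address are all made up for illustration, and it assumes the line ends with the city.

```python
import re

# Assumption: the city contains no digits, so \D+ can span several words.
# \s is used instead of a literal space so non-breaking spaces also match.
p = re.compile(r"(.*)\s(\d{3}\s\d{2}\s\D+)$")

def add_comma(address):
    """Put ', ' between address line 1 and the postal code plus city."""
    return p.sub(r"\1, \2", address)

print(add_comma('Address 9 194 30 Upplands Väsby'))
print(add_comma('Address 53 B lgh 1001 619 34 Stockholm'))
```

Anchoring the pattern with $ keeps a street number such as 1001 from being mistaken for part of the postal code, because the postal code must be the last group of digits on the line.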
CodePudding user response:
Try this:
with open('sample.txt') as file:
    df = file.read()
for line in df.split('\n'):
    split_line = line.split()
    split_line.insert(-3, ',')
    new_line = " ".join(elem for elem in split_line).strip()
    print(new_line)
Output:
Address 1404 , 756 48 Stockholm
Address 9 , 756 52 Stockholm
Address 53 B lgh 1001 , 619 34 Stockholm
Address 72 B lgh 1101 , 619 30 Stockholm
Address 52 A , 619 33 Stockholm
The extra space before the commas can be addressed further down, for example with new_line.replace(' ,', ','). Note that strip() on its own only trims the ends of the line.