I want to return a string listing the new names of all the photos, in the same order as in the original string. However, my final_string
is currently in a different order.
def fetch_date_time(photo):
    return photo.split(", ")[2]

def prefixed_number(n, max_n):
    len_n = len(str(n))
    len_max_n = len(str(max_n))
    prefix = "".join(["0" for i in range(len_max_n - len_n)]) + str(n)
    return prefix

def solution(S):
    list_of_pics = S.split("\n")
    city_dict = {}
    for pic in list_of_pics:
        city = pic.split(", ")[1]
        if city in city_dict:
            city_dict[city].append(pic)
        else:
            city_dict[city] = [pic]
    final_string = ""
    for city_group in city_dict:
        city_dict[city_group].sort(key=fetch_date_time)
        for ind, photo in enumerate(city_dict[city_group]):
            city = photo.split(", ")[1]
            ext = photo.split(", ")[0].split(".")[-1]
            max_len = len(city_dict[city_group])
            number = prefixed_number(ind + 1, max_len)
            city_dict[city_group][ind] = city + number + "." + ext + "\n"
        final_string += "".join(city_dict[city_group])
    return final_string
string = """photo.jpg, Warsaw, 2013-09-05 14:08:15
john.png, London, 2015-06-20 15:13:22
myFriends.png, Warsaw, 2013-09-05 14:07:13
Eiffel.jpg, Paris, 2015-07-23 08:03:02
pisatower.jpg, Paris, 2015-07-22 23:59:59
BOB.jpg, London, 2015-08-05 00:02:03
notredame.png, Paris, 2015-09-01 12:00:00
me.jpg, Warsaw, 2013-09-06 15:40:22
a.png, Warsaw, 2016-02-13 13:33:50
b.jpg, Warsaw, 2016-01-02 15:12:22
c.jpg, Warsaw, 2016-01-02 14:34:30
d.jpg, Warsaw, 2016-01-02 15:15:01
e.png, Warsaw, 2016-01-02 09:49:09
f.png, Warsaw, 2016-01-02 10:55:32
g.jpg, Warsaw, 2016-02-29 22:13:11"""
print(solution(string))
My current output:
Warsaw01.png
Warsaw02.jpg
Warsaw03.jpg
Warsaw04.png
Warsaw05.png
Warsaw06.jpg
Warsaw07.jpg
Warsaw08.jpg
Warsaw09.png
Warsaw10.jpg
London1.png
London2.jpg
Paris1.jpg
Paris2.jpg
Paris3.png
Expected output:
Warsaw02.jpg
London1.png
Warsaw01.png
Paris2.jpg
Paris1.jpg
London2.jpg
Paris3.png
Warsaw03.jpg
Warsaw09.png
Warsaw07.jpg
Warsaw06.jpg
Warsaw08.jpg
Warsaw04.png
Warsaw05.png
Warsaw10.jpg
CodePudding user response:
The code below may help.
string = """photo.jpg, Warsaw, 2013-09-05 14:08:15
john.png, London, 2015-06-20 15:13:22
myFriends.png, Warsaw, 2013-09-05 14:07:13
Eiffel.jpg, Paris, 2015-07-23 08:03:02
pisatower.jpg, Paris, 2015-07-22 23:59:59
BOB.jpg, London, 2015-08-05 00:02:03
notredame.png, Paris, 2015-09-01 12:00:00
me.jpg, Warsaw, 2013-09-06 15:40:22
a.png, Warsaw, 2016-02-13 13:33:50
b.jpg, Warsaw, 2016-01-02 15:12:22
c.jpg, Warsaw, 2016-01-02 14:34:30
d.jpg, Warsaw, 2016-01-02 15:15:01
e.png, Warsaw, 2016-01-02 09:49:09
f.png, Warsaw, 2016-01-02 10:55:32
g.jpg, Warsaw, 2016-02-29 22:13:11"""
class row:
    def __init__(self, image, city, date):
        self.image = image
        self.city = city
        self.date = date

def read_rows(text):
    rows = []
    for line in text.split('\n'):
        image, city, date = line.split(', ')
        rows.append(row(image, city, date))
    return rows

def rename_city(rows):
    # rows arrive sorted by date, so the counter is the photo's position within its city
    known_cities = {}
    for row in rows:
        if row.city in known_cities:
            known_cities[row.city] += 1
            row.city = "%s%02d" % (row.city, known_cities[row.city])
        else:
            known_cities[row.city] = 1
            row.city += "01"

def get_citynames(rows):
    cities = []
    for row in rows:
        cities.append(row.city)
    return cities

def solution(input):
    rows = read_rows(input)
    sorted_rows = sorted(rows, key=lambda x: x.date)
    rename_city(sorted_rows)
    # the original list holds the same (now renamed) objects, still in input order
    return get_citynames(rows)

print("\n".join(solution(string)))
Output
Warsaw02
London01
Warsaw01
Paris02
Paris01
London02
Paris03
Warsaw03
Warsaw09
Warsaw07
Warsaw06
Warsaw08
Warsaw04
Warsaw05
Warsaw10
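The input order seems to survive here because sorted() returns a new list that holds the same objects, so renaming them through the sorted list is visible through the original list as well. A minimal sketch of that idea, using a hypothetical Item class and made-up data that are not part of the code above:

```python
class Item:
    """Hypothetical stand-in for the row class; not part of the answer's code."""

    def __init__(self, name, key):
        self.name = name
        self.key = key


items = [Item("b", 2), Item("a", 1), Item("c", 3)]

# sorted() builds a new list, but it contains the very same objects.
by_key = sorted(items, key=lambda item: item.key)

# Number the objects in sorted order...
for position, item in enumerate(by_key, 1):
    item.name += str(position)

# ...and read them back through the untouched original list.
names = [item.name for item in items]
print(names)  # ['b2', 'a1', 'c3']
```

Note that the output above has no file extensions and always pads to two digits, so it does not yet match the expected output from the question.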
CodePudding user response:
To solve this problem you need to:
- Group your data by city;
- Sort the entries belonging to the same city by date;
- Generate new filenames and restore the original order.
First of all, we need to split each line of your string by ", ":
lines = [s.split(", ") for s in string.splitlines()]
To group our lines by city we can use two different methods:

1.1. Make a dictionary where the city is a unique key and the value is a list of all lines with this city:
grouped_photos = {}
for line in lines:
    city = line[1]
    if city in grouped_photos:
        grouped_photos[city].append(line)
    else:
        grouped_photos[city] = [line]
Here you can notice that there is no point in building lines first if we proceed with this method, as it costs one useless iteration; we can iterate over string.splitlines() directly:
grouped_photos = {}
for line in string.splitlines():
    splitted = line.split(", ")
    city = splitted[1]
    if city in grouped_photos:
        grouped_photos[city].append(splitted)
    else:
        grouped_photos[city] = [splitted]
We can also shorten the code a bit using defaultdict:
from collections import defaultdict
...
grouped_photos = defaultdict(list)
for line in string.splitlines():
    splitted = line.split(", ")
    grouped_photos[splitted[1]].append(splitted)
1.2. Use groupby(). The main difference from the previous method is that groupby() requires sorted data.
from itertools import groupby
from operator import itemgetter
...
lines.sort(key=itemgetter(1))
grouped_photos = {c: list(p) for c, p in groupby(lines, itemgetter(1))}
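To illustrate why groupby() needs sorted input, here is a small sketch on a made-up list of city names (the cities list is only for illustration). groupby() starts a new group every time the key changes, so an unsorted city ends up in several groups:

```python
from itertools import groupby

cities = ["Warsaw", "London", "Warsaw"]

# Unsorted: "Warsaw" is split into two separate groups.
unsorted_groups = [(city, len(list(group))) for city, group in groupby(cities)]
print(unsorted_groups)  # [('Warsaw', 1), ('London', 1), ('Warsaw', 1)]

# Sorted: every city forms exactly one group.
sorted_groups = [(city, len(list(group))) for city, group in groupby(sorted(cities))]
print(sorted_groups)  # [('London', 1), ('Warsaw', 2)]
```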
I've used the dict comprehension only as temporary storage for what groupby() returns; we won't need it later.

Now we need to sort every list with the same city by date. The common way to compare dates stored in strings (which is necessary for sorting) is to initialize a datetime object, using some format with datetime.strptime() or with datetime.fromisoformat() if the string matches the standard format.
from datetime import datetime
...
grouped_photos["Warsaw"].sort(key=lambda x: datetime.fromisoformat(x[2]))
But with the format you have we can also exploit the lexicographical order which Python uses to compare sequences (a string is a sequence too). It means that we don't need to convert the date strings; we can just leave them as they are.
grouped_photos["Warsaw"].sort(key=itemgetter(2))
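As a quick sanity check of the claim about lexicographical order, the sketch below sorts a few timestamps taken from the sample data both as plain strings and as datetime objects. It relies on the fields being zero-padded, fixed-width and ordered from most to least significant, which holds for this format but not for arbitrary date strings:

```python
from datetime import datetime

stamps = [
    "2016-01-02 15:12:22",
    "2013-09-05 14:08:15",
    "2016-01-02 09:49:09",
    "2015-07-22 23:59:59",
]

as_text = sorted(stamps)                              # plain string comparison
as_dates = sorted(stamps, key=datetime.fromisoformat)  # real date comparison

print(as_text == as_dates)  # True
```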
So, basically we need to sort every value in grouped_photos:
for value in grouped_photos.values():
    value.sort(key=itemgetter(2))
To generate new filenames and put them in the original order, we first need to store the original list index. For this we should modify the initial data split to also include the index of the line:
lines = [s.split(", ") + [i] for i, s in enumerate(string.splitlines())]
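A small sketch of why carrying the index helps, on made-up two-line input (the text variable is hypothetical): the index stays attached to its line through sorting, so each line still knows where it came from.

```python
text = "b.jpg, Paris, 2015-02-01 10:00:00\na.jpg, Paris, 2014-02-01 10:00:00"

# Each entry becomes [name, city, date, original_index].
lines = [s.split(", ") + [i] for i, s in enumerate(text.splitlines())]
lines.sort(key=lambda line: line[2])  # sort by the date string

# After sorting, the last element still holds the original position.
original_positions = [line[3] for line in lines]
print(original_positions)  # [1, 0]
```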
The size of our result list will be exactly the same as the source, so to avoid sorting again we can initialize the result list as a list of None values with the same length as lines, then iterate over grouped_photos and save each generated filename at its initial index.

To generate a filename we need the name of the city, the index in the sorted list and the original file extension. To extract the file extension from a filename we can use splitext() or simply call str.rsplit():
from os.path import splitext
ext = splitext("pisatower.jpg")[1]
# OR
ext = "." + "pisatower.jpg".rsplit(".", 1)[1]
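The two approaches agree on ordinary names but differ on names without a dot. A short sketch with illustrative filenames (archive.tar.gz and README are made up, not from the question):

```python
from os.path import splitext

names = ["pisatower.jpg", "archive.tar.gz"]

# Both approaches keep only the last suffix.
via_splitext = [splitext(name)[1] for name in names]
via_rsplit = ["." + name.rsplit(".", 1)[1] for name in names]
print(via_splitext)  # ['.jpg', '.gz']
print(via_rsplit)    # ['.jpg', '.gz']

# Without a dot, splitext() returns an empty extension,
# while indexing the rsplit() result raises IndexError.
no_ext = splitext("README")[1]
print(repr(no_ext))  # ''
try:
    "README".rsplit(".", 1)[1]
    rsplit_failed = False
except IndexError:
    rsplit_failed = True
print(rsplit_failed)  # True
```

For the data in the question every name has an extension, so either call works.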
Let's restore the original order and set the new filenames:
from os.path import splitext
...
result = [None] * len(lines)
for photos in grouped_photos.values():
    for i, (name, city, _, index) in enumerate(photos, 1):
        result[index] = f"{city}{i}{splitext(name)[1]}"
The only thing left is zero-padding of the index. The length of each list is its maximum index, so the padding width is the length of the string form of that list's length. There are plenty of ways to pad a number; I'll use the extended format syntax in this example:
for photos in grouped_photos.values():
    padding = len(str(len(photos)))
    for i, (name, city, _, index) in enumerate(photos, 1):
        result[index] = f"{city}{i:0{padding}}{splitext(name)[1]}"
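To make the width calculation concrete, the sketch below pads the number 1 for groups of 3, 10 and 100 photos (arbitrary sizes chosen for illustration), using both the nested format field and str.zfill():

```python
padded = []
for count in (3, 10, 100):
    padding = len(str(count))  # 1, 2 and 3 digits respectively
    # The nested {padding} is filled in first and becomes the field width.
    padded.append((f"{1:0{padding}}", str(1).zfill(padding)))

print(padded)  # [('1', '1'), ('01', '01'), ('001', '001')]
```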
Now we need to put it all together. Using common sense and basic knowledge about loops, we can combine the code above with certain optimizations:
from operator import itemgetter
from itertools import groupby
from os.path import splitext
string = """photo.jpg, Warsaw, 2013-09-05 14:08:15
john.png, London, 2015-06-20 15:13:22
myFriends.png, Warsaw, 2013-09-05 14:07:13
Eiffel.jpg, Paris, 2015-07-23 08:03:02
pisatower.jpg, Paris, 2015-07-22 23:59:59
BOB.jpg, London, 2015-08-05 00:02:03
notredame.png, Paris, 2015-09-01 12:00:00
me.jpg, Warsaw, 2013-09-06 15:40:22
a.png, Warsaw, 2016-02-13 13:33:50
b.jpg, Warsaw, 2016-01-02 15:12:22
c.jpg, Warsaw, 2016-01-02 14:34:30
d.jpg, Warsaw, 2016-01-02 15:15:01
e.png, Warsaw, 2016-01-02 09:49:09
f.png, Warsaw, 2016-01-02 10:55:32
g.jpg, Warsaw, 2016-02-29 22:13:11"""
lines = [s.split(", ") + [i] for i, s in enumerate(string.splitlines())]
lines.sort(key=itemgetter(1, 2))
result = [None] * len(lines)
for city, [*photos] in groupby(lines, itemgetter(1)):
    padding = len(str(len(photos)))
    for i, (name, _, _, index) in enumerate(photos, 1):
        result[index] = f"{city}{i:0{padding}}{splitext(name)[1]}"
I've noticed that you haven't used any imports in your code; maybe it's some odd requirement, so here is the same code without imports and syntactic sugar:
string = """photo.jpg, Warsaw, 2013-09-05 14:08:15
john.png, London, 2015-06-20 15:13:22
myFriends.png, Warsaw, 2013-09-05 14:07:13
Eiffel.jpg, Paris, 2015-07-23 08:03:02
pisatower.jpg, Paris, 2015-07-22 23:59:59
BOB.jpg, London, 2015-08-05 00:02:03
notredame.png, Paris, 2015-09-01 12:00:00
me.jpg, Warsaw, 2013-09-06 15:40:22
a.png, Warsaw, 2016-02-13 13:33:50
b.jpg, Warsaw, 2016-01-02 15:12:22
c.jpg, Warsaw, 2016-01-02 14:34:30
d.jpg, Warsaw, 2016-01-02 15:15:01
e.png, Warsaw, 2016-01-02 09:49:09
f.png, Warsaw, 2016-01-02 10:55:32
g.jpg, Warsaw, 2016-02-29 22:13:11"""
grouped_photos = {}
for i, line in enumerate(string.splitlines()):
    splitted = line.split(", ") + [i]
    city = splitted[1]
    if city in grouped_photos:
        grouped_photos[city].append(splitted)
    else:
        grouped_photos[city] = [splitted]
result = [None] * (i + 1)
for photos in grouped_photos.values():
    photos.sort(key=lambda x: x[2])
    padding = len(str(len(photos)))
    for i, (name, city, _, index) in enumerate(photos, 1):
        result[index] = city + str(i).zfill(padding) + "." + name.rsplit(".", 1)[1]
Add print(*result, sep="\n") to either version to get the output in the console.
Output:
Warsaw02.jpg
London1.png
Warsaw01.png
Paris2.jpg
Paris1.jpg
London2.jpg
Paris3.png
Warsaw03.jpg
Warsaw09.png
Warsaw07.jpg
Warsaw06.jpg
Warsaw08.jpg
Warsaw04.png
Warsaw05.png
Warsaw10.jpg
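Since the question asks for a function that returns a single string, the no-import version could be wrapped roughly as follows. This is only a sketch under my own assumptions: the (date, index, name) tuple layout and the use of dict.setdefault() are choices made here, not something taken from the question.

```python
def solution(S):
    lines = S.splitlines()
    grouped = {}
    for index, line in enumerate(lines):
        name, city, stamp = line.split(", ")
        # Date first, so a plain sort orders each city's photos by date.
        grouped.setdefault(city, []).append((stamp, index, name))

    result = [None] * len(lines)
    for city, photos in grouped.items():
        photos.sort()
        padding = len(str(len(photos)))
        for number, (_, index, name) in enumerate(photos, 1):
            ext = name.rsplit(".", 1)[1]
            # Write the new name back at the photo's original position.
            result[index] = f"{city}{number:0{padding}}.{ext}"
    return "\n".join(result)


sample = (
    "photo.jpg, Warsaw, 2013-09-05 14:08:15\n"
    "john.png, London, 2015-06-20 15:13:22\n"
    "myFriends.png, Warsaw, 2013-09-05 14:07:13"
)
print(solution(sample))
# Warsaw2.jpg
# London1.png
# Warsaw1.png
```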