I have this data:
Event|Time|Meta|Meet|Date
50 Y Free|22.30|U|IL NASA Winter Blast Off|Nov 30, 2018
100 Y Free|55.50|U|Greg's Super Splash|Jun 16, 2011
50 Y Breast|27.07|X|CCIW Swimming Championships|Feb 10, 2006
50 L Breast|33.70||1st Cyprus International Masters Swim Meet|Oct 22, 2016
100 Y Breast|58.97||CCIW Swimming Championships|Feb 10, 2006
100 L Breast|1:14.24||42nd Bulgarian Masters Championship|Sep 07, 2020
200 Y Breast|2:06.45|U|Greg's Super Splash|Jun 16, 2011
200 L Breast|2:47.48||1st Cyprus International Masters Swim Meet|Oct 22, 2016
200 Y IM|2:03.71||CCIW Swimming Championships|Feb 09, 2006
200 S IM|2:39.60||3rd International Masters Tournament - Rodopa Smolyan|Apr 02, 2022
200 L IM|2:39.29||1st Cyprus International Masters Swim Meet|Oct 22, 2016
and I want to split the data on "|" while making sure that each column is padded to the width of its longest element, like this:
Event        Time    Meta Meet                                                  Date
50 Y Free    22.30   U    IL NASA Winter Blast Off                              Nov 30, 2018
100 Y Free   55.50   U    Greg's Super Splash                                   Jun 16, 2011
50 Y Breast  27.07   X    CCIW Swimming Championships                           Feb 10, 2006
50 L Breast  33.70        1st Cyprus International Masters Swim Meet            Oct 22, 2016
100 Y Breast 58.97        CCIW Swimming Championships                           Feb 10, 2006
100 L Breast 1:14.24      42nd Bulgarian Masters Championship                   Sep 07, 2020
200 Y Breast 2:06.45 U    Greg's Super Splash                                   Jun 16, 2011
200 L Breast 2:47.48      1st Cyprus International Masters Swim Meet            Oct 22, 2016
200 Y IM     2:03.71      CCIW Swimming Championships                           Feb 09, 2006
200 S IM     2:39.60      3rd International Masters Tournament - Rodopa Smolyan Apr 02, 2022
200 L IM     2:39.29      1st Cyprus International Masters Swim Meet            Oct 22, 2016
So I've written a function in Python that splits the data into rows, splits each row into columns, finds the longest entry, and then writes each entry to the output file with as much padding as needed:
def beautify_data(swimmer_name):
    with open(f"assets\\data\\{swimmer_name}.txt", "r", encoding="UTF-8") as src:
        data = src.read()
        rows = data.split()
    with open(f"assets\\data\\{swimmer_name}.txt", "w", encoding="UTF-8") as dst:
        for row in rows:
            entry = ""
            max_length = 0
            cols = row.split("|")
            for col in cols:
                max_length = col.__len__() if col.__len__() > max_length else max_length
            for col in cols:
                entry += col.strip()
                for i in range(0, max_length):
                    entry += " "
            dst.write(f"{entry.strip()}\n")
    return f"{swimmer_name}.txt"
For some reason though, the output isn't as expected. I get this:
Event Time Meta Meet Date
50
Y
Free 22.30 U IL
NASA
Winter
Blast
Off Nov
30,
2018
100
Y
Free 55.50 U Greg's
Super
Splash Jun
16,
2011
50
Y
Breast 27.07 X CCIW
etc.
That is not what I wanted. It looks as if each column is being split on whitespace, which the code never asks for explicitly. Conceptually, the code makes sense to me. Does anyone know why I don't get the desired output? Thanks!
CodePudding user response:
The line rows = data.split() splits the entire file into a list of words, because str.split() with no argument splits on any run of whitespace, including the spaces inside your fields. You probably want rows = src.readlines() in place of the read() and split() lines, which reads the file line by line into a list.
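A quick way to see the difference is to try both calls on the first two lines of the data from the question. This is only a small illustration; the variable names `sample`, `words` and `lines` are made up for the example, and `splitlines()` stands in for what `readlines()` does on a file (minus the trailing newlines):

```python
sample = (
    "Event|Time|Meta|Meet|Date\n"
    "50 Y Free|22.30|U|IL NASA Winter Blast Off|Nov 30, 2018\n"
)

# No argument: split on any run of whitespace, so the spaces
# inside a field break the row apart.
words = sample.split()
print(words[:4])  # ['Event|Time|Meta|Meet|Date', '50', 'Y', 'Free|22.30|U|IL']

# Splitting on line boundaries keeps each row intact.
lines = sample.splitlines()
print(lines[1])   # 50 Y Free|22.30|U|IL NASA Winter Blast Off|Nov 30, 2018
```

Applied to the function from the question, the change looks like this: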
def beautify_data(swimmer_name):
    with open(f"assets\\data\\{swimmer_name}.txt", "r", encoding="UTF-8") as src:
        rows = src.readlines()
    with open(f"assets\\data\\{swimmer_name}.txt", "w", encoding="UTF-8") as dst:
        ...
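One more thing to watch for once the rows are read correctly: the loop in the question computes max_length from the cells of a single row, so every cell in that row gets the same padding and the columns still will not line up across rows. The width has to be computed per column over all rows. Below is a minimal sketch of that idea, not a drop-in replacement; the helper name `align_columns` and its `sep` parameter are hypothetical and chosen only for this example:

```python
def align_columns(lines, sep="|"):
    """Return the rows of `lines` with every column padded to its widest cell."""
    # Split every non-empty line into stripped cells.
    table = [[cell.strip() for cell in line.split(sep)]
             for line in lines if line.strip()]
    if not table:
        return []
    n_cols = max(len(row) for row in table)
    # Pad short rows so every row has the same number of cells.
    table = [row + [""] * (n_cols - len(row)) for row in table]
    # Width of each column = length of its longest cell over all rows.
    widths = [max(len(row[i]) for row in table) for i in range(n_cols)]
    # Left-justify each cell to its column width; drop trailing padding.
    return [
        " ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]


rows = [
    "Event|Time|Meta|Meet|Date",
    "50 Y Free|22.30|U|IL NASA Winter Blast Off|Nov 30, 2018",
    "50 L Breast|33.70||1st Cyprus International Masters Swim Meet|Oct 22, 2016",
]
for line in align_columns(rows):
    print(line)
```

Inside beautify_data, the list returned by readlines() could be passed to such a helper and the result written back with one dst.write(line + "\n") per row. Empty Meta cells are handled by the padding, since an empty string is simply left-justified to the column width.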