How can I convert text to columns with Python?

I have this data:

Event|Time|Meta|Meet|Date
50 Y Free|22.30|U|IL NASA Winter Blast Off|Nov 30, 2018
100 Y Free|55.50|U|Greg's Super Splash|Jun 16, 2011
50 Y Breast|27.07|X|CCIW Swimming Championships|Feb 10, 2006
50 L Breast|33.70||1st Cyprus International Masters Swim Meet|Oct 22, 2016
100 Y Breast|58.97||CCIW Swimming Championships|Feb 10, 2006
100 L Breast|1:14.24||42nd Bulgarian Masters Championship|Sep 07, 2020
200 Y Breast|2:06.45|U|Greg's Super Splash|Jun 16, 2011
200 L Breast|2:47.48||1st Cyprus International Masters Swim Meet|Oct 22, 2016
200 Y IM|2:03.71||CCIW Swimming Championships|Feb 09, 2006
200 S IM|2:39.60||3rd International Masters Tournament - Rodopa Smolyan|Apr 02, 2022
200 L IM|2:39.29||1st Cyprus International Masters Swim Meet|Oct 22, 2016

and I want to split it on "|" so that each column is padded to the width of its longest element, like this:

 Event         Time     Meta  Meet                                                   Date         
 50 Y Free     22.30    U     IL NASA Winter Blast Off                               Nov 30, 2018 
 100 Y Free    55.50    U     Greg's Super Splash                                    Jun 16, 2011 
 50 Y Breast   27.07    X     CCIW Swimming Championships                            Feb 10, 2006 
 50 L Breast   33.70          1st Cyprus International Masters Swim Meet             Oct 22, 2016 
 100 Y Breast  58.97          CCIW Swimming Championships                            Feb 10, 2006 
 100 L Breast  1:14.24        42nd Bulgarian Masters Championship                    Sep 07, 2020 
 200 Y Breast  2:06.45  U     Greg's Super Splash                                    Jun 16, 2011 
 200 L Breast  2:47.48        1st Cyprus International Masters Swim Meet             Oct 22, 2016 
 200 Y IM      2:03.71        CCIW Swimming Championships                            Feb 09, 2006 
 200 S IM      2:39.60        3rd International Masters Tournament - Rodopa Smolyan  Apr 02, 2022 
 200 L IM      2:39.29        1st Cyprus International Masters Swim Meet             Oct 22, 2016

So I wrote a function in Python that splits the data into rows, splits each row into columns, finds the longest entry, and then writes each entry to the output file with as much spacing as needed:

def beautify_data(swimmer_name):
    with open(f"assets\\data\\{swimmer_name}.txt", "r", encoding="UTF-8") as src:
        data = src.read()

    rows = data.split()
    
    with open(f"assets\\data\\{swimmer_name}.txt", "w", encoding="UTF-8") as dst:
        for row in rows:
            entry = ""
            max_length = 0

            cols = row.split("|")

            for col in cols:
                max_length = col.__len__() if col.__len__() > max_length else max_length

            for col in cols:
                entry += col.strip()

                for i in range(0, max_length):
                    entry += " "

            dst.write(f"{entry.strip()}\n")

    return f"{swimmer_name}.txt"

For some reason though, the output isn't as expected. I get this:

Event     Time     Meta     Meet     Date
50
Y
Free     22.30     U     IL
NASA
Winter
Blast
Off   Nov
30,
2018
100
Y
Free      55.50      U      Greg's
Super
Splash      Jun
16,
2011
50
Y
Breast      27.07      X      CCIW

etc.

This is not what I wanted. It looks as though each column is being split on whitespace, which I never ask for explicitly in the code. Conceptually, the code makes sense to me. Does anyone know why I don't get the desired output? Thanks!

CodePudding user response:

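rows = data.split() splits the entire file into a list of words, because str.split() with no argument splits on any run of whitespace. That includes the spaces inside "50 Y Free" as well as the newlines between rows. You probably want rows = src.readlines() in place of the read() and split() calls, which reads the file into a list of lines. Note that each line returned by readlines() keeps its trailing newline.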
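
To make the difference concrete, here is a small sketch on a hypothetical two-line sample in the same format (the `sample` string is made up for illustration and is not read from the swimmer file):

```python
# Hypothetical two-line sample in the same pipe-delimited format.
sample = "Event|Time|Meta\n50 Y Free|22.30|U\n"

words = sample.split()        # no argument: splits on any run of whitespace
lines = sample.splitlines()   # splits on line boundaries only

print(words)  # ['Event|Time|Meta', '50', 'Y', 'Free|22.30|U']
print(lines)  # ['Event|Time|Meta', '50 Y Free|22.30|U']
```

With split(), "50 Y Free" is broken into three separate "rows", which is the pattern visible in the output from the question.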

def beautify_data(swimmer_name):
    with open(f"assets\\data\\{swimmer_name}.txt", "r", encoding="UTF-8") as src:
        rows = src.readlines()
    
    with open(f"assets\\data\\{swimmer_name}.txt", "w", encoding="UTF-8") as dst:
        ...
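
Reading the file line by line fixes the splitting, but the loop in the question would probably still not line the columns up: max_length is computed across the cells of a single row, not down each column, and that many spaces are appended after every cell instead of padding the cell to a fixed width. One possible way to do the alignment is sketched below. The function name `align_columns` and its `sep` and `gap` parameters are hypothetical, chosen for illustration only; it works on a string, so the file reading and writing from the code above would wrap around it.

```python
def align_columns(text, sep="|", gap=2):
    """Return text with sep-delimited fields padded into aligned columns."""
    # Split into rows on line boundaries, then into stripped cells.
    rows = [[cell.strip() for cell in line.split(sep)]
            for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    # Pad short rows so every row has the same number of cells.
    n_cols = max(len(row) for row in rows)
    rows = [row + [""] * (n_cols - len(row)) for row in rows]
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(n_cols)]
    out = []
    for row in rows:
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        # Join with a fixed gap and drop trailing padding on the last cell.
        out.append((" " * gap).join(padded).rstrip())
    return "\n".join(out) + "\n"


# Three rows taken from the data in the question.
sample = (
    "Event|Time|Meta|Meet|Date\n"
    "50 Y Free|22.30|U|IL NASA Winter Blast Off|Nov 30, 2018\n"
    "100 L Breast|1:14.24||42nd Bulgarian Masters Championship|Sep 07, 2020\n"
)
print(align_columns(sample), end="")
```

The widths are computed in a first pass over all rows before anything is written, which is why the whole file has to be in memory. Empty cells such as the missing Meta value are padded like any other cell, so the columns to their right stay aligned.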