Python program to number rows

I have a file with data like this:

>1_DL_2021.1123
>2_DL_2021.1206
>3_DL_2021.1202
>3_DL_2021.1214
>4_DL_2021.1214
>4_DL_2021.1214
>6_DL_2021.1214
>7_DL_2021.1214
>8_DL_2021.1214

As you can see, the lines are not numbered sequentially, so they need to be renumbered. What I'm aiming for is this:

>1_DL_2021.1123
>2_DL_2021.1206
>3_DL_2021.1202
>4_DL_2021.1214
>5_DL_2021.1214
>6_DL_2021.1214
>7_DL_2021.1214
>8_DL_2021.1214
>9_DL_2021.1214

The file has a lot of other content between the lines that start with the > sign. I want only the lines starting with > to be affected.

Could someone please help me with this? There are 563 such lines, so doing it manually is out of the question.

CodePudding user response：

Assuming the input data file is "input.txt", you can achieve what you want with this:

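import re

# Read the whole file into a list of lines (line endings are kept)
with open("input.txt", "r") as f:
    a = f.readlines()

# Matches header lines such as ">3_DL_2021.1214\n"
regex = re.compile(r"^>\d+_DL_2021\.\d+\n$")

counter = 1

for i, line in enumerate(a):
    if regex.match(line):
        # Replace the part before the first underscore with the running number
        tokens = line.split("_")
        tokens[0] = f">{counter}"
        a[i] = "_".join(tokens)

        counter += 1

# Overwrite the input file with the renumbered lines
with open("input.txt", "w") as f:
    f.writelines(a)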

It searches for lines matching the regex ^>\d+_DL_2021\.\d+\n$, splits each match on _, replaces the first (0th) element with the current counter value, and then increments the counter by 1. Once all lines have been processed, it writes the updated lines back to "input.txt". Note that the \n in the pattern means a final line without a trailing newline will not match.
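If you would like to try the renumbering logic before letting it overwrite your file, here is a minimal sketch of the same idea working on a list of strings in memory. The helper name renumber and the sample data are made up for illustration and are not part of the answer above; the pattern also assumes every header has the >N_DL_2021.NNNN shape shown in the question.

```python
import re

# Hypothetical helper for illustration; assumes headers look like ">N_DL_2021.NNNN".
HEADER = re.compile(r"^>\d+(_DL_2021\.\d+)$")


def renumber(lines):
    """Return a new list in which matching '>' headers are numbered 1, 2, 3, ..."""
    result = []
    counter = 1
    for line in lines:
        body = line.rstrip("\n")
        match = HEADER.match(body)
        if match:
            # Preserve whatever line ending the original line had (possibly none)
            ending = line[len(body):]
            result.append(f">{counter}{match.group(1)}{ending}")
            counter += 1
        else:
            # Anything that is not a header is passed through untouched
            result.append(line)
    return result


sample = [
    ">3_DL_2021.1202\n",
    "ACGT\n",
    ">3_DL_2021.1214\n",
    "> not a header\n",
    ">4_DL_2021.1214",  # last line without a trailing newline
]
for line in renumber(sample):
    print(line, end="")
```

Because the line ending is stripped before matching, this version also renumbers a final header that has no trailing newline. Once the output looks right, you could feed it the result of f.readlines() and write the returned list back with f.writelines().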

CodePudding user response：

sudden_appearance already provided a good answer. In case you don't like regex too much you can use this code instead:

new_lines = []
with open('test_file.txt', 'r') as f:
    c = 1
    for line in f:
        if line[0] == '>':
            # Keep everything after the first underscore, e.g. "DL_2021.1214\n"
            after_dash = line.split('_', 1)[1]
            new_line = '>' + str(c) + '_' + after_dash
            c += 1
            new_lines.append(new_line)
        else:
            new_lines.append(line)

# Overwrite the file with the renumbered lines
with open('test_file.txt', 'w') as f:
    f.writelines(new_lines)

Also you can have a look at this split tutorial for more information about how to use split.
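To make the split step concrete, here is a small sketch of what the second argument to split does. The function name renumber_line is hypothetical, and the use of str.partition is a swapped-in alternative to split, chosen because it does not fail when a line starting with > contains no underscore at all.

```python
line = ">3_DL_2021.1214\n"

# With maxsplit=1 the string is cut only at the first underscore,
# so the remaining underscores stay inside the second piece.
pieces = line.split("_", 1)
print(pieces)


def renumber_line(line, number):
    """Hypothetical helper: replace the number in front of the first underscore."""
    # partition always returns three parts; sep is "" when no underscore exists
    head, sep, rest = line.partition("_")
    if not line.startswith(">") or not sep:
        return line  # not a header we know how to renumber, leave it alone
    return f">{number}_{rest}"


print(renumber_line(line, 4), end="")
print(renumber_line(">no underscore here\n", 5), end="")
```

With plain split('_', 1)[1], a line such as ">no underscore here" would raise an IndexError, so a guard like the one above may be worth having if your file contains other lines that start with >.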