Sort a txt file based on numbers

Time:10-22

I have a txt file of data that looks like this:

@0 #1
@30 #2
@750 #2
@20 #3
@500 #3
@2500 #4
@460 #4
@800 #1
@2200 #1
@290 #2
@4700 #4
@570 #1

How do I sort the file based on the integer between the @ and #?

The output should be:

@0 #1
@20 #3
@30 #2
@290 #2
@460 #4
@500 #3
@570 #1
@750 #2
@800 #1
@2200 #1
@2500 #4
@4700 #4

CodePudding user response:

Read the file, split it into lines, then call sorted with a key that extracts only the integer part of each line.

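with open('my_text_file.txt') as textfile:
    # splitlines() avoids the empty last element that split('\n') produces when
    # the file ends with a newline; blank lines are skipped so int() never sees ''.
    lines = [line for line in textfile.read().splitlines() if line.strip()]
    # ['@0 #1', '@30 #2', '@750 #2', '@20 #3', '@500 #3', '@2500 #4', '@460 #4', '@800 #1', '@2200 #1', '@290 #2', '@4700 #4', '@570 #1']

# Sort on the integer between '@' and '#'; int() ignores the space before '#'.
lines = sorted(lines, key=lambda i: int(i[1:i.index('#')]))
# ['@0 #1', '@20 #3', '@30 #2', '@290 #2', '@460 #4', '@500 #3', '@570 #1', '@750 #2', '@800 #1', '@2200 #1', '@2500 #4', '@4700 #4']

with open('my_new_text_file.txt', 'wt') as textfile:
    textfile.write('\n'.join(lines) + '\n')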

Output:

@0 #1
@20 #3
@30 #2
@290 #2
@460 #4
@500 #3
@570 #1
@750 #2
@800 #1
@2200 #1
@2500 #4
@4700 #4

CodePudding user response:

This can be done cleanly with a regular expression:

import re

with open("somefile.txt") as file:
    lines = sorted((line.strip() for line in file),
            key=lambda s: int(re.match(r"@(\d+)\s*#", s).group(1)))
print(lines)

This will raise an error if any line doesn't match the pattern (re.match returns None, so calling .group on it fails with an AttributeError), which is the right thing to do if the file format is strict. You could instead write a function that checks the regex and returns a default value.
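
A minimal sketch of that fallback, assuming lines that don't match should sort to the end rather than raise. The names `PATTERN`, `DEFAULT_KEY` and `sort_key` are made up for illustration, and the sample list (including the "not a record" line) is hypothetical data, not taken from the question's file:

```python
import re

PATTERN = re.compile(r"@(\d+)\s*#")
DEFAULT_KEY = float("inf")  # assumption: unmatched lines should sort last


def sort_key(line):
    """Return the integer between '@' and '#', or DEFAULT_KEY if there is none."""
    match = PATTERN.match(line)
    return int(match.group(1)) if match else DEFAULT_KEY


# Hypothetical sample data with one malformed line.
lines = ["@30 #2", "not a record", "@0 #1", "@750 #2"]
print(sorted(lines, key=sort_key))
# ['@0 #1', '@30 #2', '@750 #2', 'not a record']
```

Using 0 or -1 as the default instead would push malformed lines to the front; which is preferable depends on how you want to spot them.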
