I have a list of file names that I need to open. However, the file names contain a hyphen "-", which neither os nor Windows 10 recognizes as a minus sign, so the negative values are not sorted numerically. As an example, the list itself gets imported as
['-001.000ps.csv',
'-100.000ps.csv',
'0000.000ps.csv',
'0001.000ps.csv',
'0002.000ps.csv',
'0003.000ps.csv',
'0003.500ps.csv',
]
where -1 precedes -100. The positions of -1 and -100 need to be reversed, and I need to preserve the leading zeros and the "ps.csv" component because the files are named this way.
I attempted some solutions I found on this Stack Exchange site; however, most of them dealt with finding positive numbers and ordering by those. With the natsort package, -1 and -100 are put at the bottom of the list.
Converting these strings to ints or floats fails, I guess because "ps.csv" is part of each element.
I copy-pasted the solution from the blog post referenced here and the same issue occurs. I feel like I'm missing something obvious here: why are the negative numbers not working?
CodePudding user response:
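You can sort the list by the numeric value of the first 6 characters of each name (the sign or leading digit, the integer part, and the first decimal digit, which is enough to order this list):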
a = ['-001.000ps.csv',
'-100.000ps.csv',
'0000.000ps.csv',
'0001.000ps.csv',
'0002.000ps.csv',
'0003.000ps.csv',
'0003.500ps.csv',
]
a.sort(key=lambda x: float(x[:6]))
Result:
['-100.000ps.csv',
'-001.000ps.csv',
'0000.000ps.csv',
'0001.000ps.csv',
'0002.000ps.csv',
'0003.000ps.csv',
'0003.500ps.csv'
]
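If the numeric part can be longer or shorter than six characters, a variant that strips the suffix instead of slicing a fixed width may be safer. This is only a sketch: it assumes every name ends in "ps.csv" as in the question, and the names SUFFIX and time_value are made up for illustration.

```python
SUFFIX = "ps.csv"  # assumption: every file name ends with this, as in the question


def time_value(name: str) -> float:
    """Return the signed number that precedes the 'ps.csv' suffix."""
    if not name.endswith(SUFFIX):
        raise ValueError(f"unexpected file name: {name!r}")
    # float() accepts a leading '-' and leading zeros, e.g. '-001.000' -> -1.0
    return float(name[: -len(SUFFIX)])


names = [
    '-001.000ps.csv',
    '-100.000ps.csv',
    '0000.000ps.csv',
    '0001.000ps.csv',
    '0002.000ps.csv',
    '0003.000ps.csv',
    '0003.500ps.csv',
]

# sorted() returns a new list; the original strings are left untouched,
# so leading zeros and the suffix are preserved for opening the files.
ordered = sorted(names, key=time_value)
print(ordered)
```

Because the whole numeric part is converted, all decimal places take part in the comparison, not just the first one.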
CodePudding user response:
natsort can be used to sort signed numbers like so:
import natsort
fnames = [
'-001.000ps.csv',
'-100.000ps.csv',
'0000.000ps.csv',
'0001.000ps.csv',
'0002.000ps.csv',
'0003.000ps.csv',
'0003.500ps.csv',
]
natsort.realsorted(fnames)
natsort.natsorted(fnames, alg=natsort.ns.REAL)
Both of these produce the same output:
['-100.000ps.csv',
'-001.000ps.csv',
'0000.000ps.csv',
'0001.000ps.csv',
'0002.000ps.csv',
'0003.000ps.csv',
'0003.500ps.csv']
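As for why the order in the question comes out wrong in the first place: a plain sort compares strings character by character, so the hyphen is just another character and not a sign. As far as I understand the natsort documentation, its default algorithm likewise treats numbers as unsigned integers, which is why the REAL option is needed. The snippet below only illustrates the plain string comparison with the built-in sorted(), using a shortened copy of the list from the question.

```python
names = [
    '-100.000ps.csv',
    '0001.000ps.csv',
    '-001.000ps.csv',
    '0000.000ps.csv',
]

# Plain string sort: '-' sorts before any digit, and '-0...' sorts before
# '-1...' because '0' < '1', so -1 ends up ahead of -100.
plain = sorted(names)
print(plain)

# Comparing the two negative names directly shows the same thing.
print('-001.000ps.csv' < '-100.000ps.csv')
```

The first print shows -001 before -100, and the second prints True, which is the behaviour described in the question.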
CodePudding user response:
Try:
lst = [
"-001.000ps.csv",
"-100.000ps.csv",
"0000.000ps.csv",
"0001.000ps.csv",
"0002.000ps.csv",
"0003.000ps.csv",
"0003.500ps.csv",
]
import re
pat = re.compile(r"^(-?\d+\.?\d*)(.*)")
out = sorted(
lst, key=lambda x: (float((v := pat.search(x)).group(1)), v.group(2))
)
print(out)
Prints:
[
"-100.000ps.csv",
"-001.000ps.csv",
"0000.000ps.csv",
"0001.000ps.csv",
"0002.000ps.csv",
"0003.000ps.csv",
"0003.500ps.csv",
]
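If the folder might also contain names without a leading number, pat.search() returns None for them and the lambda raises an AttributeError. A possible way around that is a named key function that sorts such names last. This is a sketch under that assumption; NUM_PREFIX, sort_key and the file "notes.txt" are hypothetical and not part of the question.

```python
import re

# Optional minus sign, digits, optional fractional part, then the rest.
NUM_PREFIX = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def sort_key(name):
    """Sort by leading signed number, then by the remaining text."""
    m = NUM_PREFIX.match(name)
    if m is None:
        # No leading number: place after all numbered names, alphabetically.
        return (1, 0.0, name)
    return (0, float(m.group(1)), m.group(2))


mixed = [
    "notes.txt",  # hypothetical extra file without a numeric prefix
    "0001.000ps.csv",
    "-100.000ps.csv",
    "-001.000ps.csv",
]

result = sorted(mixed, key=sort_key)
print(result)
```

The first tuple element keeps numbered and unnumbered names in separate groups, so a float is never compared against a name that has no number.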