I want to sort strings that contain numbers by the numeric value of those numbers.
I found a way to sort strings that contain only numbers, which works well (Sorting numbers in string format with Python), but it does not work when the strings mix words and numbers.
In this example I create the list in the order I want, but sorted() ruins it.
Example:
>>> s = ['Castle_Wall_25x400x100_Bottom_01', 'Castle_Wall_25x400x50_Top_02',
'Castle_Wall_25x400x10_Bottom_01', 'Castle_Wall_25x400x300_Top_01']
>>> print(sorted(s))
['Castle_Wall_25x400x100_Bottom_01', 'Castle_Wall_25x400x10_Bottom_01', 'Castle_Wall_25x400x300_Top_01', 'Castle_Wall_25x400x50_Top_02']
Expected output:
['Castle_Wall_25x400x10_Bottom_01', 'Castle_Wall_25x400x50_Top_02', 'Castle_Wall_25x400x100_Bottom_01', 'Castle_Wall_25x400x300_Top_01']
Edit: Solution!
I solved it by creating a copy of the list where all numbers are padded with zeros so they are all of equal length, then I sort the original list using this new proxy list:
import re
names = ["Castle_Wall_25x400x10_Bottom_01", "Castle_Wall_25x400x50_Top_02", "Castle_Wall_25x400x100_Bottom_01", "Castle_Wall_25x400x300_Top_01"]
padded = []
longest = 0
# Find the length of the longest run of digits in any name.
for n in names:
    digits = re.findall(r'\d+', n)
    for digit in digits:
        if len(digit) > longest:
            longest = len(digit)
# Build a proxy name in which every number is zero-padded to that length.
for name in names:
    digits = re.findall(r'\d+', name)
    split = re.split(r'\d+', name)
    padded_name = ''
    for i, s in enumerate(split):
        padded_name += s
        if i < len(digits):
            padded_name += digits[i].zfill(longest)
    padded.append(padded_name)
# Sort the original names by their padded counterparts.
sorted_list = [x for _, x in sorted(zip(padded, names))]
for name in sorted_list:
    print(name)
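The same padding idea can also be written as a key function, which avoids building a separate proxy list. This is only a minimal sketch and not part of the original solution: the helper name pad_numbers and the fixed width of 10 are assumptions chosen for illustration, and any number with more digits than the width would sort incorrectly.

```python
import re

names = ["Castle_Wall_25x400x100_Bottom_01", "Castle_Wall_25x400x50_Top_02",
         "Castle_Wall_25x400x10_Bottom_01", "Castle_Wall_25x400x300_Top_01"]

def pad_numbers(name, width=10):
    # Hypothetical helper: replace every run of digits with a
    # zero-padded copy of itself so plain string comparison works.
    return re.sub(r'\d+', lambda m: m.group().zfill(width), name)

sorted_names = sorted(names, key=pad_numbers)
for name in sorted_names:
    print(name)
```

Because the key is computed per item, the original strings are never modified and no zip/unzip step is needed.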
CodePudding user response:
IIUC, you are trying to multiply the numbers in 10x05, which you can do by passing a key function to sorted:
def eval_result(s):
    # 'A_10x05' splits into prefix 'A' and op '10x05'
    prefix, op = s.split('_')
    num1, num2 = map(int, op.split('x'))
    return num1 * num2
sorted(s, key=eval_result)
Output
['A_10x05', 'A_10x50', 'A_10x100']
CodePudding user response:
I believe what you want is just to sort each part of the input strings separately: text parts alphabetically, numeric parts by numeric value, with no multiplication involved. If this is the case, you will need a helper function:
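from re import findall
s = ['A_10x5', 'Item_A_10x05x200_Base_01', 'A_10x100', 'B']
def fun(s):
    # Split into runs of digits and runs of letters/underscores.
    f = findall(r'\d+|[A-Za-z_]+', s)
    # Convert digit runs to int so they compare numerically.
    return list(map(lambda x: int(x) if x.isdigit() else x, f))
sorted(s, key=fun)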
['A_10x5', 'A_10x100', 'B', 'Item_A_10x05x200_Base_01']
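One caveat with a key that mixes int and str: if two strings have a number and a text part at the same position (for example 'A_1' and '1_A'), Python 3 raises a TypeError when comparing them. A possible workaround, sketched below under the assumption that numbers should sort before text, is to tag each chunk with its type. The name natural_key and the sample list are made up for illustration.

```python
import re

def natural_key(text):
    # re.split with a capturing group keeps the digit runs, so the
    # chunks alternate: even indices are text, odd indices are digits.
    chunks = re.split(r'(\d+)', text)
    # Tag each chunk so ints are only compared with ints and strings
    # only with strings: (0, number) sorts before (1, text).
    return [(0, int(c)) if i % 2 else (1, c) for i, c in enumerate(chunks)]

items = ['A_10x5', '10_A', 'A_10x100', 'B', 'A_2']
result = sorted(items, key=natural_key)
print(result)
```

Unlike the character-class pattern, splitting on digit runs keeps every other character (hyphens, dots, spaces) in the text chunks, so nothing is silently dropped from the key.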
CodePudding user response:
Provided each string in the list contains exactly three dimensions:
import re
from functools import cache
s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
     'Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01']
@cache
def get_size(s):
    # Expect exactly three 'x'-separated tokens, e.g. '..._25', '400', '100_...'.
    if len(tokens := s.split('x')) != 3:
        return 0
    # Last number before the first 'x' and first number after the last 'x'.
    first = re.findall(r'(\d+)', tokens[0])[-1]
    last = re.findall(r'(\d+)', tokens[-1])[0]
    return int(first) * int(tokens[1]) * int(last)
print(sorted(s, key=get_size))
Output:
['Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02', 'Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01']
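If the number of dimensions can vary, or the name itself may contain the letter x, a regex that looks for the dimension group directly may be more forgiving. The following is a rough sketch, assuming the first run of numbers joined by 'x' is the size and that sorting by the product is what is wanted. The name get_volume is hypothetical, and math.prod requires Python 3.8 or later.

```python
import re
from math import prod

s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
     'Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01']

def get_volume(name):
    # Find the first run of numbers joined by 'x', e.g. '25x400x100'.
    match = re.search(r'\d+(?:x\d+)+', name)
    if match is None:
        return 0  # assumption: names without dimensions sort first
    return prod(int(n) for n in match.group().split('x'))

by_volume = sorted(s, key=get_volume)
print(by_volume)
```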