Using Python3, I'd like to distribute files onto two hard drives depending on their filename.
/mnt/disk1/
/mnt/disk2/
All filenames are case sensitive 16-character ascii strings (e.g. I38A2NPp0OeyMiw9.jpg).
Based on a filename, how can I evenly split the path to /mnt/disk1 or /mnt/disk2? Ideally I'd like to be able to use N file paths.
CodePudding user response:
Function to map a string (the filename) to an integer between 1 and n
:
def map_dir(s, n=2):
import hashlib
m = hashlib.sha256(s.encode('utf-8'))
return int(m.hexdigest(), 16)%n 1
Example:
>>> map_dir('example.txt')
1
>>> map_dir('file.csv')
2
Checking that it works on 100k random strings and 10 buckets:
import random, string
def randfname(N=8):
return ''.join(random.choices(string.ascii_uppercase string.digits, k=N))
from collections import Counter
Counter((map_dir(randfname(), n=10) for i in range(100000)))
output:
Counter({9: 9994,
2: 10091,
10: 10078,
4: 10014,
3: 9897,
6: 10143,
8: 10021,
7: 9891,
1: 9919,
5: 9952})
~ 10k each, it works!