Consistently determine a "1" or a "2" based on a random 16-character ASCII string

Time:11-22

Using Python 3, I'd like to distribute files across two hard drives depending on their filename.

/mnt/disk1/
/mnt/disk2/

All filenames are case sensitive 16-character ascii strings (e.g. I38A2NPp0OeyMiw9.jpg).

Based on a filename, how can I split the files evenly between /mnt/disk1 and /mnt/disk2? Ideally I'd like the approach to extend to N file paths.

CodePudding user response:

Function to map a string (the filename) to an integer between 1 and n:

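import hashlib

def map_dir(s, n=2):
    # SHA-256 of the filename is stable across runs and machines
    m = hashlib.sha256(s.encode('utf-8'))
    # Reduce the digest to a bucket number in 1..n
    return int(m.hexdigest(), 16) % n + 1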

Example:

>>> map_dir('example.txt')
1

>>> map_dir('file.csv')
2
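
To turn the bucket number into a destination directory, the result can be plugged into a path template. This is only a minimal sketch: it assumes the drives are mounted as /mnt/disk1 ... /mnt/diskN as in the question, `disk_path` is a hypothetical helper name, and `map_dir` is repeated so the block runs on its own.

```python
import hashlib
from pathlib import PurePosixPath


def map_dir(s, n=2):
    """Map a string to an integer in 1..n using SHA-256."""
    digest = hashlib.sha256(s.encode('utf-8')).hexdigest()
    return int(digest, 16) % n + 1


def disk_path(filename, n=2, root='/mnt'):
    """Return the destination path for filename (hypothetical helper).

    Assumes the drives are mounted as <root>/disk1 ... <root>/diskN.
    """
    return PurePosixPath(root) / f'disk{map_dir(filename, n)}' / filename


# The same filename always yields the same path
print(disk_path('I38A2NPp0OeyMiw9.jpg'))
```

A hashlib digest is used rather than the built-in hash() because string hashes from hash() are randomized per process unless PYTHONHASHSEED is set, so they would not give a consistent mapping between runs.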

Checking that it works on 100k random strings and 10 buckets:

import random, string

def randfname(N=8):
    return ''.join(random.choices(string.ascii_uppercase + string.digits, k=N))

from collections import Counter
Counter((map_dir(randfname(), n=10) for i in range(100000)))

output:

Counter({9: 9994,
         2: 10091,
         10: 10078,
         4: 10014,
         3: 9897,
         6: 10143,
         8: 10021,
         7: 9891,
         1: 9919,
         5: 9952})

~ 10k each, it works!
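
The check above uses 8-character uppercase names, while the question describes 16-character case-sensitive names and two drives. A rough sketch of the same check under those conditions follows; `randfname16` is a hypothetical helper, the sample size of 20000 is arbitrary, and the fixed seed is there only so repeated runs draw the same names.

```python
import hashlib
import random
import string
from collections import Counter


def map_dir(s, n=2):
    """Map a string to an integer in 1..n using SHA-256."""
    digest = hashlib.sha256(s.encode('utf-8')).hexdigest()
    return int(digest, 16) % n + 1


def randfname16(rng):
    """Random 16-character name from letters (both cases) and digits."""
    alphabet = string.ascii_letters + string.digits
    return ''.join(rng.choices(alphabet, k=16))


rng = random.Random(0)  # fixed seed only so the run is repeatable
counts = Counter(map_dir(randfname16(rng), n=2) for _ in range(20000))
print(counts)
```

Each of the two buckets should end up with close to half of the names.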
