Create 2 random lists based on 1 list?

I have a global_list: [1,7,23,72,7,83,60]. I need to create 2 random lists from this one, for example:

list_1: [7, 7, 23, 83, 72]
list_2: [1, 60]

My current working code is:

import random
import copy

def get_2_random_list(global_list, repartition):
    list_1, list_2 = [], copy.copy(global_list)
    list_1_len = round(len(list_2)*repartition)
    print("nbr of element in list_1:",list_1_len)
    for _ in range(list_1_len):
        # randrange(n) covers 0..n-1; randint(1, n-1) could never pick index 0
        index = random.randrange(len(list_2))
        list_1.append(list_2[index])
        del list_2[index]
    return list_1, list_2


global_list = [1,7,23,72,7,83,60]
list_1,list_2 = get_2_random_list(global_list,0.7)
print("list_1:", list_1)
print("list_2:", list_2)
print("global_list:", global_list)

I feel like it could be optimized (maybe I didn't search the random library enough), both in terms of efficiency (I'm working on millions of elements) and in terms of density (I would prefer to have 1 or 2 lines of code for the function).

CodePudding user response：

How about:

import random

def get_2_random_list(global_list, repartition):
    # shuffle a copy so the original list is left untouched
    g_list = list(global_list)
    random.shuffle(g_list)
    split_point = round(len(g_list)*repartition)

    return g_list[:split_point], g_list[split_point:]
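
A quick way to sanity-check a split like this, as a sketch only: the helper name check_split is hypothetical and not part of the answer. It assumes the goal is that both parts together hold exactly the input items (duplicates included), that the first part has the rounded target size, and that the input list is not modified.

```python
import random
from collections import Counter

def get_2_random_list(global_list, repartition):
    # same shuffle-and-slice approach as in the answer above
    g_list = list(global_list)
    random.shuffle(g_list)
    split_point = round(len(g_list) * repartition)
    return g_list[:split_point], g_list[split_point:]

def check_split(global_list, repartition):
    # hypothetical helper: True when the split loses or invents nothing
    before = list(global_list)
    list_1, list_2 = get_2_random_list(global_list, repartition)
    same_items = Counter(list_1) + Counter(list_2) == Counter(global_list)
    untouched = global_list == before
    right_size = len(list_1) == round(len(global_list) * repartition)
    return same_items and untouched and right_size

print(check_split([1, 7, 23, 72, 7, 83, 60], 0.7))  # True
```

Comparing Counter objects rather than sets matters here because the sample list contains 7 twice.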

CodePudding user response：

Try with numpy

import numpy as np

l = [1,7,23,72,7,83,60]
l2 = l.copy()
# randomly pick how many elements go into the first list (0 to len(l)-1);
# note this is a random split size, not a fixed repartition
split_num = np.random.choice(len(l))
# create a new list by using random choice without replacement
l1 = np.random.choice(l, split_num, replace=False).tolist()
# remove the numbers in l1 from the copy of the original list
for x in l1:
    l2.remove(x)
# print your two new lists, then the untouched original
print(l1)
print(l2)
print(l)

[60, 83, 23, 72]
[1, 7, 7]
[1, 7, 23, 72, 7, 83, 60]

CodePudding user response：

"maybe I didn't search enough on the random library"

You definitely didn't search enough if you didn't find random.sample

def get_2_random_list(global_list, repartition):
    list_1_len = int(round(len(global_list)*repartition))
    list1 = random.sample(global_list, list_1_len)
    set1 = set(list1)
    list2 = [element for element in global_list if element not in set1]
    return list1, list2

EDIT: In case your list items are not unique, this would be a better approach to generate list2:

def get_2_random_list(global_list, repartition):
    global_len = len(global_list)
    list_1_len = int(round(global_len*repartition))
    list1_indices = random.sample(range(global_len), list_1_len)
    list1 = [global_list[idx] for idx in list1_indices]
    set1 = set(list1_indices)
    list2 = [element for idx, element in enumerate(global_list)
             if idx not in set1]
    return list1, list2
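
To illustrate why the index-based version matters with duplicates, here is a small deterministic sketch. The "draw" is hard-coded rather than random (an assumption made purely so the result is repeatable), and the names by_value and by_index are hypothetical.

```python
global_list = [1, 7, 7]
list1 = [7]  # pretend the random draw picked exactly one of the two 7s

# value-based filtering: both 7s are excluded, so one element is lost
set1 = set(list1)
by_value = [x for x in global_list if x not in set1]

# index-based filtering: assume the draw picked position 1
picked = {1}
by_index = [x for i, x in enumerate(global_list) if i not in picked]

print(by_value)  # [1]
print(by_index)  # [1, 7]
```

With value-based filtering, list1 and list2 together hold only two of the three original items; filtering by index keeps the remaining 7.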

Addendum: And if you are not interested in the order of elements in list2, it's even quicker to use random.shuffle.

def get_2_random_list(global_list, repartition):
    list_1_len = int(round(len(global_list)*repartition))
    lstcopy = list(global_list)
    random.shuffle(lstcopy)
    list1 = lstcopy[:list_1_len]
    list2 = lstcopy[list_1_len:]
    return list1, list2

CodePudding user response：

numpy has a shuffle function that may help:

import numpy as np

arr = np.array([1,7,23,72,7,83,60])
np.random.shuffle(arr)  # shuffles arr in place
p = 5                   # number of elements in the first part
a1 = arr[:p]
a2 = arr[p:]
print(a1)
print(a2)
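
If the split point should follow the repartition from the question instead of a fixed p, something along these lines might work. This is a sketch: the function name split_with_numpy and its seed parameter are hypothetical, and it uses NumPy's Generator.permutation, which returns a shuffled copy instead of shuffling in place.

```python
import numpy as np

def split_with_numpy(values, repartition, seed=None):
    # hypothetical helper; seed is optional and only for reproducibility
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(values)  # shuffled copy, input is untouched
    p = round(len(shuffled) * repartition)
    return shuffled[:p], shuffled[p:]

global_list = [1, 7, 23, 72, 7, 83, 60]
a1, a2 = split_with_numpy(global_list, 0.7)
print(a1)
print(a2)
print(global_list)  # still in its original order
```

Both results are NumPy arrays; call .tolist() on them if plain Python lists are needed.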

CodePudding user response：

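I would use the sample method from the random library. It is similar to the choices method, but it draws without replacement, so nothing is picked twice. Because the list contains duplicate values (two 7s), I sample indices rather than values; every item whose index was not picked then goes to list_2.

import random

def get_2_random_list(global_list, repartition):
    list_1_len = round(len(global_list) * repartition)

    # sample positions, not values, so duplicates are not dropped from list_2
    indices = random.sample(range(len(global_list)), list_1_len)
    picked = set(indices)
    list_1 = [global_list[i] for i in indices]
    list_2 = [x for i, x in enumerate(global_list) if i not in picked]

    return list_1, list_2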

CodePudding user response：

If your repartition is a probability (as opposed to a proportion of items), you'll need to use it as a threshold to randomly place items in the first or second list:

import random

L = [1,7,23,72,7,83,60]

repartition = 0.7 # as a probability
L1,L2 = [],[]
for n in L:
    (L1,L2)[random.random()>repartition].append(n)
    
print(L1)  # [23, 72, 60]
print(L2)  # [1, 7, 7, 83]

This means that items have 70% chance of being placed in the 1st list and 30% chance of going to the second one. With short lists, this will often not even be close to a 70/30 proportion of items.
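
To see that effect, a rough sketch like the following compares a 7-item list with a much larger one. The helper name split_by_probability and the fixed seed are assumptions for illustration; the exact fractions printed depend on the seed.

```python
import random

def split_by_probability(items, repartition, rng=random):
    # hypothetical helper: each item independently goes to the first list
    # with probability `repartition`, otherwise to the second
    first, second = [], []
    for item in items:
        (first if rng.random() < repartition else second).append(item)
    return first, second

rng = random.Random(0)  # fixed seed, only to make the run repeatable
for size in (7, 100_000):
    first, second = split_by_probability(range(size), 0.7, rng)
    # fraction of items that ended up in the first list
    print(size, len(first) / size)
```

For the large input the fraction should land very close to 0.7, while for 7 items it can only take values that are multiples of 1/7 and may be far off.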

If your repartition expresses a proportion in terms of number of items, you can use random.sample to avoid modifying the original list:

from random import sample

L = [1,7,23,72,7,83,60]

repartition = 0.7 # as a proportion of items

L1,L2 = (lambda R,p:(R[:p],R[p:]))(sample(L,len(L)),round(repartition*len(L)))
    
print(L1) # [7, 83, 7, 72, 60]
print(L2) # [23, 1]