I have data with N users and K possible items. The data is a dictionary of the form data[user] = [item1, item2, ...]. I want to turn this dictionary into an N x K matrix where the (n, k) entry is 1 if user n has purchased item k and 0 otherwise. Below is sample data.
import random
random.seed(10)
# Users
N = list(range(10))
# Items represented by an integer
K = list(range(1000))
# I have a dict of {user: [item1, item2...itemK]}
# where k differs by user
data = {x:random.sample(K, random.randint(1,50)) for x in N}
# Now I want to create an N x K matrix, where rows are users, columns are items, and the (n,k) entry
# is 1 if user n has item k in their list and 0 otherwise.
CodePudding user response:
If I understand your question correctly, you can convert each user's list of items to a set and then test every item for membership in it.
Note: I lowered the number of items to 50 (so the output fits on screen):
import random
random.seed(10)
# Users
N = list(range(10))
# Items represented by an integer
K = list(range(50))
# I have a dict of {user: [item1, item2...itemK]}
# where k differs by user
data = {x: random.sample(K, random.randint(1, 50)) for x in N}
# create matrix:
matrix = []
for v in data.values():
    v = set(v)  # set gives fast membership tests
    matrix.append([int(i in v) for i in K])
# print matrix:
for row in matrix:
    print(*row)
Prints (each row is a different user):
1 0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 0 1 1 1 1 1 1 0 1 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 1
1 1 1 0 1 0 0 1 1 0 1 0 1 1 0 1 0 0 0 1 1 0 0 1 0 0 1 1 1 1 1 0 1 0 1 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0
0 0 0 1 1 1 1 1 1 1 1 0 1 1 1 0 1 0 0 1 0 1 1 1 1 0 1 1 1 1 0 0 0 0 1 0 1 1 1 1 0 0 1 1 1 0 1 0 1 1
1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 0 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0
0 1 1 0 0 0 0 1 0 1 0 0 1 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 1 1 0 1 1 0 0 1 0 0 1 1 1 1 0 1 1 1 0 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 0 1 1 0 1 0 1 1 1 0 1 0 1 1
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 0 1 0 0 1
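The same idea can be wrapped in a function so it also works when item IDs are not consecutive integers. This is only a minimal sketch: the name `build_matrix`, its parameters, and the tiny sample data are made up for illustration and are not part of the question.

```python
def build_matrix(data, items):
    """Return one 0/1 row per user in `data`, with one column per entry of `items`."""
    matrix = []
    for purchased in data.values():
        owned = set(purchased)  # set gives fast membership tests
        matrix.append([int(item in owned) for item in items])
    return matrix


# Tiny hand-checkable example (made-up data, not from the question).
sample = {"alice": ["pen", "ink"], "bob": ["ink"]}
columns = ["ink", "paper", "pen"]
result = build_matrix(sample, columns)
for row in result:
    print(*row)
```

Rows come out in the dictionary's insertion order, and columns follow the order of `items`, so keep that list around if you need to map a column index back to an item.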
CodePudding user response:
Any approach has to visit, at a minimum, each user in the dictionary and each item that user has. You can do exactly that by starting from a matrix of zeros and setting only the entries that should be 1.
# Assuming users are the integers 0..len(N)-1 and items are the integers 0..len(K)-1
mat = [[0] * len(K) for _ in N]  # one independent row of zeros per user
for ui in data:
    for i in data[ui]:
        mat[ui][i] = 1
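If a user's list can contain repeated items, you can deduplicate it first (assigning 1 twice is harmless, so this only saves redundant writes):
mat = [[0] * len(K) for _ in N]
for ui in data:
    for i in set(data[ui]):
        mat[ui][i] = 1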
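One thing to watch when preallocating a list of lists like this: multiplying the outer list repeats references to a single row, so a write to one user's row appears in all of them. The small demonstration below uses made-up sizes (`n_users`, `n_items`) purely for illustration.

```python
n_users, n_items = 3, 4

# Pitfall: the outer * copies references, so every row is the same list object.
aliased = [[0] * n_items] * n_users
aliased[0][1] = 1

# A list comprehension builds a separate list for each row.
independent = [[0] * n_items for _ in range(n_users)]
independent[0][1] = 1

print(aliased)      # the write shows up in every row
print(independent)  # only row 0 changes
```

Building the rows with a comprehension avoids the shared-row problem while keeping the same loop over users and items.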