Return a list after comparing 2 list and returning the count of occurrences in list1 elements that e

Time:05-06

What if I want to return a list with the occurrence counts of the values in list1 compared against list2, as below?

list1 = [1, 2, 3, 4, 5]

list2 = [5, 6, 7, 8, 9]

I expect to get 0 occurrences of 1, 0 occurrences of 2, 0 occurrences of 3, 0 occurrences of 4, and 1 occurrence of 5, giving a new list as below:

new_list = [1, 0, 0, 0, 0]

See my implementation below. What am I doing wrong?

from collections import Counter
list1 = [1, 2, 3, 4, 5]
list2 = [5, 6, 7, 8, 9]

def matchingStrings(list1, list2):
    count_all=Counter(list1)
    counts= {x: count_all[x] for x in list2 if x in list1 }
    list_output=list(counts.values())
    print(list_output)
    return list_output
    # Write your code here


if __name__ == '__main__':
    matchingStrings(list1,list2)

Output

[1]

Expected output

[1, 0, 0, 0, 0]

CodePudding user response:

Build a Counter from list1, then iterate over list2 and look up the count of each value, defaulting to 0 when the value is absent.

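from collections import Counter

list1 = [1, 2, 3, 4, 5]
list2 = [5, 6, 7, 8, 9]

def matchingStrings(list1, list2):
    # Count every value in list1 once, then look up each list2 value.
    counts = Counter(list1)
    return [counts.get(value, 0) for value in list2]

print(matchingStrings(list1, list2))  # [1, 0, 0, 0, 0]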
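
To see why the attempt in the question returns a shorter list, it may help to run the filtered comprehension next to the unfiltered lookup. This is only a sketch using the lists from the question; the names filtered and fixed are made up for illustration.

```python
from collections import Counter

list1 = [1, 2, 3, 4, 5]
list2 = [5, 6, 7, 8, 9]

count_all = Counter(list1)

# Comprehension from the question: the "if x in list1" filter skips every
# list2 value that is missing from list1, so a zero can never be produced.
filtered = {x: count_all[x] for x in list2 if x in list1}
print(list(filtered.values()))  # [1]

# Without the filter, every list2 value gets an entry; Counter returns 0
# for keys it has never seen.
fixed = [count_all[x] for x in list2]
print(fixed)  # [1, 0, 0, 0, 0]
```

Only 5 appears in both lists, so the filtered version keeps a single entry, while the unfiltered lookup reports a count for all five values of list2.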

Benchmark of Counter vs list.count

from collections import Counter, defaultdict
from datetime import datetime
import numpy as np

def matchingStrings(list1, list2):
    counts = Counter(list1)
    return [counts.get(value, 0) for value in list2]

def matchingStrings2(list1, list2):
    return [list1.count(a) for a in list2]

if __name__ == '__main__':
    nb = 5000
    times = defaultdict(list)
    for i in range(10):
        list1 = list(np.random.randint(0, 100, nb))
        list2 = list(np.random.randint(0, 100, nb))

        s = datetime.now()
        x1 = matchingStrings(list1, list2)
        times["counter"].append(datetime.now() - s)

        s = datetime.now()
        x2 = matchingStrings2(list1, list2)
        times["list"].append(datetime.now() - s)

    print(np.mean(times['list']) / np.mean(times['counter']))

    for key, values in times.items():
        print(f"{key:7s} => {np.mean(values)}")

Counter is about 500 times faster in this run (the first line is the ratio, followed by the mean time per approach):

481.512173128945
counter => 0:00:00.003327
list    => 0:00:01.601991
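
The timings above come from single datetime.now() differences, which can be noisy. As a rough alternative, the same comparison could be sketched with the standard timeit module. The function names, list size, seed, and repeat counts below are assumptions chosen so the script finishes quickly, and the ratio will vary by machine.

```python
import random
import timeit
from collections import Counter

def with_counter(list1, list2):
    # One pass over list1 to build the counts, then one lookup per list2 value.
    counts = Counter(list1)
    return [counts[value] for value in list2]

def with_list_count(list1, list2):
    # Rescans all of list1 for every value in list2.
    return [list1.count(value) for value in list2]

random.seed(0)  # assumption: fixed seed so repeated runs use the same data
nb = 1000       # assumption: smaller than the 5000 used above, to keep it fast
list1 = [random.randrange(100) for _ in range(nb)]
list2 = [random.randrange(100) for _ in range(nb)]

# Both approaches should produce identical results.
same_result = with_counter(list1, list2) == with_list_count(list1, list2)
print(same_result)

# Take the best of several repeats to reduce scheduling noise.
t_counter = min(timeit.repeat(lambda: with_counter(list1, list2), number=3, repeat=3))
t_list = min(timeit.repeat(lambda: with_list_count(list1, list2), number=3, repeat=3))

print(f"counter => {t_counter:.6f}s")
print(f"list    => {t_list:.6f}s")
print(f"ratio   => {t_list / t_counter:.1f}")
```

The gap is expected to grow with list length, since list.count scans list1 once per element of list2 while Counter scans it only once.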

CodePudding user response:

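A small correction fixes that: drop the "if x in list1" filter, which discards every value with a zero count, and build a list instead of a dict so that repeated values in list2 are kept:

from collections import Counter
list1 = [1, 2, 3, 4, 5]
list2 = [5, 6, 7, 8, 9]

def matchingStrings(list1, list2):
    # Count every value in list1 once.
    count_all = Counter(list1)
    # Counter returns 0 for missing keys, so no membership test is needed.
    list_output = [count_all[x] for x in list2]
    print(list_output)
    return list_output


if __name__ == '__main__':
    matchingStrings(list1, list2)  # prints [1, 0, 0, 0, 0]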

CodePudding user response:

Try using the list's count method.

list1 = [1, 2, 3, 4, 5]

list2 = [5, 6, 7, 8, 9]
def matchingStrings(list1, list2):
    new_list = [list1.count(a) for a in list2]
    print(new_list)
    return new_list


if __name__ == '__main__':
    matchingStrings(list1,list2)

OUTPUT:

[1, 0, 0, 0, 0]
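
One detail that the example lists do not exercise is repeated values. A minimal sketch, with made-up lists and variable names, suggests that a list comprehension keeps one count per element of list2, whereas collecting the counts in a dict keyed by value (as in the question) collapses repeats:

```python
list1 = [1, 5, 5, 2]
list2 = [5, 5, 7]  # hypothetical input with a repeated value

# One result per element of list2, repeats included.
per_element = [list1.count(a) for a in list2]
print(per_element)  # [2, 2, 0]

# A dict keyed by value stores each distinct value only once,
# so the output is shorter than list2.
per_key = {a: list1.count(a) for a in list2}
print(list(per_key.values()))  # [2, 0]
```

If the result has to line up position by position with list2, the list comprehension is the safer shape.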