Print the duplicate values from a dictionary in Python

I'm wondering whether it's possible to print the duplicate values from a dictionary.

For example, I have this dictionary:

responses={
    'greet':'Hello! How can I help you?',
    'types':'Our coffee types are: light roasted, medium roasted, medium dark roasted, dark roasted.',
    'light':'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark':'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'fruit':'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla':'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
    'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'timings':'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback':'I dont quite understand. Could you repeat that?',
}

So if I pick two different keys like:

'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',

They have coffee names in common, like:

Purity Coffee Flow, Out Of The Grey Costa Rica La Minita

So if I enter those keys (chocolate, medium), the program needs to print only those two duplicates:

Purity Coffee Flow, Out Of The Grey Costa Rica La Minita

Is it possible to print in the console just those two names that are duplicated?

The only thing I've managed to get working is printing the duplicate values when the values are completely identical, but that's not my use case.

CodePudding user response：

a = 'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.'

b = 'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.'

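# Split on commas and trim the surrounding whitespace from each name
a_list = [name.strip() for name in a.split(",")]
b_list = [name.strip() for name in b.split(",")]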

a_set = set(a_list)
b_set = set(b_list)


print(a_set.intersection(b_set))

How about this?

CodePudding user response：

This code first computes all the pairs of keys in your dict with itertools.product, which gives the Cartesian product of two iterables.

It then uses a function to find the elements common to two lists. The idea is the following:

    common = [ k for k in list1 if k in list2]

However, instead of list1 and list2 we can use the values in your dict[key]. I noticed that they are strings, so I used the split method to split where commas are.

from itertools import product

responses={
    'greet':'Hello! How can I help you?',
    'types':'Our coffee types are: light roasted, medium roasted, medium dark roasted, dark roasted.',
    'light':'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark':'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'fruit':'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla':'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
    'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'timings':'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback':'I dont quite understand. Could you repeat that?',
}

def find_common_values(d,key1,key2):
    common = [ k for k in d[key1].split(",") if k in d[key2].split(",")]
    return common

# get all pairs of keys
keys = responses.keys()
pairs = list(product(keys,keys))

for P in pairs: 
    if P[0] != P[1]:
        comm = find_common_values(responses, P[0] , P[1] )
        if len(comm) != 0:
            print( P , comm )

Which gives:

('light', 'fruit') [' Peets Coffee Costa Rica Aurora']
('medium', 'chocolate') [' Purity Coffee Flow', ' Out Of The Grey Costa Rica La Minita']
('dark', 'chocolate') [' Lifeboost Coffee Organic Dark Roast', ' Coffee Bros Dark Roast', ' Death Wish Coffee', ' Kicking Horse Coffee Grizzly Claw.']
('fruit', 'light') [' Peets Coffee Costa Rica Aurora']
('chocolate', 'medium') [' Purity Coffee Flow', ' Out Of The Grey Costa Rica La Minita']
('chocolate', 'dark') [' Lifeboost Coffee Organic Dark Roast', ' Coffee Bros Dark Roast', ' Death Wish Coffee', ' Kicking Horse Coffee Grizzly Claw.']
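
The output above lists every pair twice (once in each order) and keeps the leading spaces left over from split(","). A minimal sketch of one way around both, assuming a trimmed three-key copy of the question's dictionary: itertools.combinations yields each unordered pair once, and the hypothetical helpers names and common_pairs (not part of the answer above) strip whitespace, the trailing period, and any "label:" prefix before comparing.

```python
from itertools import combinations

# Trimmed copy of the question's dictionary (three keys only, for brevity).
responses = {
    'medium': 'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark': 'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'chocolate': 'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
}

def names(text):
    """Split a response into names (hypothetical helper).

    Drops everything up to the first colon, then trims spaces and periods.
    """
    body = text.split(':', 1)[-1]
    return [part.strip(' .') for part in body.split(',') if part.strip(' .')]

def common_pairs(d):
    """Return {(key1, key2): [shared names]} for each unordered pair of keys."""
    result = {}
    for key1, key2 in combinations(d, 2):
        second = set(names(d[key2]))
        shared = [n for n in names(d[key1]) if n in second]
        if shared:
            result[(key1, key2)] = shared
    return result

for pair, shared in common_pairs(responses).items():
    print(pair, shared)
```

With this cleanup, 'Coffee Bros Decaf' also shows up for medium and chocolate, because the "Chocolate notes coffee:" prefix no longer sticks to it.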

CodePudding user response：

It's possible to split each dict value on punctuation into partial sentences. You can then iterate through the dict and check whether each partial has been seen already.

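import re

# Match runs of word characters and whitespace, i.e. the text between punctuation marks
pattern = r'[\w\s]+'
partials = set()
for value in responses.values():
    for partial in re.findall(pattern, value):
        if partial in partials:
            # Already seen in an earlier value, so it is a duplicate
            print(partial)
        else:
            partials.add(partial)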

Output

 Peets Coffee Costa Rica Aurora
 Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee
 Peets House Blend
 Lifeboost Coffee
 Driftaway Coffee Colombia Antioquia And Burundi Kayanza
 Coffee Bros Decaf
 Purity Coffee Flow
 Out Of The Grey Costa Rica La Minita
 Lifeboost Coffee Organic Dark Roast
 Coffee Bros Dark Roast
 Death Wish Coffee
 Kicking Horse Coffee Grizzly Claw
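
The same "have I seen this partial before" idea can also record which keys each partial belongs to. The following is only a sketch, assuming a trimmed three-key copy of the question's dictionary; where_used is a hypothetical helper name, and partials are stripped so that a leading space does not hide a match.

```python
import re
from collections import defaultdict

# Trimmed copy of the question's dictionary (three keys only, for brevity).
responses = {
    'light': 'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'fruit': 'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla': 'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
}

def where_used(d):
    """Map each partial that occurs under more than one key to those keys."""
    seen = defaultdict(list)
    for key, value in d.items():
        # Runs of word characters and whitespace between punctuation marks
        for partial in re.findall(r'[\w\s]+', value):
            partial = partial.strip()
            if partial and key not in seen[partial]:
                seen[partial].append(key)
    # Keep only the partials shared by at least two keys
    return {p: keys for p, keys in seen.items() if len(keys) > 1}

for partial, keys in where_used(responses).items():
    print(partial, keys)
```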

CodePudding user response：

Using regular expressions and set.intersection:

import re

responses={
    'greet':'Hello! How can I help you?',
    'types':'Our coffee types are: light roasted, medium roasted, medium dark roasted, dark roasted.',
    'light':'Coffee Bros Paraideli Cup Of Excellence, Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee.',
    'medium':'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'dark':'Koa Coffee Estate, Atlas Coffee Club, Lifeboost Coffee Organic Dark Roast, Peets House Blend, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'fruit':'Frutis notes coffee: Coffee Bros Paraideli Cup Of Excellence, Peets Coffee Costa Rica Aurora, Fresh Roasted Coffee Ethiopian Sidamo Guji Coffee, Volcanica Coffee Kenya AA, Peets House Blend.',
    'vanilla':'Vanilla notes coffee: Lifeboost Coffee, Driftaway Coffee Colombia Antioquia And Burundi Kayanza.',
    'chocolate':'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
    'timings':'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback':'I dont quite understand. Could you repeat that?',
}

def split_string(text):
    # Split on punctuation, trim whitespace, and drop empty pieces
    pieces = (s.strip() for s in re.split(r':|,|\.|!|\?|;', text))
    return [s for s in pieces if s]

def common_strings(str_a, str_b):
    set_a = set(split_string(str_a))
    set_b = set(split_string(str_b))
    return list(set_a.intersection(set_b))

common_strings(responses['chocolate'], responses['medium'])

# ['Purity Coffee Flow',
# 'Out Of The Grey Costa Rica La Minita',
# 'Coffee Bros Decaf']
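
Since the question mentions typing the keys in as "chocolate, medium", here is a minimal sketch of how that input could be wired to the set-intersection approach. It assumes a trimmed two-key copy of the dictionary; split_names and common_for_keys are hypothetical helper names, and the result follows the order of the first key's value rather than arbitrary set order.

```python
import re

# Trimmed copy of the question's dictionary (two keys only, for brevity).
responses = {
    'medium': 'Volcanica Coffee Kenya AA, Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Kicking Horse Three Sisters.',
    'chocolate': 'Chocolate notes coffee: Coffee Bros Decaf, Purity Coffee Flow, Out Of The Grey Costa Rica La Minita, Lifeboost Coffee Organic Dark Roast, Coffee Bros Dark Roast, Death Wish Coffee, Kicking Horse Coffee Grizzly Claw.',
}

def split_names(text):
    """Split on punctuation, trim whitespace, and drop empty pieces."""
    pieces = (p.strip() for p in re.split(r'[:,.!?;]', text))
    return [p for p in pieces if p]

def common_for_keys(d, query):
    """Return the names shared by every key in a comma-separated query."""
    keys = [k.strip() for k in query.split(',') if k.strip()]
    if not keys:
        return []
    missing = [k for k in keys if k not in d]
    if missing:
        raise KeyError(f"unknown key(s): {', '.join(missing)}")
    first, *rest = (split_names(d[k]) for k in keys)
    shared = set(first).intersection(*map(set, rest))
    # Preserve the order in which names appear under the first key
    return [name for name in first if name in shared]

print(', '.join(common_for_keys(responses, 'chocolate, medium')))
```

In a console program the query string could come from input(). For chocolate and medium this yields three names rather than two, since 'Coffee Bros Decaf' is shared as well.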