I have a streamlit multi-select for different customers, of which I would like only get products which are common to the selected customers.

My pandas dataframe:

customer  product version
A         iphone  11
A         ipad    7
B         iphone  11
B         ipad    7
B         iMac    2012

I would like to get a list of only the products which exists for bot Customer A and B, to be used in another select.

Expected Output:
['iphone', 'ipad']

Any ideas?

CodePudding user response：

I would first get a list of the products associated to A, and the ones associated to B:

product_A = df.loc[df['customer'] == 'A', 'product'].to_list()
product_B = df.loc[df['customer'] == 'B', 'product'].to_list()

Gives:

product_A
['iphone', 'ipad']
product_B
['iphone', 'ipad', 'iMac']

Then get a list of all the possible products (with duplicates removed):

products = df['product'].drop_duplicates().to_list()
['iphone', 'ipad', 'iMac']

You can then use list comprehension to go through each item in products to check if it is both in product_A and product_B:

[i for i in products if (i in product_A) and (i in product_B)]
['iphone', 'ipad']

CodePudding user response：

Here is one approach to address the issue.

Code

import streamlit as st
import pandas as pd

data = {
    'customer': ['A', 'A', 'B', 'B', 'B', 'C'],
    'product': ['iphone', 'ipad', 'iphone', 'ipad', 'iMac', 'ipad'],
    'version': ['11', '7', '11', '7', '2012', '7']
}

df = pd.DataFrame(data)

st.write('### Initial df')
st.write(df)

customer_sel = st.multiselect('select customer', df.customer.unique())

if customer_sel:
    # Build a dataframe based on the selected customers.
    dflist = []
    for c in customer_sel:
        dfp = df.loc[(df.customer == c)]
        dflist.append(dfp)

    if dflist:
        st.write('### products from selected customers')
        dfsel = pd.concat(dflist)
        st.write(dfsel)

        # Create a dict where product is the key. A product with more than 1
        # value will be our common product.
        keys = dfsel["product"].unique()
        common = {key: 0 for key in keys}
        # print(common)

        # Group the customer and product.
        gb = dfsel.groupby(["customer", "product"])

        # The name is a tuple of unique (customer, product) from groupby gb.
        for name, _ in gb:
            p = name[1]  # get the product only

            # Count the product from groupby.
            common[p]  = 1 if common.get(p, None) is not None else 0

        # A product is common when it is selected at least once by each customer.
        num_customer_sel = len(customer_sel)
        common_product = [k for k, v in common.items() if v >= num_customer_sel]

        if common_product:
            st.write('### Common products from selected customers')
            st.write(str(common_product))

Code

Output 1

Output 2