Pandas: how to count how many times strings appear in columns? - CodePudding

It is easiest to explain starting from a simple example DataFrame:

df1:

  A B C D
0 a 6 1 b/5/4
1 a 6 1 a/1/6
2 c 9 3 9/c/3

df1 has four columns (A, B, C, D). The task is to count, for each row, how many of the strings in column D appear in columns A, B and C (three columns). Here is the expected output with more explanation:

df2 (expected output):

        A B C D     E (new column)
      0 a 6 1 b/5/4 0 <-- None of column D's strings are found in columns A, B, C
      1 a 6 1 a/1/6 3 <-- Found a, 1 and 6, so it should return 3
      2 c 9 3 9/c/3 3 <-- Found all strings (3 in total)

Does anyone have a good idea for this? Thanks!

CodePudding user response：

You can use a list comprehension with set operations:

df['E'] = [len(set(l).intersection(s.split('/'))) for l, s in
           zip(df.drop(columns='D').astype(str).to_numpy().tolist(),
               df['D'])]

Output:

   A  B  C      D  E
0  a  6  1  b/5/4  0
1  a  6  1  a/1/6  3
2  c  9  3  9/c/3  3
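
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the frame is called `df` and is built from the question's sample data; the intermediate name `rows` is only for illustration and is not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    "A": ["a", "a", "c"],
    "B": [6, 6, 9],
    "C": [1, 1, 3],
    "D": ["b/5/4", "a/1/6", "9/c/3"],
})

# Values of every column except D, cast to str so that 6 can match "6".
rows = df.drop(columns="D").astype(str).to_numpy().tolist()

# For each row, count the distinct D tokens that also occur in A, B or C.
df["E"] = [
    len(set(row).intersection(d.split("/")))
    for row, d in zip(rows, df["D"])
]

print(df)
```

Note that because a set is used, a token repeated inside D (for example "a/a/1") is counted only once.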

CodePudding user response：

import pandas as pd

# Sample data from the question
dt = {'A': ['a', 'a', 'c'], 'B': [6, 6, 9], 'C': [1, 1, 3], 'D': ['b/5/4', 'a/1/6', '9/c/3']}
E = []  # match count for each row

nu_data = pd.DataFrame(data=dt)

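for itxid, itx in enumerate(nu_data['D']):
    match = 0
    str_list = itx.split('/')  # tokens of column D for this row
    for keyid, keys in enumerate(dt):
        if keyid < len(dt)-1:  # skip the last key, 'D' itself
            for seg_str in str_list:
                # compare as strings so that 6 matches '6'
                if str(dt[keys][itxid]) == seg_str:
                    match += 1  # accumulate the count instead of overwriting it
    E.append(match)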

nu_data['E'] = E
print(nu_data)
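
The same row-by-row logic can also be packed into a small helper and applied with `DataFrame.apply`. This is only a sketch: the helper name `count_matches` and its `target` and `sep` parameters are hypothetical, and it counts each D token at most once per occurrence, which may differ from the nested loop above when columns A, B, C hold duplicate values.

```python
import pandas as pd

def count_matches(row, target="D", sep="/"):
    """Count tokens of row[target] that equal any other cell in the row (compared as strings)."""
    # String form of every cell except the target column.
    others = {str(value) for key, value in row.items() if key != target}
    # Each token contributes 1 if it is found among the other cells.
    return sum(token in others for token in row[target].split(sep))

# Sample data from the question.
nu_data = pd.DataFrame({
    "A": ["a", "a", "c"],
    "B": [6, 6, 9],
    "C": [1, 1, 3],
    "D": ["b/5/4", "a/1/6", "9/c/3"],
})

# axis=1 passes each row to the helper as a Series.
nu_data["E"] = nu_data.apply(count_matches, axis=1)
print(nu_data)
```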