Python pandas count exact match columns with index column

Time:04-27

I have a dataframe of 0s and 1s, with row names as the index:

a   1 1 1 1 0 0 0 1 0 0 0 0 0
b   1 1 1 1 0 0 0 1 1 0 0 0 0
c   1 1 1 1 0 0 0 1 1 1 1 0 0
d   1 1 1 1 0 0 0 1 1 1 1 0 0
e   1 1 1 1 0 0 0 0 0 0 0 1 1
f   1 1 1 1 1 1 1 0 0 0 0 0 0

(No header)

I want to write a function that, given a list of strings (row names), returns the number of columns that exactly match those rows.

For example,

def exact_match(ls1):
  ~~~~~
  return col_num

print(exact_match(['c', 'd']))
>>> 2

The output is 2 because only two columns match exactly: the 10th and 11th columns have 1 in rows c and d and 0 in every other row.

CodePudding user response:

The question is unclear, but if you want the columns that contain only 1s in the provided rows and no 1s in any other row, you can use:

def exact_match(ls1):
    # 1s on the provided indices
    m1 = df.loc[ls1].eq(1).all()
    # no 1s in the other rows
    m2 = df.drop(ls1).ne(1).all()
    # slice and get shape
    return df.loc[:, m1&m2].shape[1]
    # or
    # return (m1&m2).sum()

print(exact_match(['c', 'd']))
# 2
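
For a self-contained run, here is a minimal sketch that rebuilds the sample data from the question and passes the frame in as an argument instead of relying on a global df. The `frame` parameter and the `rows` dict are naming choices made for this sketch, and the integer column labels 0..12 are an assumption, since the original data has no header.

```python
import pandas as pd

# Sample data from the question; each dict key becomes a row label.
rows = {
    "a": [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    "b": [1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
    "c": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
    "d": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
    "e": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    "f": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
}
# orient="index" turns keys into the index; columns default to 0..12.
df = pd.DataFrame.from_dict(rows, orient="index")


def exact_match(frame, ls1):
    # True for columns where every listed row holds a 1.
    m1 = frame.loc[ls1].eq(1).all()
    # True for columns where no remaining row holds a 1.
    m2 = frame.drop(ls1).ne(1).all()
    # Count the columns that satisfy both conditions.
    return int((m1 & m2).sum())


print(exact_match(df, ["c", "d"]))  # 2
```

Passing the frame explicitly makes the function easier to test on other data; the masks themselves are unchanged from the snippet above.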

CodePudding user response:

If I understood you correctly, and your dataframe looks something like:

import pandas as pd

df = pd.DataFrame(data = [
    ["a", 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    ["b", 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
    ["c", 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
    ["d", 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
    ["e", 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    ["f", 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
])
df = df.rename(columns = {0:"name"}).set_index("name")

then:

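def exact_match(lst):
    # columns where every row in lst holds a 1
    cols = df.columns[df.loc[lst].sum(axis = 0) == len(lst)]
    # of those, keep columns whose total over all rows is len(lst),
    # i.e. no other row holds a 1 (assumes values are only 0 or 1)
    s = df[cols].sum(axis = 0) == len(lst)
    return len(s[s])
print(exact_match(["c","d"])) # output: 2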
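
If the matching column labels are wanted rather than just the count, one option is to compare every column against a 0/1 membership pattern. This is only a sketch and not the method of either answer above: `exact_match_columns` and `target` are names made up here, the columns are labelled 0..12 by default, and it assumes the frame holds only 0s and 1s with the row names as its index.

```python
import pandas as pd

# Sample data from the question, with row names as the index.
df = pd.DataFrame(
    [
        [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    ],
    index=list("abcdef"),
)


def exact_match_columns(frame, lst):
    # Target pattern: 1 for rows named in lst, 0 for every other row.
    target = pd.Series(frame.index.isin(lst).astype(int), index=frame.index)
    # Compare each column with the target, aligned on the row index.
    mask = frame.eq(target, axis=0).all()
    # Return the labels of the columns that equal the target exactly.
    return frame.columns[mask.to_numpy()].tolist()


print(exact_match_columns(df, ["c", "d"]))       # [9, 10]
print(len(exact_match_columns(df, ["c", "d"])))  # 2
```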