Get the number of unique elements per row in NumPy

I have an array
arr = np.array([[1,1,2], [1,2,3]]).
I want to get the number of unique elements in each row and then take the mean of those counts.
I can do this with np.array([len(np.unique(row)) for row in arr]).mean(),
but it seems slow. Is there a faster approach?

CodePudding user response：

set(arr.flatten()) returns the unique values of the flattened array. I'm not sure how fast it is, though.

Output:
{1, 2, 3}

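Edit:
You wanted the number of unique elements, so wrap the whole thing in len(): len(set(arr.flatten())). Note that this counts unique values across the whole array (3 here), not per row.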
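
To make the difference between the two counts concrete, here is a minimal sketch that contrasts the whole-array count with the per-row count the question asks for. The variable names (total_unique, per_row, mean_per_row) are only illustrative and not from the original answer.

```python
import numpy as np

arr = np.array([[1, 1, 2], [1, 2, 3]])

# Unique values over the flattened array: one count for the whole array.
total_unique = len(set(arr.flatten().tolist()))

# Unique values in each row, then the mean of those counts.
per_row = [len(set(row.tolist())) for row in arr]
mean_per_row = sum(per_row) / len(per_row)

print(total_unique)   # 3
print(per_row)        # [2, 3]
print(mean_per_row)   # 2.5
```

The per-row version is still a Python-level loop, so it is not expected to be faster than the approach in the question.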

CodePudding user response：

You can use the following:

import numpy as np

arr = np.array([[1, 1, 2], [1, 2, 3]])


mean = np.apply_along_axis(lambda row: len(set(row)), axis=1, arr=arr).mean()

>> mean = 2.5
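
Note that np.apply_along_axis still calls the lambda once per row in Python, so it may not be much faster than the list comprehension. A possible fully vectorized alternative is to sort each row and count the positions where neighbouring values differ. This is only a sketch: the function name mean_unique_per_row is made up for illustration, and it assumes a 2-D array with at least one column and no NaN values.

```python
import numpy as np

def mean_unique_per_row(arr):
    """Mean number of distinct values per row of a 2-D array (hypothetical helper)."""
    # Sort each row so that equal values end up next to each other.
    s = np.sort(arr, axis=1)
    # Each position where a value differs from its left neighbour starts a new
    # distinct value; add 1 to count the first element of the row.
    counts = (s[:, 1:] != s[:, :-1]).sum(axis=1) + 1
    return counts.mean()

arr = np.array([[1, 1, 2], [1, 2, 3]])
print(mean_unique_per_row(arr))  # 2.5
```

Whether this actually beats the other approaches depends on the array shape, so it is worth timing on your own data.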

CodePudding user response：

Here is a method that, to my knowledge, is the fastest.

import numpy as np
import pandas as pd

# Number of unique elements row wise then mean
def unique(x):
    df = pd.DataFrame(x.T)
    return df.nunique().mean()

arr = np.array([[1,1,2], [1,2,3]])

print(unique(arr))

Output:

2.5