Assume that I have a DataFrame in which every column contains values that repeat several times. I want to count the number of occurrences of each unique value (including NaN) in every column, and save the results in a new DataFrame.
Example DataFrame:
import numpy as np
import pandas as pd

data = {
    'col_A': ['X', 'X', 'Y', 'Z', 'Z'],
    'col_B': ['Y', 'Y', np.nan, 'Z', 'Z'],
}
df = pd.DataFrame(data)
And the result I would like to get:
index col_A col_B
X 2 0
Y 1 2
Z 2 2
nan 0 1
I would appreciate any guidance on this problem.
Answer:
This should do the trick:
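df.apply(lambda x: x.value_counts(dropna=False)).fillna(0).astype(int)
This code runs value_counts() on each column with dropna=False, so NaN is counted as its own value. A value that does not appear in a column gets a NaN count after the columns are aligned; fillna(0) turns those into 0, and astype(int) converts the counts back to integers.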
Output:
col_A col_B
X 2 0
Y 1 2
Z 2 2
NaN 0 1
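For reference, here is the same approach as a self-contained sketch that can be run as-is. It assumes pandas and NumPy are imported as pd and np; the variable name counts is only illustrative, and the row order of the result may differ from the output shown above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col_A": ["X", "X", "Y", "Z", "Z"],
    "col_B": ["Y", "Y", np.nan, "Z", "Z"],
})

# Count each value per column; dropna=False keeps NaN as its own row label.
counts = df.apply(lambda col: col.value_counts(dropna=False))

# A value absent from a column has a NaN count after alignment:
# replace those with 0 and restore an integer dtype.
counts = counts.fillna(0).astype(int)

print(counts)
```

To look up a single count, index by value and column, for example counts.loc["Y", "col_B"]. The NaN row can be selected with counts[counts.index.isna()].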