Pandas: Store DataFrame stats in dict

Time:09-22

I need to count the number of occurrences of a pattern in a DataFrame and store the result in a dict.

import pandas as pd

df = pd.DataFrame([[1000, 'Jerry', 'BR1', 'BR1', 'N/A'],
                   [1001, 'N/A', 'N/A', 'BR1', 'N/A'],
                   ['N/A', 'N/A', 'BR3', 'BR2', 'N/A'],
                   [1003, 'Perry', 'BR4', 'BR1', 'N/A']],
                  columns=['ID', 'Name', 'Branch', 'Member of', 'Status'])


for index, row in df.iterrows():
    new_dict = {'rows': len(df.index),
                'col1': df[df.columns[0]].count()}
    
    print(new_dict)

Is there a way to add a dictionary entry that counts the occurrences of the pattern 'N/A' as well as non-occurrences?

Something like:

for index, row in df.iterrows():
    new_dict = {'rows': len(df.index),
                'col1': df[df.columns[0]].count(),
                '# of NA': df[df.columns[0]] == 'N/A',
                '# NOT NA': df[df.columns[0]] != 'N/A'}
    
    print(new_dict)

CodePudding user response:

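You could just try info(): replace the 'N/A' strings with NaN first, and the Non-Null Count column then gives the number of non-'N/A' values in each column.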

import numpy as np

df.replace('N/A', np.nan).info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 5 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   ID         3 non-null      float64
 1   Name       2 non-null      object 
 2   Branch     3 non-null      object 
 3   Member of  4 non-null      object 
 4   Status     0 non-null      float64
dtypes: float64(2), object(3)
memory usage: 288.0+ bytes
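
Since info() only prints its summary and the question asks for a dict, here is a minimal sketch of one way to get the counts directly. It assumes 'N/A' is always stored as that exact string; the names is_na and per_column are made up for illustration, and the loop over iterrows() is dropped because the counts do not depend on the row.

```python
import pandas as pd

df = pd.DataFrame([[1000, 'Jerry', 'BR1', 'BR1', 'N/A'],
                   [1001, 'N/A', 'N/A', 'BR1', 'N/A'],
                   ['N/A', 'N/A', 'BR3', 'BR2', 'N/A'],
                   [1003, 'Perry', 'BR4', 'BR1', 'N/A']],
                  columns=['ID', 'Name', 'Branch', 'Member of', 'Status'])

# Boolean mask: True wherever a cell holds the literal string 'N/A'.
# (is_na is an illustrative name, not part of the original question.)
is_na = df.eq('N/A')

first_col = df.columns[0]
new_dict = {
    'rows': len(df.index),
    # count() only skips real NaN values, so the 'N/A' string is still counted here.
    'col1': int(df[first_col].count()),
    # Summing a boolean Series counts its True values.
    '# of NA': int(is_na[first_col].sum()),
    '# NOT NA': int((~is_na[first_col]).sum()),
}
print(new_dict)

# The same idea applied to every column at once (per_column is also illustrative).
per_column = {
    col: {'# of NA': int(is_na[col].sum()),
          '# NOT NA': int((~is_na[col]).sum())}
    for col in df.columns
}
print(per_column)
```

The int() casts are only there so the dict holds plain Python integers rather than NumPy scalars, which prints more cleanly and serializes to JSON without extra work.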