What does this function do? I am getting an AttributeError

I am trying to clean a dataframe. The code works on the entire dataframe, but when I test the snippet I get an error.

My dataframe:

df:

         Unnamed: 0                                               game  score home_odds draw_odds away_odds                  country                            league            datetime                     home_team                      away_team home_score away_score
1114366     1114366                               Estrella - Britannia    1:7      2.67      3.87      2.14                    Aruba                 Division di Honor 2021-11-01 00:00:00                      Estrella                      Britannia          1          7
1114367     1114367                                  Aucas - LDU Quito    1:0      2.75      3.44      2.36                  Ecuador                          Liga Pro 2021-11-01 00:00:00                         Aucas                      LDU Quito          1          0
1114368     1114368                          Ocotal - Juventus Managua    1:0      2.49      3.12       2.6                Nicaragua                      Liga Primera 2021-11-01 00:00:00                        Ocotal               Juventus Managua          1          0
1114369     1114369                               Jalapa - Real Madriz    0:1      2.29      3.15      2.82                Nicaragua                      Liga Primera 2021-11-01 00:00:00                        Jalapa                    Real Madriz          0          1
1114370     1114370                         Sporting San Jose - Grecia    2:1      2.28      3.29      2.94               Costa Rica                  Primera Division 2021-11-01 01:00:00             Sporting San Jose

I get an error when I run this:

m = df[['home_odds', 'draw_odds', 'away_odds']].agg(lambda x: x.str.count('/'), 1).ne(0).all(1)

What action does the function perform?

Error:

File "C:/Users/harsh/AppData/Roaming/JetBrains/PyCharmCE2021.2/scratches/scratch_2.py", line 35, in <module>
    m = df[['home_odds', 'draw_odds', 'away_odds']].agg(lambda x: x.str.count('/'), 1).ne(0).all(1)
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\frame.py", line 8551, in aggregate
    result = op.agg()
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\apply.py", line 715, in agg
    result = self.obj.apply(self.orig_f, axis, args=self.args, **self.kwargs)
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\frame.py", line 8741, in apply
    return op.apply()
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\apply.py", line 688, in apply
    return self.apply_standard()
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\apply.py", line 812, in apply_standard
    results, res_index = self.apply_series_generator()
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\apply.py", line 828, in apply_series_generator
    results[i] = self.f(v)
  File "C:/Users/harsh/AppData/Roaming/JetBrains/PyCharmCE2021.2/scratches/scratch_2.py", line 35, in <lambda>
    m = df[['home_odds', 'draw_odds', 'away_odds']].agg(lambda x: x.str.count('/'), 1).ne(0).all(1)
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\generic.py", line 5487, in __getattr__
    return object.__getattribute__(self, name)
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\accessor.py", line 181, in __get__
    accessor_obj = self._accessor(obj)
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\strings\accessor.py", line 168, in __init__
    self._inferred_dtype = self._validate(data)
  File "C:\Users\harsh\OneDrive\Documents\VENV\lib\site-packages\pandas\core\strings\accessor.py", line 225, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!

Answer:

The error is expected: x.str.count('/') counts `/` characters, so it needs string values in the columns 'home_odds', 'draw_odds' and 'away_odds'. Your columns hold numbers, so pandas raises the AttributeError.
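To see the cause in isolation, here is a minimal sketch. The small frame below is hypothetical sample data shaped like the odds columns in the question, with the decimal odds stored as floats:

```python
import pandas as pd

# Hypothetical sample: decimal odds stored as floats, as in the question.
odds = pd.DataFrame(
    {
        "home_odds": [2.67, 2.75],
        "draw_odds": [3.87, 3.44],
        "away_odds": [2.14, 2.36],
    }
)

# The columns are numeric, not strings.
print(odds.dtypes)

# The .str accessor only works on string data, so this raises AttributeError.
try:
    odds["home_odds"].str.count("/")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The same failure happens inside the lambda, because with axis=1 each row of numeric columns is passed in as a numeric Series.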

All together:

df[['home_odds', 'draw_odds', 'away_odds']].agg(lambda x: x.str.count('/'), 1).ne(0).all(1)

x.str.count('/') # count the `/` characters in each value

.agg(lambda x: x.str.count('/'), 1) # apply the count row by row (axis=1) over the specified columns

.ne(0) # test if the count is not equal to 0, i.e. the value contains at least one /

.all(1) # test whether all values in a row are True, same as .all(axis=1)
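The chain can be traced one step at a time. This is a sketch on hypothetical data (fractional odds as strings, with a few decimal values mixed in), not the data from the question; the intermediate names counts and has_slash are only for illustration:

```python
import pandas as pd

# Hypothetical data: fractional odds as strings, some decimal values mixed in.
odds = pd.DataFrame(
    {
        "home_odds": ["5/2", "2.67", "7/4"],
        "draw_odds": ["3/1", "3.87", "2.5"],
        "away_odds": ["11/10", "2.14", "6/5"],
    }
)
cols = ["home_odds", "draw_odds", "away_odds"]

# Step 1: count "/" in every cell, processing one row at a time (axis=1).
counts = odds[cols].agg(lambda row: row.str.count("/"), axis=1)

# Step 2: True where a cell contains at least one "/".
has_slash = counts.ne(0)

# Step 3: True only for rows in which every column contains a "/".
m = has_slash.all(axis=1)

print(counts)
print(m)
```

Here only the first row has a / in all three columns, so only that row is True in m.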

If the columns can mix strings and numbers, add astype(str) first. With .ne(0), m is True only for rows where all three columns contain a /; if m is all False, no row has a / in every column:

m = df[['home_odds', 'draw_odds', 'away_odds']].astype(str).agg(lambda x: x.str.count('/'), 1).ne(0).all(1)

Check for False values: ~m selects the rows where at least one of the specified columns has no /:

print(df.loc[~m, ['home_odds', 'draw_odds', 'away_odds']])
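A short sketch of the astype(str) variant on mixed data. The frame is hypothetical and holds floats and fractional-odds strings in the same columns; the name rows_missing_slash is only for illustration:

```python
import pandas as pd

# Hypothetical mixed data: floats and fractional-odds strings in the same columns.
odds = pd.DataFrame(
    {
        "home_odds": [2.67, "5/2", 2.49],
        "draw_odds": [3.87, "3/1", "2/1"],
        "away_odds": [2.14, "11/10", 2.6],
    }
)
cols = ["home_odds", "draw_odds", "away_odds"]

# Convert everything to strings first so the .str accessor is always valid.
as_text = odds[cols].astype(str)

# True for rows where every odds column contains at least one "/".
m = as_text.agg(lambda row: row.str.count("/"), axis=1).ne(0).all(axis=1)

# Rows where at least one odds column has no "/".
rows_missing_slash = odds.loc[~m, cols]

print(m)
print(rows_missing_slash)
```

With purely decimal odds like the sample in the question, m would be False for every row, so ~m would select the whole frame.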