lambda syntax, not sure why I am getting AttributeError: 'str' object has no attribute 'str'

I have a dataframe called df_freight and I would like to create a new column called "LM", based on a condition in another column called "Cost rate". The condition is: if the value contains the code "lm", write "LM"; otherwise write "Not lm".

df_freight =pd.DataFrame(
     {'Cost rate': ['11.53 LM', '12.22kg','22 LM','sdfdfsdf'],
     'TO Number': ['x12', 'x13','x14','x15']})


df_fright["LM"] = df_fright.apply(lambda row: "LM" if row["Cost rate"].str.contans("lm") else "Not lm", axis=1)

but I am getting an AttributeError:

   ---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-94-d277ddb08fc7> in <module>
----> 1 df_fright["LM"] = df_fright.apply(lambda row: "LM" if row["Cost rate"].str.contans("lm") else "Not lm", axis=1)

~\Anaconda3\envs\general\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   7766             kwds=kwds,
   7767         )
-> 7768         return op.get_result()
   7769 
   7770     def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:

~\Anaconda3\envs\general\lib\site-packages\pandas\core\apply.py in get_result(self)
    183             return self.apply_raw()
    184 
--> 185         return self.apply_standard()
    186 
    187     def apply_empty_result(self):

~\Anaconda3\envs\general\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    274 
    275     def apply_standard(self):
--> 276         results, res_index = self.apply_series_generator()
    277 
    278         # wrap results

~\Anaconda3\envs\general\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
    288             for i, v in enumerate(series_gen):
    289                 # ignore SettingWithCopy here in case the user mutates
--> 290                 results[i] = self.f(v)
    291                 if isinstance(results[i], ABCSeries):
    292                     # If we have a view on v, we need to make a copy because

<ipython-input-94-d277ddb08fc7> in <lambda>(row)
----> 1 df_fright["LM"] = df_fright.apply(lambda row: "LM" if row["Cost rate"].str.contans("lm") else "Not lm", axis=1)

AttributeError: 'str' object has no attribute 'str'

Isn't the syntax correct?

CodePudding user response：

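With axis=1 the lambda receives one row at a time, so row["Cost rate"] is already a plain Python string, and a string has no .str accessor (that accessor exists only on a pandas Series). Also, to check whether a substring is contained in a string, use the in operator instead of contains().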

import pandas as pd

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

df_freight["LM"] = df_freight.apply(lambda row: "LM" if "lm" in row["Cost rate"] else "Not lm", axis=1)

print(df_freight)
>   Cost rate TO Number      LM
  0  11.53 LM       x12  Not lm
  1   12.22kg       x13  Not lm
  2     22 LM       x14  Not lm
  3  sdfdfsdf       x15  Not lm

Every row comes out as "Not lm" because the in test is case-sensitive: "lm" does not match "LM". Convert the value with .lower() before comparing:

import pandas as pd

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

df_freight["LM"] = df_freight.apply(lambda row: "LM" if "lm" in row["Cost rate"].lower() else "Not lm", axis=1)

print(df_freight)
>   Cost rate TO Number      LM
  0  11.53 LM       x12      LM
  1   12.22kg       x13  Not lm
  2     22 LM       x14      LM
  3  sdfdfsdf       x15  Not lm
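
To see where the original error comes from, here is a minimal sketch comparing the whole column with a single cell of a row. The variable names (column, cell and so on) are only for illustration and are not part of the question.

```python
import pandas as pd

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

# The whole column is a pandas Series, which offers the .str accessor.
column = df_freight["Cost rate"]
column_has_str = hasattr(column, "str")

# With axis=1, apply hands the lambda one row at a time. Indexing that row
# by column name returns a single cell: a plain Python str without .str.
first_row = df_freight.iloc[0]
cell = first_row["Cost rate"]
cell_is_plain_str = isinstance(cell, str)
cell_has_str = hasattr(cell, "str")

print(type(column).__name__, column_has_str)
print(type(cell).__name__, cell_has_str)
```

So .str.contains() belongs on df_freight["Cost rate"], while inside the row-wise lambda the ordinary string tools (in, .lower()) are the ones to use.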

CodePudding user response：

You can use a vectorized operation:

import numpy as np

df_freight["LM"] = np.where(df_freight['Cost rate'].str.contains('lm', case=False),
                            'LM', 'Not lm')
print(df_freight)

# Output
  Cost rate TO Number      LM
0  11.53 LM       x12      LM
1   12.22kg       x13  Not lm
2     22 LM       x14      LM
3  sdfdfsdf       x15  Not lm
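
If the real column can contain missing values, the same vectorized approach may need a little more care. The sketch below assumes a hypothetical variant of the sample data (df_missing, with one empty "Cost rate") and passes na=False so that missing cells count as "no match", plus regex=False so that 'lm' is treated as a literal substring.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the sample data with one missing "Cost rate" value.
df_missing = pd.DataFrame(
    {'Cost rate': ['11.53 LM', None, '22 lm', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

# case=False ignores upper/lower case, na=False maps missing cells to False,
# regex=False searches for the literal text 'lm'.
mask = df_missing['Cost rate'].str.contains('lm', case=False, na=False, regex=False)

df_missing["LM"] = np.where(mask, 'LM', 'Not lm')
print(df_missing)
```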

CodePudding user response：

Apply the lambda to the column itself rather than to the whole DataFrame. Each x is then a plain string, so there is no need for .str or axis=1:

df_freight['LM'] = df_freight['Cost rate'].apply(lambda x: 'LM' if "LM" in x else "Not lm")

Output

    Cost rate   TO Number   LM
0   11.53 LM         x12    LM
1   12.22kg          x13    Not lm
2   22 LM            x14    LM
3   sdfdfsdf         x15    Not lm
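
One caveat: "LM" in x is case-sensitive, so it works here only because the sample data spells the code in uppercase. A small sketch with hypothetical values (the Series rates is made up for illustration) shows the difference and a case-insensitive variant:

```python
import pandas as pd

# Hypothetical values: the second one uses lowercase "lm".
rates = pd.Series(['11.53 LM', '22 lm', '12.22kg'])

# Case-sensitive test, as in the answer above: '22 lm' is not matched.
strict = rates.apply(lambda x: 'LM' if "LM" in x else "Not lm")

# Lowercasing each value first matches both spellings.
relaxed = rates.apply(lambda x: 'LM' if "lm" in x.lower() else "Not lm")

print(strict.tolist())
print(relaxed.tolist())
```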