I want to assign values in a pandas DataFrame based on the length of an existing value.
Specifically, I want to create two new columns: one that receives the value when its length is > 8, and another that receives the value when its length is <= 8.
The existing column df_dr_dp_lj['kode_wilayah'] holds the values whose length I want to check.
The new column df_dr_dp_lj['kode_kelurahan'] should hold the values with a length of more than 8, and the new column df_dr_dp_lj['kode_kecamatan'] should hold the values with a length <= 8.
My code looks like this:
if df_dr_dp_lj['kode_wilayah'].str.len() > 8:
    df_dr_dp_lj['kode_kelurahan'] = df_dr_dp_lj['kode_wilayah']
else:
    df_dr_dp_lj['kode_kecamatan'] = df_dr_dp_lj['kode_wilayah']
The error I get is:
Input In [47], in <cell line: 1>()
----> 1 if df_dr_dp_lj['kode_wilayah'].str.len() > 8:
2 df_dr_dp_lj['kode_kelurahan']=df_dr_dp_lj['kode_wilayah']
3 else :
File D:\Python\lib\site-packages\pandas\core\generic.py:1527, in NDFrame.__nonzero__(self)
1525 @final
1526 def __nonzero__(self):
-> 1527 raise ValueError(
1528 f"The truth value of a {type(self).__name__} is ambiguous. "
1529 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
1530 )
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Update:
I tried the suggested answer, but I still get values with a length of more than 8.
CodePudding user response:
In your code example, the if statement is given a whole boolean Series (the result of comparing every length against 8). Python cannot reduce that to a single True or False, hence the error.
Here is one way to do it, using mask:
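# mask() replaces values with NaN wherever the condition is True,
# so each new column masks out the lengths it should NOT hold.
length = df_dr_dp_lj['kode_wilayah'].str.len()
# kode_kelurahan keeps only values longer than 8 characters
df_dr_dp_lj['kode_kelurahan'] = df_dr_dp_lj['kode_wilayah'].mask(length <= 8)
# kode_kecamatan keeps only values of 8 characters or fewer
df_dr_dp_lj['kode_kecamatan'] = df_dr_dp_lj['kode_wilayah'].mask(length > 8)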
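To check the behaviour on a small example, here is a self-contained sketch of the same split. It uses Series.where, which keeps values where the condition is True and leaves NaN elsewhere (the opposite of mask). The sample DataFrame and its values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the real df_dr_dp_lj is not shown in the question.
df = pd.DataFrame(
    {"kode_wilayah": ["31.71.01", "31.71.01.1001", "32.01.02", "32.01.02.2003"]}
)

# Element-wise length of each code (one integer per row).
length = df["kode_wilayah"].str.len()

# where() keeps the value where the condition is True, NaN otherwise.
df["kode_kelurahan"] = df["kode_wilayah"].where(length > 8)
df["kode_kecamatan"] = df["kode_wilayah"].where(length <= 8)

print(df)
```

Each row ends up with a value in exactly one of the two new columns and NaN in the other, so no value longer than 8 characters should remain in kode_kecamatan.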