Consider a data frame with 97 rows and 44 columns, which includes the columns "Bostwick" and "mu_yield". I'm trying to create a new column called "Target": if the "Bostwick" value lies between 5.00 and 6.75, or the "mu_yield" value lies between 89.00 and 90.00, "Target" should be 0; otherwise it should be 1.
I tried the following:
bos['Target'] = np.where(((bos['mu_yield'] < 5.000) | (bos['mu_yield'] > 6.750)), 0,
np.where((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)))
There were no errors, but every value in the "Target" column is 0.
So I tried the following instead:
bos['Target'] = np.where((bos['Bostwick'] < 5.000) | (bos['Bostwick'] > 6.750)) or ((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)
This raises the ValueError below:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_35620/4282921525.py in <module>
----> 1 bos['Target'] = np.where((bos['Bostwick'] < 5.000) | (bos['Bostwick'] > 6.750)) or ((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)
~\anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
3610 else:
3611 # set column
-> 3612 self._set_item(key, value)
3613
3614 def _setitem_slice(self, key: slice, value):
~\anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
3782 ensure homogeneity.
3783 """
-> 3784 value = self._sanitize_column(value)
3785
3786 if (
~\anaconda3\lib\site-packages\pandas\core\frame.py in _sanitize_column(self, value)
4507
4508 if is_list_like(value):
-> 4509 com.require_length_match(value, self.index)
4510 return sanitize_array(value, self.index, copy=True, allow_2d=True)
4511
~\anaconda3\lib\site-packages\pandas\core\common.py in require_length_match(data, index)
529 """
530 if len(data) != len(index):
--> 531 raise ValueError(
532 "Length of values "
533 f"({len(data)}) "
ValueError: Length of values (1) does not match length of index (94)
Could someone help me with this?
CodePudding user response:
Use | for bitwise OR between the two range checks, and within each range check use & for bitwise AND (the original tested for values outside the ranges):
bos['Target'] = np.where(((bos['Bostwick'] > 5.000) & (bos['Bostwick'] < 6.750)) |
((bos['mu_yield'] > 89.00) & (bos['mu_yield'] < 90.00)), 0, 1)
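As a minimal sketch of why the expression above works where the second attempt in the question did not, the following runs it on a small made-up frame (the values are illustrative assumptions, not the asker's data). With a single argument, np.where returns a tuple of index arrays, and a non-empty tuple is truthy, so Python's `or` hands that length-1 tuple straight back; that is presumably where "Length of values (1)" comes from.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame has 44 columns.
bos = pd.DataFrame({
    "Bostwick": [4.50, 5.50, 7.00, 6.00],
    "mu_yield": [88.00, 88.50, 89.50, 91.00],
})

# Element-wise masks: each pair of bounds is combined with &.
in_bostwick = (bos["Bostwick"] > 5.000) & (bos["Bostwick"] < 6.750)
in_yield = (bos["mu_yield"] > 89.00) & (bos["mu_yield"] < 90.00)

# Three-argument form: 0 where either range matches, 1 otherwise.
bos["Target"] = np.where(in_bostwick | in_yield, 0, 1)

# One-argument form: a tuple holding one array of row positions.
positions = np.where(~in_bostwick)
print(type(positions), len(positions))
print(bos)
```

Naming the two masks first is only a readability choice; the one-line version behaves the same.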
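Alternative with Series.between (pandas 1.3 and later take inclusive='neither' for strict bounds; older versions used inclusive=False):
bos['Target1'] = np.where(bos['Bostwick'].between(5.000, 6.750, inclusive='neither') |
bos['mu_yield'].between(89.000, 90.00, inclusive='neither'), 0, 1)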
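The question does not say whether the end points 5.00, 6.75, 89.00 and 90.00 themselves count as "between". A small sketch, assuming pandas 1.3 or later (where `inclusive` is passed as a string) and using made-up values that sit exactly on the edges, shows how the choice changes the result:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including values exactly on the range edges.
bos = pd.DataFrame({
    "Bostwick": [5.00, 5.50, 6.75, 7.00],
    "mu_yield": [88.00, 88.00, 88.00, 89.50],
})

# Strict bounds, equivalent to the > / < comparisons (assumes pandas >= 1.3).
strict = (bos["Bostwick"].between(5.000, 6.750, inclusive="neither")
          | bos["mu_yield"].between(89.00, 90.00, inclusive="neither"))
bos["Target_strict"] = np.where(strict, 0, 1)

# The default, inclusive="both", also treats 5.00 and 6.75 as inside the range.
closed = (bos["Bostwick"].between(5.000, 6.750)
          | bos["mu_yield"].between(89.00, 90.00))
bos["Target_closed"] = np.where(closed, 0, 1)
print(bos)
```

If the end points should count as inside the range, drop the `inclusive` argument or use >= and <= in the comparison version.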