I'm working on a DataFrame with 2000 rows, but for this question I created a small example. I want to find all rows whose value in the col2 column has 3 or fewer digits. Here is the DataFrame:
import pandas as pd
d = {'col1': [10000, 2000, 300, 4000, 50000], 'col2': [10, 20000, 300, 4000, 100]}
df = pd.DataFrame(data=d)
    col1   col2
0  10000     10
1   2000  20000
2    300    300
3   4000   4000
4  50000    100
col1    int64
col2    int64
dtype: object
After that, I would like to create a new column col3. For the filtered rows (col2 with 3 or fewer digits), col3 should be the col2 value multiplied by the col1 value; for all other rows, col3 should keep the col2 value unchanged.
Here's the expected output:
    col1   col2     col3
0  10000     10   100000
1   2000  20000    20000
2    300    300    90000
3   4000   4000     4000
4  50000    100  5000000
col1    int64
col2    int64
col3    int64
dtype: object
Thanks in advance!
CodePudding user response:
A simple application of np.where:
import numpy as np
df['col3'] = np.where(df.col2 < 1000, df.col2 * df.col1, df.col2)
    col1   col2     col3
0  10000     10   100000
1   2000  20000    20000
2    300    300    90000
3   4000   4000     4000
4  50000    100  5000000
CodePudding user response:
Use np.where with a condition. Since these are numbers, "3 or fewer digits" can be checked by testing whether the col2 value is less than 1000 (this assumes the values are non-negative integers).
cond = (df['col2'] < 1000)
choice = (df['col1'] * df['col2'])
df['col3'] = np.where(cond, choice, df['col2'])
df
    col1   col2     col3
0  10000     10   100000
1   2000  20000    20000
2    300    300    90000
3   4000   4000     4000
4  50000    100  5000000
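The `< 1000` check above works because col2 holds non-negative integers. If negative values could also appear (an assumption, the question does not say so), a minimal sketch that counts digits explicitly might look like this; the names `n_digits` and `cond` are only for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [10000, 2000, 300, 4000, 50000],
                   'col2': [10, 20000, 300, 4000, 100]})

# Count digits on the absolute value so a minus sign is not counted as a digit
n_digits = df['col2'].abs().astype(str).str.len()
cond = n_digits <= 3

# Multiply where col2 has 3 or fewer digits, otherwise keep col2 as-is
df['col3'] = np.where(cond, df['col1'] * df['col2'], df['col2'])
print(df)
```

For the sample data this selects the same rows as `df['col2'] < 1000`; the string conversion is slower, so the numeric comparison is probably preferable when the values are known to be non-negative.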
CodePudding user response:
You can try Series.mask, which replaces the values where the condition is True:
df['col4'] = df['col2'].mask(df['col2'] < 1000, df['col2'] * df['col1'])
print(df)
    col1   col2     col3     col4
0  10000     10   100000   100000
1   2000  20000    20000    20000
2    300    300    90000    90000
3   4000   4000     4000     4000
4  50000    100  5000000  5000000
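For comparison, Series.where is the mirror image of Series.mask: it keeps values where the condition is True and replaces the rest. A small sketch on the same sample data, assuming the same `< 1000` rule (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'col1': [10000, 2000, 300, 4000, 50000],
                   'col2': [10, 20000, 300, 4000, 100]})

product = df['col1'] * df['col2']
small = df['col2'] < 1000

# mask: replace col2 wherever the condition is True
via_mask = df['col2'].mask(small, product)
# where: keep col2 wherever the condition is True, so the condition is negated
via_where = df['col2'].where(~small, product)

print(via_mask.tolist())
print(via_where.tolist())
```

Both calls should produce the same column; which one reads better depends on whether you think of the condition as "rows to change" (mask) or "rows to keep" (where).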
CodePudding user response:
Plain pandas code using df.apply:
def custom_fill(row):
    # Multiply only when col2 has 3 or fewer digits (i.e. is below 1000)
    if row['col2'] < 1000:
        return row['col1'] * row['col2']
    else:
        return row['col2']

df['col3'] = df[['col1', 'col2']].apply(custom_fill, axis=1)
Output:
    col1   col2     col3
0  10000     10   100000
1   2000  20000    20000
2    300    300    90000
3   4000   4000     4000
4  50000    100  5000000
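Note that apply with axis=1 calls the Python function once per row, so on larger frames it is generally slower than a vectorized expression. A quick sketch, using the sample data from the question, to check that the row-wise version agrees with the np.where version (the names `row_wise` and `vectorized` are just for this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [10000, 2000, 300, 4000, 50000],
                   'col2': [10, 20000, 300, 4000, 100]})

def custom_fill(row):
    # Multiply only when col2 is below 1000, otherwise keep col2
    if row['col2'] < 1000:
        return row['col1'] * row['col2']
    return row['col2']

# Row-by-row result from apply
row_wise = df.apply(custom_fill, axis=1)
# Whole-column result from np.where
vectorized = np.where(df['col2'] < 1000, df['col1'] * df['col2'], df['col2'])

print(row_wise.tolist())
print(vectorized.tolist())
```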