Find rows based on number of digits

I'm working on a dataframe that has 2000 rows, but for this question I have created a simple dataframe in which I want to find all rows containing 3 or fewer digits in the col2 column. Here is the dataframe:

import pandas as pd

d = {'col1': [10000, 2000, 300, 4000, 50000], 'col2': [10, 20000, 300, 4000, 100]}
df = pd.DataFrame(data=d)

    col1    col2
0   10000   10
1   2000    20000
2   300     300
3   4000    4000
4   50000   100

col1    int64
col2    int64
dtype: object

After that, I would like to create a new column col3. For the filtered rows (col2 with 3 or fewer digits), col3 should be the col2 value multiplied by the col1 value; for all other rows, col3 should simply keep the col2 value.

Here's the expected output:

    col1    col2    col3
0   10000   10      100000
1   2000    20000   20000
2   300     300     90000
3   4000    4000    4000
4   50000   100     5000000

col1    int64
col2    int64
col3    int64
dtype: object

Thanks in advance!

CodePudding user response：

Simple application of np.where:

import numpy as np

df['col3'] = np.where(df.col2 < 1000, df.col2 * df.col1, df.col2)

    col1   col2     col3
0  10000     10   100000
1   2000  20000    20000
2    300    300    90000
3   4000   4000     4000
4  50000    100  5000000
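
One caveat: col2 < 1000 matches "3 or fewer digits" only for non-negative integers. If negative values could occur, a minimal sketch (assuming integer columns; n_digits is just a name made up for this example) would count the digits of the absolute value instead:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [10000, 2000, 300, 4000, 50000],
                   'col2': [10, 20000, 300, 4000, 100]})

# Count digits of the absolute value so a minus sign is not counted.
n_digits = df['col2'].abs().astype(str).str.len()

# Multiply where col2 has 3 or fewer digits, otherwise keep col2 as is.
df['col3'] = np.where(n_digits <= 3, df['col2'] * df['col1'], df['col2'])
```

For the sample data this gives the same col3 as the comparison against 1000; the two only differ once col2 contains values such as -20000.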

CodePudding user response：

Use np.where with a condition. Since these are numbers, "3 or fewer digits" can be checked by testing whether the col2 value is less than 1000:

import numpy as np

cond = df['col2'] < 1000           # rows where col2 has 3 or fewer digits
choice = df['col1'] * df['col2']   # value to use for those rows

df['col3'] = np.where(cond, choice, df['col2'])
df

    col1    col2    col3
0   10000   10      100000
1   2000    20000   20000
2   300     300     90000
3   4000    4000    4000
4   50000   100     5000000

CodePudding user response：

You can try Series.mask, which replaces the values where the condition is True:

df['col4'] = df['col2'].mask(df['col2'] < 1000, df['col2'] * df['col1'])

print(df)

    col1   col2     col3     col4
0  10000     10   100000   100000
1   2000  20000    20000    20000
2    300    300    90000    90000
3   4000   4000     4000     4000
4  50000    100  5000000  5000000
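
Series.where is the mirror image of Series.mask: it keeps the values where the condition is True and replaces the rest. A short sketch on the question's data (the names product, via_mask and via_where are only for illustration) suggests the two give the same result when the condition is inverted:

```python
import pandas as pd

df = pd.DataFrame({'col1': [10000, 2000, 300, 4000, 50000],
                   'col2': [10, 20000, 300, 4000, 100]})

product = df['col2'] * df['col1']

# mask: replace col2 where the condition is True.
via_mask = df['col2'].mask(df['col2'] < 1000, product)

# where: keep col2 where the condition is True, replace it elsewhere.
via_where = df['col2'].where(df['col2'] >= 1000, product)
```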

CodePudding user response：

Vanilla pandas code using df.apply

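def custom_fill(cols):
    # cols is one row as a Series; index by label rather than by position,
    # since integer keys on a label-indexed Series are deprecated as positions.
    if cols['col2'] < 1000:
        return cols['col1'] * cols['col2']
    else:
        return cols['col2']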

df['col3'] = df[['col1','col2']].apply(custom_fill, axis=1)

Output:

    col1   col2     col3
0  10000     10   100000
1   2000  20000    20000
2    300    300    90000
3   4000   4000     4000
4  50000    100  5000000
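
Note that apply with axis=1 calls the Python function once per row, which may get slow on larger frames. A vectorized sketch that should give the same result, using boolean indexing with .loc (small is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'col1': [10000, 2000, 300, 4000, 50000],
                   'col2': [10, 20000, 300, 4000, 100]})

# Boolean mask of the rows where col2 has 3 or fewer digits.
small = df['col2'] < 1000

# Start from col2, then overwrite only the selected rows with the product.
df['col3'] = df['col2']
df.loc[small, 'col3'] = df.loc[small, 'col1'] * df.loc[small, 'col2']
```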