I am trying to create a new column (let's call it column C) in a dataframe that counts, for each row, how many criteria the row meets: it adds one for each column whose value in that row is greater than or less than a specified threshold.
So if the criteria are column A > 2 and column B >= 4, column C looks as follows:
Column A | Column B | Column C |
---|---|---|
1 | 4 | 1 |
3 | 1 | 1 |
4 | 5 | 2 |
1 | 1 | 0 |
I've tried creating separate dataframes of the rows that meet each criterion and then dropping the rows that don't, but there has to be a much simpler way.
CodePudding user response:
You can try either of the two snippets below.
I generally prefer the second one.
import pandas as pd
import numpy as np
df = pd.DataFrame(data= [[1,4],[3,1],[4,5],[1,1]], columns=['a','b'])
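# Rows that meet both criteria get 2; everything else starts at 0.
df['c'] = np.where((df['a'] > 2) & (df['b'] >= 4), 2, 0)
# Rows still at 0 that meet at least one criterion get 1.
df['c'] = np.where(((df['a'] > 2) | (df['b'] >= 4)) & (df['c'] == 0), 1, df['c'])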
print(df)
OR
import pandas as pd
import numpy as np
df = pd.DataFrame(data= [[1,4],[3,1],[4,5],[1,1]], columns=['a','b'])
df['c'] = 0
# Add 1 for each criterion the row meets.
df['c'] = np.where(df['a'] > 2, df['c'] + 1, df['c'])
df['c'] = np.where(df['b'] >= 4, df['c'] + 1, df['c'])
print(df)
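The second snippet extends naturally to any number of criteria. Here is a minimal sketch, assuming the criteria are collected in a list (the name `conditions` is hypothetical and only used for illustration). Each entry is a boolean Series, and summing them counts how many criteria each row meets.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 4], [3, 1], [4, 5], [1, 1]], columns=['a', 'b'])

# Hypothetical list of criteria; each entry is a boolean Series.
conditions = [df['a'] > 2, df['b'] >= 4]

# True counts as 1 and False as 0, so the sum is the number of criteria met per row.
df['c'] = sum(cond.astype(int) for cond in conditions)
print(df)
```

Adding a third criterion then only means appending another boolean expression to the list.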
CodePudding user response:
You can also do something like this:
import pandas as pd
df = pd.DataFrame(data= [[1,4],[3,1],[4,5],[1,1]], columns=['a','b'])
# Each comparison gives a boolean Series; cast to int and add them up.
df = df.assign(c=(df.a>2).astype(int) + (df.b>=4).astype(int))
>>> df
'''
a b c
0 1 4 1
1 3 1 1
2 4 5 2
3 1 1 0