How do I create a new column in a df that is either 0 or 1 based off of other column values?

Time:03-10

I would like to make a new column in my df that is based on values in other columns. I have read endless tutorials, but nothing has worked for me yet. I would like a new column "treatment" that is assigned a value of 0 or 1 depending on whether the value in the column "week" is between the values in the columns week_begin and week_end.

This is what I did:

def conditions(row):
    if row['week'] >= 'week_begin" & row['week'] <= 'week_end':
        return 1
    else:
        return 0

union_accident['treatment'] = union_accident.apply(conditions, axis=1)

union_accident.head()

This returns the error:

 '>=' not supported between instances of 'int' and 'list'

CodePudding user response:

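Your immediate error comes from the quoting: 'week_begin" and 'week_end' are string literals rather than lookups such as row['week_begin'], and there is a stray double quote at the end of week_begin on line 2 of your function. Inside a row-wise function you also want `and` rather than `&`, because `&` binds more tightly than the comparisons.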
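A minimal sketch of the row-wise function with the quoting fixed, so that week is compared against the row's own week_begin and week_end values. The sample frame is made up purely for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
union_accident = pd.DataFrame({
    'week':       [5, 12, 20],
    'week_begin': [1, 15, 20],
    'week_end':   [10, 30, 25],
})

def conditions(row):
    # Look the bounds up in the row instead of comparing against string
    # literals, and combine the two plain booleans with `and`
    if row['week'] >= row['week_begin'] and row['week'] <= row['week_end']:
        return 1
    else:
        return 0

union_accident['treatment'] = union_accident.apply(conditions, axis=1)
print(union_accident['treatment'].tolist())
```

This keeps the original .apply structure, which works but is slower than a vectorized comparison on large frames.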

But, fixing that, you can easily do this by directly comparing the columns to each other, with no need for .apply.

The sneaky .astype(int) changes the result from a boolean type (True/False) to a numeric type with values 1/0.

# Each comparison needs its own parentheses: & binds more tightly than <=
union_accident['treatment'] = (
    (union_accident['week_begin'] <= union_accident['week'])
    & (union_accident['week'] <= union_accident['week_end'])
).astype(int)

But that's a lot of repetition of union_accident right there. You can also use the .eval method to do this much less verbosely:

union_accident['treatment'] = union_accident.eval(
    '(week_begin <= week) & (week <= week_end)'
).astype(int)
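As a quick sanity check, the sketch below runs the direct comparison and the .eval version on a small made-up frame and confirms they agree. It also shows Series.between, an alternative not mentioned in this answer, which expresses the same inclusive range test. The data values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
union_accident = pd.DataFrame({
    'week':       [5, 12, 20],
    'week_begin': [1, 15, 20],
    'week_end':   [10, 30, 25],
})

# Direct column comparison; each comparison is parenthesized
# because & binds more tightly than <=
direct = (
    (union_accident['week_begin'] <= union_accident['week'])
    & (union_accident['week'] <= union_accident['week_end'])
).astype(int)

# The same test through .eval
evaluated = union_accident.eval(
    '(week_begin <= week) & (week <= week_end)'
).astype(int)

# Series.between includes both endpoints by default
between = union_accident['week'].between(
    union_accident['week_begin'], union_accident['week_end']
).astype(int)

print(direct.tolist(), evaluated.tolist(), between.tolist())
```

All three produce the same 0/1 column, so the choice between them is mostly a matter of readability.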

CodePudding user response:

You can use boolean indexing with .loc like this. Just replace the conditions on column B with your own conditions; I have written it this way to keep the example general.

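import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])

# Non-overlapping conditions, so each row is assigned exactly once
df.loc[df["B"] > 3, "NewColumn"] = 0
df.loc[df["B"] <= 3, "NewColumn"] = 1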




    A   B   NewColumn
0   0   1   1.0
1   2   3   1.0
2   4   5   0.0
3   6   7   0.0
4   8   9   0.0
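The values show as 1.0 and 0.0 because the first .loc assignment creates the column with missing values in the unmatched rows, which forces a float dtype. If integer 0/1 values are wanted, one option is numpy.where, which fills every row in a single step. This is an alternative to the .loc approach rather than part of the answer above, sketched on the same example frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])

# 1 where B <= 3, otherwise 0; every row is set at once,
# so no missing values appear and the dtype stays integer
df['NewColumn'] = np.where(df['B'] <= 3, 1, 0)
print(df['NewColumn'].tolist())
```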

For more information, see https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
