I want to how we can assign one column value to another column if it has null or 0 value I have a dataframe like this:
id column1 column2
5263 5400 5400
4354 6567 Null
5656 5456 5456
5565 6768 3489
4500 3490 Null
The Expected Output is
id column1 column2
5263 5400 5400
4354 6567 6567
5656 5456 5456
5565 6768 3489
4500 3490 3490
that is, if df['column2'] = Null/0 then it has take df['column1'] value.
Can someone explain, how can I achieve my desired output?
CodePudding user response:
Here's my suggestion. Not sure whether it is the fastest, but it should work here ;)
#we start by creating an empty list
column2 = []
#for each row in the dataframe
for i in df.index:
# if the value col2 is null or 0, then it takes the value of col1
if df.loc[i, 'column2'] in ['null', 0]:
column2.append(df.loc[i, 'column1'])
#else it takes the value of column 2
else:
column2.append(df.loc[i, 'column2'])
#we replace the current column 2 by the new one !
df['column2'] = column2```
CodePudding user response:
Update using only Native Pandas Functionality
#Creates boolean array conditionCheck, checking conditions for each row in df
#Where() will only update when conditionCheck == False, so inverted boolean values using "~"
conditionCheck = ~((df['column2'].isna()) | (df['column2']==0))
df["column2"].where(conditionCheck,df["column1"],inplace=True)
print(df)
Code to Generate Sample DataFrame
Changed row 3 of column2 to 0 to test all scenarios
import numpy as np
import pandas as pd
data = [
[5263,5400,5400]
,[4354,6567,None]
,[5656,5456,0]
,[5565,6768,3489]
,[4500,3490,None]
]
df = pd.DataFrame(data,columns=["id","column1","column2"],dtype=pd.Int64Dtype())
CodePudding user response:
Similar question was already solved here.
"Null" keyword does not exist in python. Empty cells in pandas have np.nan type. So assuming you mean np.nans, one good way to achieve your desired output would be:
Create a boolean mask to select rows with np.nan or 0 value and then copy when mask is True.
mask = (df['column2'].isna()) | (df['column2']==0)
df.loc[mask, "column2"] = df.loc[mask, "column1"]
CodePudding user response:
Based on the answers to this similar question, you can do the following:
Using
np.where
:df['column2'] = np.where((df['column2'] == 'Null') | (df['column2'] == 0), df['column1'], df['column2'])
Instead, using only
pandas
and Python:df['column2'][(df['column2'] == 0) | (df['column2'] == 'Null')] = df['column1']