How do I add a column with header MergeName to a Pandas DataFrame that has the text from column ShortName, but if ShortName is "None", then the MergeName value should equal the Plaintiffs column value?
This is the Pandas DataFrame data:
Plaintiffs Gender ShortName
0 None None None
1 None None None
2 Donald Duck M None
3 Minnie Mouse F Minnie
4 None None None
5 John Doe M Doe
6 None None None
7 None None None
8 None None None
9 None None None
10 None None None
Thanks!
I've tried many different things and nothing seems to work. Usually only the data from the else condition ends up in the MergeName column, including the "None" values. Code I've tried includes:
PlaintiffsTbl['MergeName'] = np.where(PlaintiffsTbl['ShortName'] is None, PlaintiffsTbl['Plaintiffs'], PlaintiffsTbl['ShortName'])
PlaintiffsTbl['MergeName'] = PlaintiffsTbl['ShortName']
PlaintiffsTbl.loc[PlaintiffsTbl['MergeName'] == None, 'MergeName'] = PlaintiffsTbl['Plaintiffs']
PlaintiffsTbl['MergeName'] = [PlaintiffsTbl['Plaintiffs'] if PlaintiffsTbl['ShortName'] is None else PlaintiffsTbl['ShortName']]
Thank you Amir Hossein Shahdaei! This code does what I was looking for:
PlaintiffsTbl['MergeName'] = PlaintiffsTbl['ShortName']
PlaintiffsTbl['MergeName'] = PlaintiffsTbl['MergeName'].fillna(PlaintiffsTbl['Plaintiffs'])
CodePudding user response:
You can use the .fillna function: create MergeName from the ShortName column, then fill its null values with the Plaintiffs column.
import pandas as pd
df = pd.DataFrame(
data = [
['a', None],
['b', 1],
[None, 2],
[None, None],],
columns = ['Plaintiffs', 'ShortName']
)
df['MergeName'] = df['ShortName']
df['MergeName'] = df['MergeName'].fillna(df['Plaintiffs'])
df
Plaintiffs ShortName MergeName
0 a NaN a
1 b 1.0 1.0
2 None 2.0 2.0
3 None NaN None
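The two-step assignment above can also be written as a single call. A minimal sketch, assuming string-valued columns like those in the question (the sample values here are illustrative): Series.combine_first takes values from ShortName and falls back to Plaintiffs wherever ShortName is null, which should give the same result as fillna.

```python
import pandas as pd

# Illustrative data modelled on the question's columns.
df = pd.DataFrame({
    'Plaintiffs': ['Donald Duck', 'Minnie Mouse', None],
    'ShortName': [None, 'Minnie', None],
})

# Take ShortName where present, otherwise fall back to Plaintiffs.
df['MergeName'] = df['ShortName'].combine_first(df['Plaintiffs'])

# The fillna approach, for comparison.
via_fillna = df['ShortName'].fillna(df['Plaintiffs'])

print(df)
```

Rows where both columns are null stay null in either version.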
CodePudding user response:
Example
import pandas as pd
data = [[None, None, None], [None, None, None],
['Donald Duck', 'M', None], ['Minnie Mouse', 'F', 'Minnie'],
[None, None, None], ['John Doe', 'M', 'Doe']]
df = pd.DataFrame(data, columns=['Plaintiffs', 'Gender', 'ShortName'])
df
Plaintiffs Gender ShortName
0 None None None
1 None None None
2 Donald Duck M None
3 Minnie Mouse F Minnie
4 None None None
5 John Doe M Doe
Code
df['MergeName'] = df['ShortName'].fillna(df['Plaintiffs'])
df
Plaintiffs Gender ShortName MergeName
0 None None None None
1 None None None None
2 Donald Duck M None Donald Duck
3 Minnie Mouse F Minnie Minnie
4 None None None None
5 John Doe M Doe Doe
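The question describes ShortName as "None", which could mean the literal text rather than a missing value. fillna only fills true nulls (None or NaN), so in that case the strings would need converting first. A minimal sketch, assuming the columns hold the text 'None' (an assumption; the data shown above uses real None values):

```python
import pandas as pd

# Hypothetical variant of the data where "None" is stored as text.
df = pd.DataFrame({
    'Plaintiffs': ['Donald Duck', 'Minnie Mouse', 'None'],
    'ShortName': ['None', 'Minnie', 'None'],
})

# Turn the literal text 'None' into real missing values (NaN).
short = df['ShortName'].mask(df['ShortName'] == 'None')
plaintiffs = df['Plaintiffs'].mask(df['Plaintiffs'] == 'None')

# With real nulls in place, fillna behaves as in the example above.
df['MergeName'] = short.fillna(plaintiffs)

print(df)
```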
CodePudding user response:
You can use np.where like this, rather than as in your first attempt:
PlaintiffsTbl['MergeName'] = np.where(PlaintiffsTbl['ShortName'], PlaintiffsTbl['ShortName'], PlaintiffsTbl['Plaintiffs'])
For example, the full code is as follows:
import pandas as pd
import numpy as np
PlaintiffsTbl = pd.DataFrame({
'Plaintiffs': [None, None, 'Donald Duck', 'Minnie Mouse', None, 'John Doe', None, None, None, None, None],
'Gender': [None, None, 'M', 'F', None, 'M', None, None, None, None, None],
'ShortName': [None, None, None, 'Minnie', None, 'Doe', None, None, None, None, None],
})
print(PlaintiffsTbl)
"""
Plaintiffs Gender ShortName
0 None None None
1 None None None
2 Donald Duck M None
3 Minnie Mouse F Minnie
4 None None None
5 John Doe M Doe
6 None None None
7 None None None
8 None None None
9 None None None
10 None None None
"""
PlaintiffsTbl['MergeName'] = np.where(PlaintiffsTbl['ShortName'], PlaintiffsTbl['ShortName'], PlaintiffsTbl['Plaintiffs'])
print(PlaintiffsTbl)
"""
Plaintiffs Gender ShortName MergeName
0 None None None None
1 None None None None
2 Donald Duck M None Donald Duck
3 Minnie Mouse F Minnie Minnie
4 None None None None
5 John Doe M Doe Doe
6 None None None None
7 None None None None
8 None None None None
9 None None None None
10 None None None None
"""
For more information about np.where, see https://numpy.org/doc/stable/reference/generated/numpy.where.html
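As for why the first attempt in the question returned only the else values: PlaintiffsTbl['ShortName'] is None is an identity check on the Series object itself, so it evaluates to a single False and np.where picks the third argument for every row. A minimal sketch of an element-wise null check with .isna() instead (the small table and the names tbl, scalar_check and mask are illustrative); unlike the truthiness test above, it does not treat empty strings as missing:

```python
import numpy as np
import pandas as pd

# Illustrative data modelled on the question's columns.
tbl = pd.DataFrame({
    'Plaintiffs': ['Donald Duck', 'Minnie Mouse', None],
    'ShortName': [None, 'Minnie', None],
})

# `is None` compares the whole Series object to None: one plain bool.
scalar_check = tbl['ShortName'] is None

# .isna() checks each element and returns a boolean Series.
mask = tbl['ShortName'].isna()

# Use Plaintiffs where ShortName is null, otherwise ShortName.
tbl['MergeName'] = np.where(mask, tbl['Plaintiffs'], tbl['ShortName'])

print(scalar_check)
print(tbl)
```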
CodePudding user response:
import pandas as pd
import numpy as np
di = {
'Plaintiffs': ['Donald Duck', 'Minnie Mouse', None],
'ShortName': [None, 'Minnie', None]
}
d = pd.DataFrame(di)
d
yields
Plaintiffs ShortName
0 Donald Duck None
1 Minnie Mouse Minnie
2 None None
If there is just a single branch:
cond = (d['ShortName'].isna()) & (d['Plaintiffs'].notna())
d['MergeName'] = np.where(cond, d['Plaintiffs'], d['ShortName'])
d
or this (suitable for more conditions and choices):
conditions = [
(d['ShortName'].isna()) & (d['Plaintiffs'].notna())
]
choices = [d['Plaintiffs']]
d['MergeName'] = np.select(conditions, choices, default=d['ShortName'])
d
which yields the same result:
Plaintiffs ShortName MergeName
0 Donald Duck None Donald Duck
1 Minnie Mouse Minnie Minnie
2 None None None
If there is more than one choice, just add it to the lists:
conditions = [
(d['ShortName'].isna()) & (d['Plaintiffs'].notna()),
d['ShortName'].notna()
]
choices = [d['Plaintiffs'], d['ShortName']]
d['MergeName'] = np.select(conditions, choices)
d
yields (note the 0 in the last row):
Plaintiffs ShortName MergeName
0 Donald Duck None Donald Duck
1 Minnie Mouse Minnie Minnie
2 None None 0
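The 0 in the last row comes from np.select's default argument, which is 0 unless specified and is used for rows where no condition matches. A minimal sketch, assuming you would rather keep a missing value there, passes default=None:

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({
    'Plaintiffs': ['Donald Duck', 'Minnie Mouse', None],
    'ShortName': [None, 'Minnie', None],
})

conditions = [
    d['ShortName'].isna() & d['Plaintiffs'].notna(),
    d['ShortName'].notna(),
]
choices = [d['Plaintiffs'], d['ShortName']]

# Rows matching no condition get None instead of the default 0.
d['MergeName'] = np.select(conditions, choices, default=None)

print(d)
```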