How to alter certain strings in a pandas column if value in another column is x

Time:03-03

I have a simple df.

    Genotype    freq
0   HET         0/1
1   REF         0/1
2   HOM         0/1
3   HOM         1/1

I would like to change 'HOM' to 'REF' where 'freq' is '0/1' or '1/0', without altering any 'HET' rows. I have tried to do this based on other Stack Overflow answers, with little success. My attempts are pasted below.

import pandas as pd

# Build the example frame from a plain dict
data = {'Genotype': ['HET', 'REF', 'HOM', 'HOM'],
        'freq': ['0/1', '0/1', '0/1', '1/1']}

df = pd.DataFrame(data)

catch=['0/1', '1/0']
#attempt 1 - error: For argument "inplace" expected type bool, received  type int.
df.where(df['Genotype'] != 'HET', df.loc[df.freq.isin(catch), 'Genotype'] == 'REF', 0)
#attempt 2 - Ignores HET but adds TRUE/FALSE to other rows - looks messy.
df['Genotype']=df['Genotype'].apply(lambda x: 'HET' if x =='HET' else df.loc[df.freq.isin(catch), 'Genotype'] == 'REF')
#attempt 3 - Converts all '0/1' to REF
for index, row in df.iterrows():
    if row['Genotype'] == 'HOM':
        df.loc[df.freq.isin(catch), 'Genotype'] = 'REF'

If possible, is there a simple way to do this in Python/pandas without creating a new object? The indexes are important inside the larger function I have. Cheers.

CodePudding user response:

You need to chain both conditions with & (element-wise AND), so that only rows satisfying both are assigned:

catch=['0/1', '1/0']
df.loc[df.freq.isin(catch) & df['Genotype'].ne('HET'), 'Genotype'] = 'REF'
print(df)
  Genotype freq
0      HET  0/1
1      REF  0/1
2      REF  0/1
3      HOM  1/1
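
Note that the mask above matches every non-'HET' row, so rows that are already 'REF' are simply reassigned to 'REF'. If the frame could hold other genotype labels that must stay untouched, a stricter variant is to test for 'HOM' directly. The following is a minimal sketch under that assumption; the names mask and index_before are only for illustration and are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'Genotype': ['HET', 'REF', 'HOM', 'HOM'],
    'freq': ['0/1', '0/1', '0/1', '1/1'],
})
index_before = df.index.copy()  # kept only to show the index is unchanged

catch = ['0/1', '1/0']

# Boolean mask: True only for rows that are 'HOM' AND whose freq is in catch
mask = df['Genotype'].eq('HOM') & df['freq'].isin(catch)

# Assign in place on the existing frame; rows outside the mask are untouched
df.loc[mask, 'Genotype'] = 'REF'

print(df)
```

Because the assignment goes through df.loc on the existing frame, no new object is created and the original index is kept.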