Can't create a column in dataframe based on other columns. Tried several options - none worked.

Time:11-03

Thanks for helping me out. I can't create a new column in a dataframe.

So far I have tried lambdas, the isin method, and the contains method.

I have a dataframe with these values (the first two columns have dtype = object; Column c is what I want to get):

Country code| Countries                  || Column c |
KR          | KR~CN_SG~PH                || Valid    |
RO          | CN~PK                      || Invalid  |
NL          | CZ_BE~NL_IT~DE             || Valid    |
SG          | HK~SK_DZ_AL_CN_GR_RU~SA~SG || Valid    |
US          | ZA~SE~ES~CH_UA             || Invalid  |

Valid - When Country Code is in Countries

Invalid - When it isn't

This is my first time writing code at my first Python job, sorry if this is a stupid question :D

CodePudding user response:

Use list comprehension with numpy.where:

import numpy as np

# m[i] is True when the row's Country code occurs inside its Countries string
m = [x in y for x, y in zip(df['Country code'], df['Countries'])]
df['Column c'] = np.where(m, 'Valid','Invalid')
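
For anyone who wants to run this end to end, here is a minimal sketch. The frame is rebuilt by hand from the sample rows in the question, so the construction step is an assumption about how the real data is loaded; the names `df` and `mask` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (assumed to match the real frame's layout)
df = pd.DataFrame({
    'Country code': ['KR', 'RO', 'NL', 'SG', 'US'],
    'Countries': ['KR~CN_SG~PH', 'CN~PK', 'CZ_BE~NL_IT~DE',
                  'HK~SK_DZ_AL_CN_GR_RU~SA~SG', 'ZA~SE~ES~CH_UA'],
})

# Row-wise substring test: pair each code with the Countries string of the same row
mask = [code in countries
        for code, countries in zip(df['Country code'], df['Countries'])]

# Map the booleans to the two labels
df['Column c'] = np.where(mask, 'Valid', 'Invalid')
print(df)
```

The comparison has to be done row by row because each code is checked against a different string, which is why a column-wide isin or str.contains call with a single pattern does not fit here.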

CodePudding user response:

You can use a single list comprehension:

# `in` checks whether the code occurs as a substring of the Countries string
df['Column c'] = ['Valid' if code in countries else 'Invalid'
                  for code, countries in zip(df['Country code'], df['Countries'])]

output:

  Country code                   Countries Column c
0           KR                 KR~CN_SG~PH    Valid
1           RO                       CN~PK  Invalid
2           NL              CZ_BE~NL_IT~DE    Valid
3           SG  HK~SK_DZ_AL_CN_GR_RU~SA~SG    Valid
4           US              ZA~SE~ES~CH_UA  Invalid
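
One caveat: `in` on two strings is a substring test, not a match on whole entries. For the sample data the two give the same result, but if the codes could vary in length, splitting on the separators first may be safer. A rough sketch, assuming `~` and `_` are the only separators (inferred from the sample, not confirmed by the question); `is_listed` is a hypothetical helper name:

```python
import re

def is_listed(code, countries):
    """Hypothetical helper: True if `code` equals one whole entry of `countries`."""
    # Assumption: entries are separated only by '~' or '_'
    return code in re.split(r'[~_]', countries)

# Sample rows copied from the question as (Country code, Countries) pairs
rows = [
    ('KR', 'KR~CN_SG~PH'),
    ('RO', 'CN~PK'),
    ('NL', 'CZ_BE~NL_IT~DE'),
    ('SG', 'HK~SK_DZ_AL_CN_GR_RU~SA~SG'),
    ('US', 'ZA~SE~ES~CH_UA'),
]

labels = ['Valid' if is_listed(code, countries) else 'Invalid'
          for code, countries in rows]
print(labels)
```

With a dataframe, the same helper can be used inside the list comprehension by zipping df['Country code'] with df['Countries'] instead of iterating over `rows`.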