Home > Net >  How to replace values in column A with column B with conditions
How to replace values in column A with column B with conditions

Time:09-24

My dataframe lists some of the name I interested and may make some changes, what I want is keep the original name if option is blank, delete the name if it says delete and replace the name if it has something.

name option
A
B delete
C
D DDDD
E delete
F FF

The final dataframe I want is to change the names in a much more big dataframe and with other infos. may look like this

name info1 info2
A xxxx sth
A xxxx sth
C x blala
DDDD xx lalalaa
DDDD xx laa
DDDD xx laaaa
FF xxx aaaa

CodePudding user response:

For good performance first remove unnecessary rows and then set values by condition:

df1 = df[~df['option'].str.contains('delete')]
df1.loc[df['option'].ne('') | df['option'].notnull(), 'name'] = ''

Or:

df1['name'] = np.where(df['option'].ne('') | df['option'].notnull(), '', df1['name'])

CodePudding user response:

You can use case condition to replace based on option value.

val df = Seq(("A", ""), ("B", "delete"), ("C", ""), ("D", "DDDD"), ("E", "delete"), ("F", "FF")).toDF("name", "option")
df.createOrReplaceTempView("temp")

val query = """
select case when option = 'delete' then ''
            when option == '' then name
            else option
       end as name,
       option
from temp
"""
spark.sql(query).show()


 ---- ------ 
|name|option|
 ---- ------ 
|   A|      |
|    |delete|
|   C|      |
|DDDD|  DDDD|
|    |delete|
|  FF|    FF|
 ---- ------ 
  • Related