Two conditions in "if" part of if/else statement using Pyspark

Time:03-16

I need to interrupt the program and raise the exception below if two conditions are both met; otherwise the program should continue. This works fine with only the first condition, but yields an error when both are used. The code below should raise the exception if the DF's row count is non-zero and the value of DF.col1 is not 'string'. Any tips to get this working?

if (DF.count() > 0) & (DF.col1 != 'string'): 
  raise Exception("!!!COUNT IS NON-ZERO, SO ADJUSTMENT IS NEEDED!!!")
else: 
  pass 

This throws the error:

" Py4JError: An error occurred while calling o678.and. Trace: 
py4j.Py4JException: Method and([class java.lang.Integer]) does not exist "

Some sample data:

from pyspark.sql.types import StructType,StructField, StringType, IntegerType

data2 = [("not_string","test")]

schema = StructType([ \
    StructField("col1",StringType(),True), \
    StructField("col2",StringType(),True) \
  ])
 
DF = spark.createDataFrame(data=data2,schema=schema)
DF.printSchema()
DF.show(truncate=False)

CodePudding user response:

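In Python, & is a bitwise operator that works bit by bit, while and is the logical operator for combining conditions in an if statement. Note, however, that DF.col1 != 'string' is a PySpark Column expression rather than a Python bool, so it has to be reduced to a single value (for example with a filter and a count) before and can evaluate it:

if (DF.count() > 0) and (DF.filter(DF.col1 != 'string').count() > 0):
  raise Exception("!!!COUNT IS NON-ZERO, SO ADJUSTMENT IS NEEDED!!!")
else:
  pass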
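
To illustrate why the type of each operand matters here, the following is a minimal sketch in plain Python. FakeColumn is a hypothetical stand-in, not the real PySpark class; it only mimics two behaviours, in simplified form: & rejects an operand that is not a column (the real call fails inside the JVM with a Py4JError like the one in the question), and a column refuses to be converted to a Python bool.

```python
class FakeColumn:
    """Hypothetical stand-in for a PySpark Column; not the real class."""

    def __init__(self, expr):
        self.expr = expr

    def __and__(self, other):
        # column & other: only another column-like object is accepted.
        if not isinstance(other, FakeColumn):
            raise TypeError(f"and({type(other).__name__}) does not exist")
        return FakeColumn(f"({self.expr} AND {other.expr})")

    def __rand__(self, other):
        # other & column: reached when `other` (e.g. a bool) cannot handle
        # the operation itself. Simplified: always rejected here.
        raise TypeError(f"and({type(other).__name__}) does not exist")

    def __bool__(self):
        # A column describes a per-row expression, so it has no single truth value.
        raise ValueError("Cannot convert column into bool")


row_count_positive = True                    # stands in for DF.count() > 0
condition = FakeColumn("col1 != 'string'")   # stands in for DF.col1 != 'string'

errors = {}

# bool & column: Python falls back to FakeColumn.__rand__, which rejects the bool.
try:
    row_count_positive & condition
except TypeError as err:
    errors["&"] = str(err)

# bool and column: `and` yields the column, then `if` calls bool() on it.
try:
    if row_count_positive and condition:
        pass
except ValueError as err:
    errors["and"] = str(err)

print(errors)

# Two plain Python bools combine without trouble.
mismatch_count_positive = True               # stands in for a filtered count > 0
should_raise = row_count_positive and mismatch_count_positive
print(should_raise)
```

In this sketch, both operators fail as long as one side is still a column-like object; the combination only works once each side has been reduced to a plain bool.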

CodePudding user response:

If I understand correctly, you want to raise an exception if the dataframe contains any rows where the value of col1 is not equal to 'string'.

You can do this with a filter and a count. If any rows have a col1 value other than 'string', the count is greater than 0, which is truthy in Python, so the exception is raised.

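from pyspark.sql.types import StructType, StructField, StringType

data2 = [("not_string", "test")]

schema = StructType([
    StructField("col1", StringType(), True),
    StructField("col2", StringType(), True),
])

# Assumes an existing SparkSession named spark, as in the question
DF = spark.createDataFrame(data=data2, schema=schema)

# A non-zero count is truthy, so the exception fires when any row has col1 != 'string'
if DF.filter(DF.col1 != 'string').count():
    raise Exception("!!!COUNT IS NON-ZERO, SO ADJUSTMENT IS NEEDED!!!")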

Exception: !!!COUNT IS NON-ZERO, SO ADJUSTMENT IS NEEDED!!!