JAR's signer information conflicts with another class

I'm trying to load two JARs for my AWS Glue/Spark write method, but I get this error:

An error occurred while calling o142.save.
: java.lang.SecurityException: class "com.microsoft.sqlserver.jdbc.ISQLServerBulkData"'s signer information does not match signer information of other classes in the same package
    at java.lang.ClassLoader.checkCerts(ClassLoader.java:891)
    at java.lang.ClassLoader.preDefineClass(ClassLoader.java:661)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:754)
    at java.security.SecureClas...

My code is below. I tried multiple glue_dynamicFrame write methods, but bulk insert into SQL Server is not working. According to Microsoft, these drivers should do the trick.

Any suggestions on fixing it are very welcome!

def write_df_to_target(self, df, schema_table):
    spark = self.gc.spark_session
    spark.builder.config('spark.jars.packages', 'com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8,com.microsoft.azure:spark-mssql-connector_2.12:1.1.0').getOrCreate()
    credentials = self.get_credentials(self.replica_connection_name)

    df.write \
        .format("com.microsoft.sqlserver.jdbc.spark") \
        .option("url", credentials["url"] + ";databaseName=" + self.database_name) \
        .option("dbtable", schema_table) \
        .option("user", credentials["user"]) \
        .option("password", credentials["password"]) \
        .option("batchsize","50000") \
        .option("numPartitions","150") \
        .option("bulkCopyTableLock","true") \
        .save()

Answer:

Using com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8 is one thing, but you also need a matching version of Microsoft's Spark connector for SQL Server.

com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0-alpha together with com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8 did not work in my case, because I'm using AWS Glue 3.0 (which runs Spark 3.1).

I had to switch to com.microsoft.azure:spark-mssql-connector_2.12:1.2.0, as it is compatible with Spark 3.1.
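To make the version pairing explicit, here is a minimal sketch of a helper that builds the spark.jars.packages value from the Spark version. The helper name packages_for and the lookup table are hypothetical. The 3.1 entry comes from this answer; the 3.0 entry is my understanding of the connector's README and should be checked against it before use.

```python
# Hypothetical helper; the mapping is an assumption, not an official
# compatibility table. Only the "3.1" entry is confirmed in this thread.
CONNECTOR_BY_SPARK = {
    "3.0": "com.microsoft.azure:spark-mssql-connector_2.12:1.1.0",
    "3.1": "com.microsoft.azure:spark-mssql-connector_2.12:1.2.0",
}
JDBC_DRIVER = "com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8"


def packages_for(spark_version):
    """Return a spark.jars.packages value for a version string like '3.1.1'."""
    # Keep only major.minor, since the connector is released per minor line.
    major_minor = ".".join(spark_version.split(".")[:2])
    try:
        connector = CONNECTOR_BY_SPARK[major_minor]
    except KeyError:
        raise ValueError(f"no known connector for Spark {spark_version}") from None
    return f"{JDBC_DRIVER},{connector}"
```

Calling packages_for(spark.version) on a running session would give the string to pass as spark.jars.packages, and it fails loudly for a Spark version that is not in the table instead of silently picking a mismatched connector.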

def write_df_to_target(self, df, schema_table):
    spark = self.gc.spark_session
    spark.builder.config('spark.jars.packages', 'com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8,com.microsoft.azure:spark-mssql-connector_2.12:1.2.0').getOrCreate()
    credentials = self.get_credentials(self.replica_connection_name)

    df.write \
        .format("com.microsoft.sqlserver.jdbc.spark") \
        .option("url", credentials["url"] + ";databaseName=" + self.database_name) \
        .option("dbtable", schema_table) \
        .option("user", credentials["user"]) \
        .option("password", credentials["password"]) \
        .option("batchsize","100000") \
        .option("numPartitions","15") \
        .save()
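
If the SecurityException still shows up, it may help to check whether more than one JAR provides the class named in the error. The JVM raises this exception when classes of the same package are loaded from JARs whose signers differ, which typically means two copies of the driver are on the classpath. The sketch below only scans a directory of JAR files; the function name jars_containing and the directory you pass in are hypothetical, since the thread does not say where Glue keeps its bundled JARs.

```python
import zipfile
from pathlib import Path

# Class file named in the error message, written as a path inside a JAR.
CLASS_ENTRY = "com/microsoft/sqlserver/jdbc/ISQLServerBulkData.class"


def jars_containing(class_entry, jar_dir):
    """Return sorted names of the .jar files in jar_dir that contain class_entry."""
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid archives
    return hits
```

More than one name in the result for the same directory would suggest that two driver JARs are competing, and that one of them should be removed rather than both being loaded.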