Pyarrow Join (int8 and int16)

Time:09-19

I have two PyArrow Tables and want to join them.

A.join(right_table=B, keys="A_id", right_keys="B_id")

Now I get the following error:

{ArrowInvalid} Incompatible data types for corresponding join field keys: FieldRef.Name(A_id) of type int8 and FieldRef.Name(B_id) of type int16

What is the preferred way to solve this issue?

I did not find a way to cast a single column of a PyArrow Table to either int8 or int16.

Thanks

CodePudding user response:

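You need to change the type of the key field in one of your tables so that both join keys have the same type. Widening int8 to int16 is the safe direction, since every int8 value fits in an int16.

Here is how to change the type of the 'A_id' field in table A: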

import pyarrow as pa

# Change the type of 'A_id' from int8 to int16
schema = A.schema
idx = schema.get_field_index('A_id')  # position of the key column
new_field = schema.field(idx).with_type(pa.int16())  # copy of the field with the new type
schema = schema.set(idx, new_field)  # replace the old field at the same position

A = A.cast(target_schema=schema)  # apply the new schema to table A

# Join the tables; both keys are now int16
A.join(right_table=B, keys="A_id", right_keys="B_id")