I have two PyArrow Tables and want to join them.
A.join(
    right_table=B, keys="A_id", right_keys="B_id"
)
This raises the following error:
{ArrowInvalid} Incompatible data types for corresponding join field keys: FieldRef.Name(A_id) of type int8 and FieldRef.Name(B_id) of type int16
What is the preferred way to solve this issue?
I did not find a way to cast a single column of a PyArrow Table to either int8 or int16.
Thanks
CodePudding user response:
You need to change the field type in one of your tables so that both join keys have the same type.
Here is how to change the type of the 'A_id' field in table A:
import pyarrow as pa

# change the type of 'A_id' from int8 to int16 so it matches 'B_id'
schema = A.schema
for num, field in enumerate(schema):
    if field.name == 'A_id':
        new_field = field.with_type(pa.int16())  # return a copy of the field with the new type
        schema = schema.remove(num)  # remove the old field
        schema = schema.insert(num, new_field)  # insert the new field at the same position
        break  # field found, no need to keep scanning
A = A.cast(target_schema=schema)  # cast Table A to the updated schema
# join tables
A.join(
    right_table=B, keys="A_id", right_keys="B_id"
)