Do we have a Firebase I/O connector for Apache Beam?

Time:10-15

I tried looking for a Firebase I/O connector for Apache Beam but wasn't able to find one. Can someone point me to one, or, if someone already has a Firebase I/O connector that can read and write my files, share it with me?

Thanks in advance.

CodePudding user response:

Go to the official Apache Beam website.

Find Documentation in the header.

Inside, you'll see I/O Connectors; click on it.

Find Firestore IO in the list, then read its Javadoc.

Or you may need Datastore IO instead (for older versions, I guess); here's the Javadoc.

Please note that you need to pick the connector that matches the database you're using.

CodePudding user response:

The following post on the Google Cloud blog shows a read and write example with the Beam Java SDK:

https://cloud.google.com/blog/topics/developers-practitioners/using-firestore-and-apache-beam-data-processing

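Pipeline pipeline = Pipeline.create(options);

String collectionGroupId = "collection-group-name";
// Pass the maximum worker count as a hint to the connector's rate limiting.
RpcQosOptions rpcQosOptions = RpcQosOptions.newBuilder()
    .withHintMaxNumWorkers(
        options.as(DataflowPipelineOptions.class).getMaxNumWorkers())
    .build();

pipeline
    .apply(Create.of(collectionGroupId))
    // CreatePartitionQueryRequest is a custom transform from the blog post, not part of Beam.
    .apply(new CreatePartitionQueryRequest(rpcQosOptions.getHintMaxNumWorkers()))
    // Split the collection group into partitions, fetching document names only.
    .apply(FirestoreIO.v1().read().partitionQuery().withNameOnlyQuery().build())
    // Run the query for each partition.
    .apply(FirestoreIO.v1().read().runQuery().build())
    // Keep only the full document name from each response.
    .apply(MapElements.into(TypeDescriptors.strings()).via(
        (runQueryResponse) -> runQueryResponse.getDocument().getName()))
    // CreateDeleteOperation is a custom DoFn from the blog post as well.
    .apply(ParDo.of(new CreateDeleteOperation()))
    .apply("shuffle writes", Reshuffle.viaRandomKey())
    // Send the writes to Firestore in batches.
    .apply(
        FirestoreIO.v1().write()
            .batchWrite()
            .withRpcQosOptions(rpcQosOptions)
            .build());

pipeline.run().waitUntilFinish();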

The link to the Javadoc:

https://beam.apache.org/releases/javadoc/2.41.0/org/apache/beam/sdk/io/gcp/firestore/FirestoreIO.html

You can also check this link, which shows an example of writing with FirestoreIO:

Add document to Firestore from Beam with auto generated ID
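
The linked question is not reproduced here, so the following is only a minimal sketch of one way to get an "auto-generated" ID on the client side. It assumes that Firestore document names follow the projects/{project}/databases/{database}/documents/{path} resource-name format and that a random UUID is an acceptable stand-in for a server-generated ID. The class FirestoreDocNames and its methods are hypothetical helpers, not part of Beam or the Firestore client library.

```java
import java.util.UUID;

// Hypothetical helper for illustration only; not part of Beam or Firestore.
final class FirestoreDocNames {
    private FirestoreDocNames() {}

    /** Builds a full document resource name from its parts. */
    static String documentName(String projectId, String databaseId, String documentPath) {
        return "projects/" + projectId
            + "/databases/" + databaseId
            + "/documents/" + documentPath;
    }

    /**
     * Builds a document name inside the given collection, using a random
     * UUID as a client-generated document ID (an assumption of this sketch).
     */
    static String newDocumentName(String projectId, String databaseId, String collectionPath) {
        String id = UUID.randomUUID().toString();
        return documentName(projectId, databaseId, collectionPath + "/" + id);
    }
}
```

In a pipeline, a name built this way could be set on the Document wrapped in each Write before it reaches batchWrite(); whether that matches the approach in the linked answer is not confirmed here.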

For Python, I don't think Beam has an open-source Firestore I/O connector at the moment, but you can use the Firestore client inside a DoFn applied with ParDo. Here is a link showing an example:

Using FireStore in Google Dataflow
