Transactions and batched writes can be used to write multiple documents as a single atomic operation.
The documentation says that "using the Cloud Firestore client libraries, you can group multiple operations into a single transaction."
I don't understand what "client libraries" means here, and whether it is correct to use transactions and batched writes within a Cloud Function.
Example: suppose the database already contains 3 documents (with doc IDs A, B, C). Now I need to insert 3 more (with doc IDs C, D, E). The Cloud Function should add only the ones that don't exist yet and send a push notification to the user telling them that 2 new documents are available.
Since a doc ID may already exist, and since I need to count how many documents are actually new (the ones that get inserted), I need a way to read each doc ID first and check for its existence. Hence I'm wondering whether transactions fit Cloud Functions or not.
Also, each transaction or batch of writes can write to a maximum of 500 documents. Is there any other way to overcome this limit within a Cloud Function?
CodePudding user response:
Firestore transaction behaviour differs between the client SDKs (JS SDK, iOS SDK, Android SDK, ...) and the Admin SDK (a set of server libraries), which is the SDK used in a Cloud Function. More explanation on the differences can be found in the documentation.
Because of the type of data contention used in the Admin SDK, you can use the getAll() method to retrieve multiple documents from Firestore and hold a pessimistic lock on all returned documents.
So getAll() is exactly the method you need to call in your transaction: you fetch documents C, D and E, detect that only C already exists, and therefore know that you only need to add D and E.
Concretely, it could be something along the following lines:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

const db = admin.firestore();

exports.lorenzoFunction = functions
    .region('europe-west1')
    .firestore
    .document('tempo/{docId}') // Just a way to trigger the test Cloud Function!!
    .onCreate(async (snap, context) => {
        const c = db.doc('coltest/C');
        const d = db.doc('coltest/D');
        const e = db.doc('coltest/E');
        const docRefsArray = [c, d, e];

        return db.runTransaction(transaction => {
            // getAll() fetches all three refs and holds a pessimistic lock on the returned documents
            return transaction.getAll(...docRefsArray).then(snapsArray => {
                let counter = 0;
                snapsArray.forEach(snap => {
                    if (!snap.exists) {
                        counter++;
                        transaction.set(snap.ref, { foo: "bar" });
                    } else {
                        console.log(snap.id + " exists");
                    }
                });
                console.log(counter);
                return;
            });
        });
    });
To test it: create one of the C, D or E docs in the coltest collection, then create a doc in the tempo collection (just a simple way to trigger this test Cloud Function). The CF is triggered; look at the coltest collection and you'll see that the two missing docs were created, and the CF log shows counter = 2.
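If you also need the push notification mentioned in the question, one possible approach (just a sketch, not part of the code above) is to have the transaction return the counter and then call the Admin SDK's admin.messaging().send() once the transaction resolves. The fragment below would replace the runTransaction call inside onCreate above; userDeviceToken is a placeholder for an FCM registration token that you would need to fetch yourself (for example from a user profile document):
// Sketch: count the new docs inside the transaction, then notify the user.
// userDeviceToken is a placeholder you must obtain yourself.
const newDocsCount = await db.runTransaction(async (transaction) => {
    const snapsArray = await transaction.getAll(...docRefsArray);
    let counter = 0;
    snapsArray.forEach(snap => {
        if (!snap.exists) {
            counter++;
            transaction.set(snap.ref, { foo: "bar" });
        }
    });
    return counter;
});

if (newDocsCount > 0) {
    await admin.messaging().send({
        token: userDeviceToken,
        notification: {
            title: 'New documents',
            body: `${newDocsCount} new documents are available`
        }
    });
}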
Also, each transaction or batch of writes can write to a maximum of 500 documents. Is there any other way to overcome this limit within a Cloud Function?
AFAIK the answer is no.
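The 500-document limit applies to each individual transaction or batch, so it cannot be raised. What a Cloud Function can do is split a larger workload across several batches and commit them one after another, which is essentially what the next answer does for a CSV import. A rough, generic sketch of that chunking idea, assuming docs is an array of plain objects to be written to a hypothetical items collection:
// Sketch: commit an arbitrary number of writes in chunks of 500
async function writeInChunks(db, docs) {
    const CHUNK_SIZE = 500; // hard per-batch limit
    for (let i = 0; i < docs.length; i += CHUNK_SIZE) {
        const batch = db.batch();
        docs.slice(i, i + CHUNK_SIZE).forEach(data => {
            batch.set(db.collection('items').doc(), data); // 'items' is a placeholder collection
        });
        await batch.commit();
    }
}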
CodePudding user response:
There also used to be a one-second delay required between chunks of 500 records. I wrote this a couple of years ago. The script below reads the CSV file line by line, adding a set operation to the current batch for each line. A counter starts a new batch every 500 objects, and async/await is used to rate-limit the commits to 1 per second. Last, we report the write progress with console logging. I published an article on this here >> https://hightekk.com/articles/firebase-admin-sdk-bulk-import
NOTE: In my case I am reading a huge flat text file (a manufacturer's part number catalog) for import. You can use this as a working template, though, and modify it to suit your data source. Also, you may need to increase the memory allocated to Node for this to run:
node --max_old_space_size=8000 app.js
The script looks like:
var admin = require("firebase-admin");
var serviceAccount = require("./your-firebase-project-service-account-key.json");
var fs = require('fs');
var csvFile = "./my-huge-file.csv"
var parse = require('csv-parse');
require('should');
admin.initializeApp({
credential: admin.credential.cert(serviceAccount),
databaseURL: "https://your-project.firebaseio.com"
});
var firestore = admin.firestore();
var thisRef;
var obj = {};
var counter = 0;
var commitCounter = 0;
var batches = [];
batches[commitCounter] = firestore.batch();
fs.createReadStream(csvFile).pipe(
    parse({delimiter: '|', relax_column_count: true, quote: ''})
).on('data', function(csvrow) {
    // Start a new batch before the current one reaches the 500-write limit
    if(counter > 498){
        counter = 0;
        commitCounter = commitCounter + 1;
        batches[commitCounter] = firestore.batch();
    }
    // Reset the row object so fields don't carry over between rows
    obj = {};
    if(csvrow[1]){
        obj.family = csvrow[1];
    }
    if(csvrow[2]){
        obj.series = csvrow[2];
    }
    if(csvrow[3]){
        obj.sku = csvrow[3];
    }
    if(csvrow[4]){
        obj.description = csvrow[4];
    }
    if(csvrow[6]){
        obj.price = csvrow[6];
    }
    thisRef = firestore.collection("your-collection-name").doc();
    batches[commitCounter].set(thisRef, obj);
    counter = counter + 1;
}).on('end',function() {
    writeToDb(batches);
});
function oneSecond() {
return new Promise(resolve => {
setTimeout(() => {
resolve('resolved');
}, 1010);
});
}
async function writeToDb(arr) {
    console.log("beginning write");
    for (var i = 0; i < arr.length; i++) {
        await oneSecond();
        // Wait for each commit to finish so "done." is only logged at the real end
        await arr[i].commit();
        console.log("wrote batch " + i);
    }
    console.log("done.");
}