Home > Back-end >  How to join 2 tables using the lookupRecord processor in Apache Nifi?
How to join 2 tables using the lookupRecord processor in Apache Nifi?

Time:10-14

i have 2 tables from different databases i want to join.

Table 1: "person" in MySQL (schema displayed in JSON)

[
    "person_id": 1,
    "person_name": "John"
]

Table 2: "jobs" in MongoDB

[
    "job_id": 21,
    "person_id": 1,
    "job": "Teacher"
]

So as you can assume, i want to output the data this way below (combining them via the "person_id"-key:

[
    "person_id": 1,
    "person_name": "John",
    "job": "Teacher"
]

And this should be written in a new table "personWithJob" in MySQL. I found out that the "LookupRecord"-processor could do this, but i dont know how to configure it correctly.

I also found many solutions to this processor for csv-Files, but not for Database / Json format.

Thanks for helping me, Lukas :)

CodePudding user response:

So first you need to get the Person from MySQL, using E.g. ExecuteSQLRecord

You now have a FlowFile with one or many Records, each representing a Person.

Now you want to enrich each Record with the results from MongoDB, using LookupRecord configured with a MongoDBLookupService.

You will configure LookupRecord to pass the person_id through to the MongoDBLookupService. This is done by adding a new Dynamic Property to LookupRecord with the value being a valid RecordPath inside your Person Record e.g. /person_id.

Then configure MongoDBLookupServce to return the field you are interested in, e.g. setting Lookup Value Field to job.

You can then set Result RecordPath on the LookupRecord to /job to insert the returned value into the original Person Record into a field called job.

You can now use PutDatabaseRecord to insert the enriched Person record into the result table.

There's a few more options you can play with on LookupRecord, but that should be enough for this simple use case.

If you want more detail on using LookupRecord, I wrote a post about it here that gives an example.

CodePudding user response:

aggregate

db.persons.aggregate([
  {
    "$lookup": {
      "from": "jobs",
      "localField": "person_id",
      "foreignField": "person_id",
      "as": "job_docs"
    }
  },
  {
    "$project": {
      "person_id": 1,
      "person_name": 1,
      "job": {
        "$first": "$job_docs.job"
      }
    }
  }
])

mongoplayground

  • Related