Modern way to stream data to a web based (angular) front-end


tl;dr: I want to set up an Angular application with a Node backend that displays data in real time. The source of the data is a Kafka stream, whose messages are stored in a Postgres database.

I'm new to some of these topics, yet confident I can get it up and running, though most likely not in a "best-practice" way. I'm having trouble figuring out what a modern and efficient approach looks like. Ideally I'm looking for a high-level overview of how to approach this following best practices.

I currently have a python-kafka consumer that listens to a stream and stores the data in a Postgres database. What is a good approach to serving this data in real time to many clients? Do I use WebSockets or HTTP to stream the data from the database? Should I ditch Python and write a consumer in NodeJS that forwards messages straight to the clients, optionally skipping the database entirely?

CodePudding user response:

Yes, you'd need to use WebSockets for a stream of updates. Kafka does not help with this on its own, though; you need some way to bridge a Kafka consumer to a WebSocket server.

Such as socket.io ...

// Setup (assumed): a socket.io server and a kafka-node consumer
// const io = require('socket.io')(httpServer)
// const { KafkaClient, Consumer } = require('kafka-node')
// const consumer = new Consumer(new KafkaClient(), [{ topic: 'my-topic' }], {})

// Listen for Kafka messages
consumer.on('message', ({ value }) => {
    // Parse the JSON value into an object
    const { payload } = JSON.parse(value)

    console.log('\n\nemitting from kafka:', payload)

    // Emit the message through all connected sockets
    io.emit("kafka-event", payload)
})
Keep in mind, the above code uses a single Kafka consumer shared by all connected clients. New sockets do not start new consumers, so they will only see updates from the current offset of that internal Kafka consumer onward. And if you start multiple Kafka consumers (or multiple Node backends) in the same consumer group, each instance will only consume a subset of the Kafka partitions, so each socket may only see a subset of the events...
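
For completeness, here's a minimal sketch of the browser/Angular side subscribing to that event; it assumes the socket.io-client package, and the server URL is a placeholder:

// Browser/Angular side: receive events pushed by the server.
// Assumes: npm install socket.io-client; the URL is a placeholder.
import { io } from 'socket.io-client'

const socket = io('http://localhost:3000')

socket.on('kafka-event', (payload) => {
    // Merge the payload into your Angular component state here
    console.log('received from kafka via socket:', payload)
})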

Otherwise, there's nothing unique to Kafka about the question. You would write a loop (e.g. setTimeout() / setInterval()) to query some HTTP API (not the database directly) for all records, and/or only the records that are new since the last time you polled; a sketch of this is shown below.
Or, depending on your use case, query the whole database table/collection and add a refresh button to accurately capture deletions (unless you also have a websocket to send individual delete events, and can update the DOM with those).
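
A minimal polling sketch, assuming a hypothetical /api/records endpoint that accepts a `since` timestamp (both are placeholders for whatever your HTTP API exposes):

// Poll the (hypothetical) HTTP API every 5 seconds for new records.
let since = 0

setInterval(async () => {
    const res = await fetch(`/api/records?since=${since}`)
    const records = await res.json()
    since = Date.now()
    // Render `records` / merge them into your component state here
    console.log('polled records:', records)
}, 5000)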

"currently have a python-kafka consumer, listening to a stream and storing the data in a postgres database"

While that may work, Kafka Connect may scale better.
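
For example, a JDBC sink connector could replace the custom Python consumer entirely; a sketch of its config, assuming the Confluent JDBC sink connector is installed and with placeholder topic, database, and credential values:

{
  "name": "postgres-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "my-topic",
    "connection.url": "jdbc:postgresql://localhost:5432/mydb",
    "connection.user": "postgres",
    "connection.password": "<password>",
    "insert.mode": "upsert",
    "pk.mode": "record_key",
    "auto.create": "true"
  }
}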

Or, Kafka Streams supports key-value queries (interactive queries), so you may not need an external Postgres database at all, depending on your query patterns.

"thus optionally skipping the database?"

If you don't care about retention of historical events, then you don't need any database, no. But you'd then only get events in your UI from the moment the consumer-socket is established, and lose all history on a page refresh.
