from_json returns null in Apache Spark Kafka readStream

Time:04-05

I'm trying to read a Kafka topic and display the data on the console using PySpark. I've defined a schema for from_json and am trying to parse the messages against it and display the result. However, the resulting df contains only nulls.

The original object in the Kafka topic and the schema are below.

{
  "kind": "youtube#videoListResponse",
  "etag": "jUow4VqgbKTDD9d1QI8TBQdM0po",
  "items": [
    {
      "kind": "youtube#video",
      "etag": "SRkYji_KdZvK3LDoACVdkHcm-Og",
      "id": "E4R_WJBqaaQ",
      "snippet": {
        "publishedAt": "2022-03-30T13:51:05Z",
        "channelId": "UCw9DyZg3_F0bIks2jrEgQAA",
        "title": "Brother Job            