PySpark Streaming textFileStream how to support wildcards or regular expression to monitor the conte

Time:09-22

I have recently been learning PySpark Streaming and am writing a small program that monitors a directory in real time. How can PySpark Streaming's textFileStream be made to read only files of a specific type in that directory? For example, TXT files and docx files may be added to the directory test at any time; how can I make PySpark Streaming read a file only when it is a newly added TXT file?

This problem appears to be related to https://issues.apache.org/jira/browse/SPARK-8605

Searching on Google, I found that PySpark's textFile supports wildcards but textFileStream does not. Apparently someone has proposed a solution, though; see the link below:
https://issues.apache.org/jira/browse/SPARK-14976
However, I do not understand how it solves the problem. Could someone explain how to handle this? Thank you very much!
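
One possible direction, offered only as a sketch and not as the fix described in either ticket above: instead of textFileStream, the Structured Streaming file source could be used, since it accepts a file-name glob through the pathGlobFilter option. This assumes Spark 3.0 or later. The helper name build_txt_stream, the function main, and the application name are hypothetical and chosen for illustration; the directory name test is taken from the question.

```python
def build_txt_stream(spark, directory, pattern="*.txt"):
    """Return a streaming DataFrame of lines read from files in `directory`
    whose file names match `pattern`.

    Hypothetical helper for illustration; it is not part of the Spark API.
    Assumes `spark` is a SparkSession from Spark 3.0+.
    """
    return (
        spark.readStream
        .format("text")                     # one row per line, column "value"
        .option("pathGlobFilter", pattern)  # keep only matching file names
        .load(directory)
    )


def main(directory="test"):
    # Imported here so the helper above can be inspected without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("TxtOnlyStream").getOrCreate()
    lines = build_txt_stream(spark, directory)

    # Print each newly arrived line to the console as files are added.
    query = lines.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()
```

Calling main() starts the query and blocks until it is stopped. With this setup, a docx file dropped into the directory should simply be skipped, because its name does not match *.txt. Whether this fits depends on whether the rest of the program can move from DStreams to streaming DataFrames.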