I have a PySpark DataFrame with one row and one column, named json:
-----------------------------------------------------------------------------------------
|json
-----------------------------------------------------------------------------------------
|[{"a":{"b":0,"c":{"50":0.005,"60":0,"100":0},"d":0.01,"e":0,"f":2}}]|
-----------------------------------------------------------------------------------------
I need to extract the JSON value and POST it to a REST endpoint using requests.
CodePudding user response:
from pyspark.sql import SparkSession
import json
spark = (SparkSession.builder.appName("AuthorsAges").getOrCreate())
# Creating the DataFrame
data_df = spark.createDataFrame([['[{"a":{"b":0,"c":{"50":0.005,"60":0,"100":0},"d":0.01,"e":0,"f":2}}]']], ["json"])
data_df.show(1, False)
extract_text = data_df.collect()[0][0]
# The cell holds a JSON array with a single object: parse it, then take the first element
extract_json = json.loads(extract_text)[0]
# you can access any of the JSON fields like this afterwards
print(extract_json['a']['c'])
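
For the posting part of the question, here is a minimal sketch of one way to send the extracted string with requests. The helper name post_json_text, the post parameter, and the 10-second timeout are my own assumptions for illustration, and the URL in the usage note is a placeholder for your real endpoint. It parses the string first so that malformed JSON fails before anything goes over the network, then passes the result through the json= argument of requests.post, which serializes it and sets the Content-Type header to application/json.

```python
import json


def post_json_text(json_text, url, post=None, timeout=10):
    """Parse a JSON string (e.g. the DataFrame cell) and POST it to `url`.

    `post` is a hypothetical hook for swapping in another callable with the
    same signature as requests.post; by default requests.post is used.
    """
    if post is None:
        # Imported lazily so the function can be exercised with a stand-in
        # callable when requests is not installed.
        import requests
        post = requests.post

    # Raises json.JSONDecodeError on malformed input, before any request is sent.
    payload = json.loads(json_text)

    # json= makes requests serialize the payload and set
    # Content-Type: application/json.
    response = post(url, json=payload, timeout=timeout)

    # Turn 4xx/5xx responses into exceptions instead of failing silently.
    response.raise_for_status()
    return response
```

With the DataFrame from above, the call would look like post_json_text(data_df.collect()[0][0], "https://example.com/api/endpoint"), where the URL is only a placeholder. If the endpoint expects the single object rather than the enclosing array, send json.loads(extract_text)[0] instead.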