Make a one row dataframe from 1 list - Pyspark


I searched here but didn't find anything that worked for me. Basically, I have a single-row list (with values for several columns) that I need to write to a Parquet table. I need to convert that list into a DataFrame, but with only one row I run into all sorts of problems!

from pyspark.sql import Window, Row
from pyspark.sql import functions as F
from pyspark.sql.session import SparkSession
from pyspark.sql.types import *
import datetime

tablename='table'
start_time = F.lit(datetime.datetime.now())
count_1 = 0
count_2 = 0
count_3 = 0

list = [F.lit(start_time),
        F.lit(tablename),
        F.lit(count_1),
        F.lit(count_2),
        F.lit(count_3),
        F.current_timestamp()]

columns = ['start_time', 'table', 'count_1', 'count_2', 'count_3', 'end_time']

When I try to use parallelize or .toDF on it, I get errors.

Does anyone know how I can do this?

CodePudding user response:

If you modify your data to use plain Python values instead of Spark F functions and Column objects, it works:

from pyspark.sql import Window, Row
from pyspark.sql import functions as F
from pyspark.sql.session import SparkSession
from pyspark.sql.types import *
import datetime

tablename='table'
start_time = datetime.datetime.now()
count_1 = 0
count_2 = 0
count_3 = 0

list = [(start_time,
        tablename,
        count_1,
        count_2,
        count_3,
        datetime.datetime.now())]

columns = ['start_time', 'table', 'count_1', 'count_2', 'count_3', 'end_time']

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(list, columns)
df.show()

+--------------------+-----+-------+-------+-------+--------------------+
|          start_time|table|count_1|count_2|count_3|            end_time|
+--------------------+-----+-------+-------+-------+--------------------+
|2022-10-21 16:39:...|table|      0|      0|      0|2022-10-21 16:39:...|
+--------------------+-----+-------+-------+-------+--------------------+
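Since the original goal was to write this one-row DataFrame into a Parquet table, the last step could look roughly like the sketch below. The target path and table name are assumptions, not something from the question:

# Minimal sketch: append the one-row DataFrame to a Parquet location.
# "/tmp/load_audit" is a placeholder path; point it at your actual table location.
df.write.mode("append").parquet("/tmp/load_audit")

# Or, if the target is a managed/metastore table (hypothetical name):
# df.write.mode("append").format("parquet").saveAsTable("load_audit")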

PS - You are shadowing the built-in list type with the variable name list. It did not cause the issue here, but it may lead to other issues.
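For example, a different variable name avoids shadowing the built-in (data is just an arbitrary name):

data = [(start_time, tablename, count_1, count_2, count_3, datetime.datetime.now())]
df = spark.createDataFrame(data, columns)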
