PySpark : How to convert minutes to hours : minutes?

Time:04-22

I want to convert a column of minutes into hours:minutes.

col(min)
685

Expected result:

col(min)  col1(h:min)
685       11:25

CodePudding user response:

Use the Spark SQL integer-division operator div and the mod function to get the quotient and remainder respectively, then concatenate them; lpad zero-pads the minutes to two digits:

from pyspark.sql import functions as F

df = df.withColumn('col1', F.expr("concat(col div 60, ':', lpad(mod(col, 60), 2, '0'))"))
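The quotient/remainder arithmetic behind that expression can be sanity-checked in plain Python without a Spark session; minutes_to_hmm is a hypothetical helper name used only for this sketch:

```python
# Plain-Python sketch of the quotient/remainder logic used in the
# Spark SQL expression above (minutes_to_hmm is a made-up helper name).
def minutes_to_hmm(minutes: int) -> str:
    hours, mins = divmod(minutes, 60)  # e.g. 685 -> (11, 25)
    return f"{hours}:{mins:02d}"       # zero-pad minutes to two digits

print(minutes_to_hmm(685))  # 11:25
print(minutes_to_hmm(180))  # 3:00
```

Note the :02d padding on the minutes, which is what lpad provides on the Spark side.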

CodePudding user response:

You can use .map to transform each element of an RDD.

Python's built-in function divmod returns the quotient and remainder of an integer division: divmod(a, b) is equivalent to (a // b, a % b).

rdd = sc.parallelize([
    685, 180, 80
])

results = rdd.map(lambda x: divmod(x, 60))

print( results.collect() )
# [(11, 25), (3, 0), (1, 20)]

Or, if you want the results as strings in hh:mm format, use str.format to format the values to your liking:

results = rdd.map(lambda x: '{:02d}:{:02d}'.format(*divmod(x, 60)))

print( results.collect() )
# ['11:25', '03:00', '01:20']

If you want to keep both the number of minutes and the resulting hh:mm string:

results = rdd.map(lambda x: (x, '{:02d}:{:02d}'.format(*divmod(x, 60))))

print( results.collect() )
# [(685, '11:25'), (180, '03:00'), (80, '01:20')]