I have a table named tableTest
like this:
startDate | endDate |
---|---|
2022-12-15 | 2022-12-18 |
2022-12-19 | 2022-12-21 |
2022-12-22 | 2022-12-24 |
2022-12-26 | 2022-12-27 |
2022-12-29 | 2022-12-30 |
2022-12-02 | 2022-12-04 |
2022-12-06 | 2022-12-07 |
2022-12-07 | 2022-12-08 |
2022-12-09 | 2022-12-09 |
2022-12-13 | 2022-12-14 |
I need to loop over the key-value pairs of startDate and endDate in their original order.
What I did:
import pandas as pd

data = [
    ("2022-12-15", "2022-12-18"),
    ("2022-12-19", "2022-12-21"),
    ("2022-12-22", "2022-12-24"),
    ("2022-12-26", "2022-12-27"),
    ("2022-12-29", "2022-12-30"),
    ("2022-12-02", "2022-12-04"),
    ("2022-12-06", "2022-12-07"),
    ("2022-12-07", "2022-12-08"),
    ("2022-12-13", "2022-12-14"),
    ("2023-01-01", "2023-01-03"),
]
df = spark.createDataFrame(data).toDF(*('startDate', 'endDate')).toPandas()
dictTest = df.set_index('startDate')['endDate'].to_dict()
print(dictTest)
for k, v in dictTest.items():
    print(f'startDate is {k} and corresponding endDate is {v}.')
The above code does convert the two columns to a dict, but a dict is unordered, so I lost the original order of these two columns.
Thank you in advance.
CodePudding user response:
You can use the into parameter of .to_dict to pass in an OrderedDict:
from collections import OrderedDict
dictTest = df.set_index('startDate')['endDate'].to_dict(into=OrderedDict)
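Note that on Python 3.7+ the built-in dict also preserves insertion order, so either way the mapping keeps the DataFrame's row order. A minimal self-contained sketch using plain pandas (with a small subset of the question's sample rows) to show the order surviving the round trip:

```python
from collections import OrderedDict

import pandas as pd

# A subset of the question's rows, kept in their original order.
df = pd.DataFrame(
    [("2022-12-15", "2022-12-18"), ("2022-12-02", "2022-12-04")],
    columns=["startDate", "endDate"],
)

# `into=OrderedDict` makes the order guarantee explicit; on Python 3.7+
# a plain .to_dict() would preserve the row order as well.
dictTest = df.set_index("startDate")["endDate"].to_dict(into=OrderedDict)

for k, v in dictTest.items():
    print(f"startDate is {k} and corresponding endDate is {v}.")
```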
CodePudding user response:
You can just use iterrows to iterate over the rows in their original order, as long as tableTest is a pandas DataFrame.
for index, row in tableTest.iterrows():
    startDate = row['startDate']
    endDate = row['endDate']
    print(f'startDate is {startDate} and corresponding endDate is {endDate}.')
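If the table is large, itertuples is usually a faster alternative to iterrows, since it avoids building a Series for every row. A small sketch with sample data standing in for tableTest:

```python
import pandas as pd

# Sample data standing in for tableTest.
tableTest = pd.DataFrame(
    [("2022-12-15", "2022-12-18"), ("2022-12-19", "2022-12-21")],
    columns=["startDate", "endDate"],
)

# itertuples yields namedtuples in row order; index=False drops the index field.
rows = list(tableTest.itertuples(index=False))
for row in rows:
    print(f"startDate is {row.startDate} and corresponding endDate is {row.endDate}.")
```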