Is there a more effective way to generate this dataframe?

Time:06-07

I have some code that "converts" a dict into a pd.DataFrame. It produces the DataFrame I need, but I don't think the code is very efficient.

python
import datetime
import pandas as pd

data = {}
for index, row in get_data_row_by_row():
    data[index] = row

'''
As a result I get something like:

data = {"2022-04-22": {"Open": 4268.169485565509, "Close": 4225.4345703125, "Low": 4217.979029960617,
                        "High": 4331.431780377489},
        "2022-04-25": {"Open": 4237.487568541329, "Close": 4204.16748046875, "Low": 4171.766769167242,
                        "High": 4315.181737583676}}
'''

df = pd.DataFrame({'Date': [datetime.datetime.strptime(i, "%Y-%m-%d") for i in data.keys()],
                   'Open': [val['Open'] for key, val in data.items()],
                   'Close': [val['Close'] for key, val in data.items()],
                   'Low': [val['Low'] for key, val in data.items()],
                   'High': [val['High'] for key, val in data.items()]})
df = df.set_index('Date')

How can I generate the same DataFrame in a more efficient way?

CodePudding user response:

How about:

out = pd.DataFrame.from_dict(data, orient='index').rename_axis(index='Date')
out.index = pd.to_datetime(out.index)

Output:

                   Open       Close          Low         High
Date                                                         
2022-04-22  4268.169486  4225.43457  4217.979030  4331.431780
2022-04-25  4237.487569  4204.16748  4171.766769  4315.181738
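
To check that this matches the list-comprehension version from the question, here is a minimal self-contained sketch. It assumes the two-row sample `data` shown in the question; the names `expected` and `same_values` are only for illustration.

```python
import datetime

import pandas as pd

# Two-row sample taken from the question
data = {
    "2022-04-22": {"Open": 4268.169485565509, "Close": 4225.4345703125,
                   "Low": 4217.979029960617, "High": 4331.431780377489},
    "2022-04-25": {"Open": 4237.487568541329, "Close": 4204.16748046875,
                   "Low": 4171.766769167242, "High": 4315.181737583676},
}

# Approach from the question: one list comprehension per column
expected = pd.DataFrame({
    "Date": [datetime.datetime.strptime(i, "%Y-%m-%d") for i in data],
    "Open": [val["Open"] for val in data.values()],
    "Close": [val["Close"] for val in data.values()],
    "Low": [val["Low"] for val in data.values()],
    "High": [val["High"] for val in data.values()],
}).set_index("Date")

# Approach from this answer: outer keys become the index, inner keys the columns
out = pd.DataFrame.from_dict(data, orient="index").rename_axis(index="Date")
out.index = pd.to_datetime(out.index)

# Compare values column by column, aligning on the column names of `expected`
same_values = (out[expected.columns].to_numpy() == expected.to_numpy()).all()
print(same_values)
print(out)
```

Since `from_dict` takes the nested dict as-is, there is no need to walk `data` once per column.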

CodePudding user response:

Just use pd.DataFrame and then transpose it with .T:

df = pd.DataFrame(data).T.reset_index()

Output:

>>> df
        index         Open       Close          Low         High
0  2022-04-22  4268.169486  4225.43457  4217.979030  4331.431780
1  2022-04-25  4237.487569  4204.16748  4171.766769  4315.181738
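
Note that with `reset_index()` the dates end up as strings in an ordinary column called `index`, which is not quite the frame from the question. A possible variation, assuming the goal is a datetime index named `Date`, is to skip `reset_index()` and convert the index instead. The sample `data` below is the one from the question.

```python
import pandas as pd

# Two-row sample taken from the question
data = {
    "2022-04-22": {"Open": 4268.169485565509, "Close": 4225.4345703125,
                   "Low": 4217.979029960617, "High": 4331.431780377489},
    "2022-04-25": {"Open": 4237.487568541329, "Close": 4204.16748046875,
                   "Low": 4171.766769167242, "High": 4315.181737583676},
}

# Outer keys become columns first, so transpose to get one row per date
df = pd.DataFrame(data).T

# Turn the string labels into timestamps and name the index "Date"
df.index = pd.to_datetime(df.index)
df = df.rename_axis(index="Date")

print(df)
```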