I downloaded data from Yahoo Finance. The first column, Date, shows up when I print the data:
Open High Low Close Volume Dividends Stock Splits
Date
2015-01-02 00:00:00-05:00 40.724155 41.387470 40.619421 40.811432 27913900 0.0 0.0
2015-01-05 00:00:00-05:00 40.471039 40.785243 40.366306 40.436131 39673900 0.0 0.0
2015-01-06 00:00:00-05:00 40.479777 40.802706 39.746637 39.842644 36447900 0.0 0.0
2015-01-07 00:00:00-05:00 40.130662 40.549598 39.702999 40.348858 29114100 0.0 0.0
2015-01-08 00:00:00-05:00 40.802692 41.675477 40.776510 41.535831 29645200 0.0 0.0
but it is not listed as a column when I call info():
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1257 entries, 2015-01-02 00:00:00-05:00 to 2019-12-30 00:00:00-05:00
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Open 1257 non-null float64
1 High 1257 non-null float64
2 Low 1257 non-null float64
3 Close 1257 non-null float64
4 Volume 1257 non-null int64
5 Dividends 1257 non-null float64
6 Stock Splits 1257 non-null float64
dtypes: float64(6), int64(1)
memory usage: 110.9 KB
Is there a way to get Date in as a regular column and define it as datetime64[ns]?
This is the code I use to get the data from Yahoo:
company_name = "MSFT"
company = tweets[tweets['ticker_symbol'] == company_name]
company_stock = yf.Ticker(company_name).history(start=min(company.date).date(),end=max(company.date).date())
I would like to end up with the date as a column in the dataframe, with dtype datetime64[ns].
Thank you in advance.
Answer:
That date value is not a column in the dataframe. It is the dataframe's index (a DatetimeIndex), which is why info() reports it on its own line instead of listing it among the data columns.
import yfinance as yf
company_name = "MSFT"
df = yf.Ticker(company_name).history()
print(list(df.columns))
# ['Open', 'High', 'Low', 'Close', 'Volume', 'Dividends', 'Stock Splits']
You can add the date index as a new column in the dataframe.
df['date'] = df.index
print(list(df.columns))
# ['Open', 'High', 'Low', 'Close', 'Volume', 'Dividends', 'Stock Splits', 'date']
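The new date column has the same dtype as the index. The index in the question is timezone-aware (note the -05:00 offset), so the dtype includes the timezone and is not plain datetime64[ns]; the exact timezone depends on the exchange:
print(df.dtypes['date'])
# datetime64[ns, America/New_York]
# or to see all dtypes: print(df.dtypes)
To get plain datetime64[ns], drop the timezone when creating the column:
df['date'] = df.index.tz_localize(None)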
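For completeness, here is a minimal sketch of the whole conversion that runs without a network call. It uses a small synthetic dataframe as a stand-in for the yfinance result (the values and the America/New_York timezone are assumptions chosen to mirror the output in the question), drops the timezone from the index, and then moves the index into a regular column with reset_index():

```python
import pandas as pd

# Synthetic stand-in for the yfinance result (assumed values):
# a timezone-aware DatetimeIndex named "Date", like the one in the question.
idx = pd.date_range("2015-01-02", periods=3, freq="D",
                    tz="America/New_York", name="Date")
df = pd.DataFrame({"Close": [40.81, 40.44, 39.84]}, index=idx)

out = df.copy()
# Drop the timezone; the local wall-clock time is kept unchanged.
out.index = out.index.tz_localize(None)
# Turn the index into an ordinary column called "Date".
out = out.reset_index()
# Make the nanosecond resolution explicit (a no-op if it already is ns).
out["Date"] = out["Date"].astype("datetime64[ns]")

print(out.dtypes)
```

If you would rather keep the date as the index and only remove the timezone, the tz_localize(None) line on its own is enough. Use tz_convert("UTC") before it if you want UTC times instead of local exchange times.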