I am trying to use pandas to convert a list to a DataFrame. Every time I convert the list, I get a single column labeled 0 with the header string as its first row, and it does not work correctly.
Code:
import pandas as pd
df = pd.DataFrame(list(data))
print(df)
Input:
['"DateTime","mm"', ['"2016-10-14 00:00:00"', '1.1'], ['"2016-10-15 00:00:00"', '2.1'], ['"2016-10-16 00:00:00"', '8.4'], ['"2016-10-17 00:00:00"', '1.1'], ['"2016-10-18 00:00:00"', '3.1'], ['"2016-10-19 00:00:00"', '0'], ['"2016-10-20 00:00:00"', '0'], ['"2016-10-21 00:00:00"', '0'], ['"2016-10-22 00:00:00"', '0'], ['"2016-10-23 00:00:00"', '0'], ['"2016-10-24 00:00:00"', '7.4'], ['"2016-10-25 00:00:00"', '2.1'], ['"2016-10-26 00:00:00"', '0'], ['"2016-10-27 00:00:00"', '0'], ['"2016-10-28 00:00:00"', '0'], ['"2016-10-29 00:00:00"', '0']
Output:
0
0 "DateTime","mm"
1 ["2016-10-14 00:00:00", 1.1]
2 ["2016-10-15 00:00:00", 2.1]
3 ["2016-10-16 00:00:00", 8.4]
4 ["2016-10-17 00:00:00", 1.1]
.. ...
344 ["2017-10-05 00:00:00", 1]
345 ["2017-10-06 00:00:00", 0]
346 ["2017-10-07 00:00:00", 1.1]
347 ["2017-10-08 00:00:00", 0]
348 ["2017-10-09 00:00:00", 0]
[349 rows x 1 columns]
CodePudding user response:
What you want is something like:
pd.DataFrame(data[1:], columns=data[0].split(','))
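The single column in the question most likely appears because the first element of the list is a plain string rather than a list, so pandas does not treat the items as rows of records. A minimal sketch of the split, using only the first three rows of the question's input as a shortened sample (the names header and rows are just illustrative):

```python
import pandas as pd

# Shortened sample: the header string plus the first three rows
# of the input shown in the question.
data = ['"DateTime","mm"',
        ['"2016-10-14 00:00:00"', '1.1'],
        ['"2016-10-15 00:00:00"', '2.1'],
        ['"2016-10-16 00:00:00"', '8.4']]

header = data[0].split(',')   # two column names, double quotes still attached
rows = data[1:]               # only the [datetime, value] lists

df = pd.DataFrame(rows, columns=header)
print(df.shape)               # two columns instead of one
```

Note that at this point the column names and the DateTime values still contain literal double quotes, and both columns hold strings.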
To get a clean DataFrame with datetime and float types:
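df = (pd.DataFrame(data[1:], columns=data[0].replace('"', '').split(','))
        # strip the literal double quotes around each timestamp
        .assign(DateTime=lambda d: d['DateTime'].str.strip('"'))
        # pandas 2.x rejects the unit-less 'datetime64', so give the unit
        .astype({'DateTime': 'datetime64[ns]', 'mm': 'float'})
     )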
output:
DateTime mm
0 2016-10-14 1.1
1 2016-10-15 2.1
2 2016-10-16 8.4
3 2016-10-17 1.1
4 2016-10-18 3.1
...
Now it's clean and ready to use for downstream processing. Example:
df.plot(x='DateTime', y='mm')
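As an alternative to astype, the conversion could also be written with pd.to_datetime and pd.to_numeric. This is a sketch of a different technique than the chain above, not a replacement for it, and it again assumes a shortened sample holding only the first three rows of the question's input:

```python
import pandas as pd

# Shortened sample: the header string plus the first three rows
# of the input shown in the question.
data = ['"DateTime","mm"',
        ['"2016-10-14 00:00:00"', '1.1'],
        ['"2016-10-15 00:00:00"', '2.1'],
        ['"2016-10-16 00:00:00"', '8.4']]

# Remove the double quotes from the header before splitting it.
columns = data[0].replace('"', '').split(',')
df = pd.DataFrame(data[1:], columns=columns)

# Strip the quotes around each timestamp, then parse it.
df['DateTime'] = pd.to_datetime(df['DateTime'].str.strip('"'))
# Convert the numeric strings to numbers.
df['mm'] = pd.to_numeric(df['mm'])

print(df.dtypes)
```

Both functions accept an errors='coerce' argument, which may be useful if some rows in the full data turn out to be malformed.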