Converting list to DataFrame how to remove leading 0 in the first row

Time:11-09

I am trying to use pandas to convert a list to a DataFrame. Every time I try to convert the list, I get a first row with 0, and the conversion does not work correctly.

Code:

import pandas as pd

df = pd.DataFrame(list(data))
print(df)

Input:

['"DateTime","mm"', ['"2016-10-14 00:00:00"', '1.1'], ['"2016-10-15 00:00:00"', '2.1'], ['"2016-10-16 00:00:00"', '8.4'], ['"2016-10-17 00:00:00"', '1.1'], ['"2016-10-18 00:00:00"', '3.1'], ['"2016-10-19 00:00:00"', '0'], ['"2016-10-20 00:00:00"', '0'], ['"2016-10-21 00:00:00"', '0'], ['"2016-10-22 00:00:00"', '0'], ['"2016-10-23 00:00:00"', '0'], ['"2016-10-24 00:00:00"', '7.4'], ['"2016-10-25 00:00:00"', '2.1'], ['"2016-10-26 00:00:00"', '0'], ['"2016-10-27 00:00:00"', '0'], ['"2016-10-28 00:00:00"', '0'], ['"2016-10-29 00:00:00"', '0']

Output:

                                0
0                 "DateTime","mm"
1    ["2016-10-14 00:00:00", 1.1]
2    ["2016-10-15 00:00:00", 2.1]
3    ["2016-10-16 00:00:00", 8.4]
4    ["2016-10-17 00:00:00", 1.1]
..                            ...
344    ["2017-10-05 00:00:00", 1]
345    ["2017-10-06 00:00:00", 0]
346  ["2017-10-07 00:00:00", 1.1]
347    ["2017-10-08 00:00:00", 0]
348    ["2017-10-09 00:00:00", 0]

[349 rows x 1 columns]

Answer:

What you want is something like:

pd.DataFrame(data[1:], columns=data[0].split(','))
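
As a rough sketch of why this helps (the three-element list below is a shortened, illustrative excerpt of the question's input, and the names broken and fixed are made up for this example): the first element of the list is a plain string rather than a list, so pandas appears to treat every element as a single object and puts them all in one column with the default label 0. Passing only the list rows, with the header string split into column names, gives two columns. Note that the column names still carry their double quotes at this stage.

```python
import pandas as pd

# Shortened, illustrative excerpt shaped like the question's input:
# one header string followed by [date, value] lists.
data = [
    '"DateTime","mm"',
    ['"2016-10-14 00:00:00"', '1.1'],
    ['"2016-10-15 00:00:00"', '2.1'],
]

# The header is a string, not a list, so each element ends up as one
# object in a single column labelled 0.
broken = pd.DataFrame(data)
print(broken.shape)

# Use the header only for column names and the remaining rows as data.
fixed = pd.DataFrame(data[1:], columns=data[0].split(','))
print(fixed.columns.tolist())
```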

To get a clean dataframe with datetime and float types:

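df = (
    pd.DataFrame(data[1:], columns=data[0].replace('"', '').split(','))
    # strip the literal double quotes around each timestamp
    .assign(DateTime=lambda d: d['DateTime'].str.strip('"'))
    # pandas 2.x requires an explicit unit such as 'datetime64[ns]'
    .astype({'DateTime': 'datetime64[ns]', 'mm': 'float'})
)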

Output:

    DateTime   mm
0 2016-10-14  1.1
1 2016-10-15  2.1
2 2016-10-16  8.4
3 2016-10-17  1.1
4 2016-10-18  3.1
...
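
For anyone who wants to try this without the full data set, here is a minimal self-contained sketch. It assumes a short excerpt of the question's input, and it swaps astype for pd.to_datetime and pd.to_numeric, which is one possible alternative rather than the method used above. The variable name header is introduced only for this example.

```python
import pandas as pd

# Short excerpt of the question's input (the real list has 349 elements).
data = [
    '"DateTime","mm"',
    ['"2016-10-14 00:00:00"', '1.1'],
    ['"2016-10-15 00:00:00"', '2.1'],
    ['"2016-10-16 00:00:00"', '8.4'],
]

# Drop the double quotes from the header and split it into column names.
header = data[0].replace('"', '').split(',')

df = pd.DataFrame(data[1:], columns=header)

# Remove the literal quotes around each timestamp, then parse it.
df['DateTime'] = pd.to_datetime(df['DateTime'].str.strip('"'))

# Convert the measurement strings to numbers.
df['mm'] = pd.to_numeric(df['mm'])

print(df.dtypes)
```

Converting column by column makes it easier to see which step fails if a row contains an unexpected value.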

Now it's clean and ready to use for downstream processing. Example:

df.plot(x='DateTime', y='mm')
