How to concat Pandas series and DataFrame

Time:05-03

I would like to concat a Pandas Series onto my DataFrame as a new row. My code works fine with the Pandas append function; however, append is deprecated and I can't figure out how to do the same thing with pd.concat.

Deprecated since version 1.4.0: Use concat() instead. For further details see Deprecated DataFrame.append and Series.append

Here is my code snippet:

>>> history_dataframe = pd.DataFrame(columns=['Download_date', 'Name', 'Status'])
>>> history_dataframe
Out:
Download_date Name Status

>>> save_series = pd.Series([str(datetime.datetime.now())[:-7], file, 
                            'complete'])
>>> save_series
Out:
0  2022-05-02 10:37:28
1  testfile.txt
2  complete

>>> history_dataframe = pd.concat([history_dataframe,
                                  save_series], ignore_index =True)
>>> history_dataframe
Out:
  Download_date Name Status                    0
0           NaN  NaN    NaN  2022-05-02 10:37:28
1           NaN  NaN    NaN         testfile.txt
2           NaN  NaN    NaN             complete

If I pass index=history_dataframe.columns to pd.Series, I get ValueError: Length of values (3) does not match length of index (4).

Any suggestions? Thx

CodePudding user response:

There are 3 values in the Series, so its index must be a list of exactly 3 labels, here ['Download_date', 'Name', 'Status']:

save_series = pd.Series([str(datetime.datetime.now())[:-7],
                         file, 
                        'complete'], index=['Download_date', 'Name', 'Status'])

Then, for concat, turn the Series into a one-row DataFrame with Series.to_frame and transpose it:

history_dataframe = pd.concat([history_dataframe, save_series.to_frame().T], ignore_index =True)
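
Put together, the two steps above might look like the following minimal sketch. The file name is a made-up placeholder, and strftime is used here in place of the string slicing from the question; both are assumptions for illustration only.

```python
import datetime
import pandas as pd

columns = ['Download_date', 'Name', 'Status']
history_dataframe = pd.DataFrame(columns=columns)

file = 'testfile.txt'  # hypothetical file name, for illustration only

# Label the three values with the column names so concat can align them
save_series = pd.Series(
    [datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'), file, 'complete'],
    index=columns,
)

# to_frame() gives a 3x1 column; .T turns it into a 1x3 row
row = save_series.to_frame().T
history_dataframe = pd.concat([history_dataframe, row], ignore_index=True)
print(history_dataframe)
```

The key point is that the Series index has to match the DataFrame columns; the transpose only changes the orientation from a column to a row.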

A better approach is to collect the Series in a list and pass that list to the DataFrame constructor only once:

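import datetime
import pandas as pd

file = 'a.txt'
save_series = pd.Series([str(datetime.datetime.now())[:-7],
                         file,
                         'complete'], index=['Download_date', 'Name', 'Status'])

print(save_series)

# collect one Series per row; this sample loop reuses the same Series
L = []
for i in range(3):
    L.append(save_series)

# build the DataFrame once, after the loop
df = pd.DataFrame(L)
print(df)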
         Download_date   Name    Status
0  2022-05-02 10:57:11  a.txt  complete
1  2022-05-02 10:57:11  a.txt  complete
2  2022-05-02 10:57:11  a.txt  complete
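
In a real loop each pass would usually produce a different row. A rough sketch of that, assuming a hypothetical list of file names (files is not from the question):

```python
import datetime
import pandas as pd

columns = ['Download_date', 'Name', 'Status']
files = ['a.txt', 'b.txt', 'c.txt']  # hypothetical file names

rows = []
for name in files:
    stamp = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    # one labeled Series per processed file
    rows.append(pd.Series([stamp, name, 'complete'], index=columns))

# the DataFrame is built a single time, after the loop has finished
df = pd.DataFrame(rows)
print(df)
```

Appending to a Python list is cheap, whereas every pd.concat call copies the existing data, so building the DataFrame once tends to scale better for long loops.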

CodePudding user response:

You would need to convert your Series to a DataFrame and transpose it, which is probably not necessary here.

Why not just use loc in this case?

history_dataframe.loc[0] = save_series

output:

         Download_date          Name    Status
0  2022-05-02 10:51:08  testfile.txt  complete

input Series:

save_series = pd.Series([str(datetime.datetime.now())[:-7], file, 'complete'],
                        index=history_dataframe.columns)
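
To add more than one row this way, the row label has to change on each assignment. One possible sketch uses len(history_dataframe) as the next label; this assumes the DataFrame keeps a default 0, 1, 2, ... index, and the values below are made up for illustration.

```python
import pandas as pd

history_dataframe = pd.DataFrame(columns=['Download_date', 'Name', 'Status'])

first = pd.Series(['2022-05-02 10:51:08', 'testfile.txt', 'complete'],
                  index=history_dataframe.columns)
second = pd.Series(['2022-05-02 10:52:30', 'other.txt', 'complete'],
                   index=history_dataframe.columns)

# len() of the frame is the next free integer label (0, then 1, ...)
history_dataframe.loc[len(history_dataframe)] = first
history_dataframe.loc[len(history_dataframe)] = second
print(history_dataframe)
```

If rows have been deleted or the index is not a plain integer range, len() may collide with an existing label and overwrite that row, so this is only safe with the default index.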

many Series

If you have many Series, collect them first and concat once at the end:

l = [save_series, save_series]  # create this in your loop or however you want
# then concat once
df = pd.concat(l, axis=1).T

example output:

         Download_date          Name    Status
0  2022-05-02 10:51:08  testfile.txt  complete
1  2022-05-02 10:51:08  testfile.txt  complete
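
If there is already a non-empty history, the collected rows can be attached to it with one more concat. A small sketch, with made-up rows standing in for real data:

```python
import pandas as pd

columns = ['Download_date', 'Name', 'Status']

# existing history with one earlier row (made-up values)
history_dataframe = pd.DataFrame(
    [['2022-05-01 09:00:00', 'old.txt', 'complete']], columns=columns
)

# Series collected in a loop or elsewhere
collected = [
    pd.Series(['2022-05-02 10:51:08', 'testfile.txt', 'complete'], index=columns),
    pd.Series(['2022-05-02 10:52:30', 'other.txt', 'complete'], index=columns),
]

# side by side as columns, then transposed so each Series becomes a row
new_rows = pd.concat(collected, axis=1).T

# a single concat onto the existing history, renumbering the rows
history_dataframe = pd.concat([history_dataframe, new_rows], ignore_index=True)
print(history_dataframe)
```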

CodePudding user response:

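Basically, pd.concat aligns on labels. A Series created without an index gets the default integer index (0, 1, 2), which matches none of your column names, so it is added as a new column named 0 instead of a new row.

Give the Series the column names as its index and concatenate it as a one-row DataFrame. Note that ignore_index is an argument of pd.concat, not of the pd.DataFrame constructor, so create the DataFrame without it:

history_dataframe = pd.DataFrame(columns=['Download_date', 'Name', 'Status'])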
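
As a rough illustration of what goes wrong in the question, the sketch below compares a Series with the default integer index against one labeled with the column names. The variable names (empty, unlabeled, labeled, wrong, right) are made up for this example.

```python
import pandas as pd

columns = ['Download_date', 'Name', 'Status']
empty = pd.DataFrame(columns=columns)
values = ['2022-05-02 10:37:28', 'testfile.txt', 'complete']

# Default index 0, 1, 2: concat treats the Series as a column named 0
unlabeled = pd.Series(values)
wrong = pd.concat([empty, unlabeled], ignore_index=True)
print(wrong.shape)  # three rows, four columns

# Index equal to the column names: one transposed row lines up correctly
labeled = pd.Series(values, index=columns)
right = pd.concat([empty, labeled.to_frame().T], ignore_index=True)
print(right.shape)  # one row, three columns
```

Here ignore_index=True is passed to pd.concat, where it resets the row labels of the result; the pd.DataFrame constructor does not take that argument.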