convert dataframe column values into list and make a new column from it which contains list of eleme

Time:11-10

I have these values in a pandas DataFrame column:

col1

0.74
0.77
0.72
0.65
0.24
0.07
0.21
0.05
0.09

I want a new column in which each row holds a list of six elements, with the window shifted by one value per row.

This is the column that I want to get:

col2

[0.74,0.77,0.72,0.65,0.24,0.07]
[0.77,0.72,0.65,0.24,0.07,0.21]
[0.72,0.65,0.24,0.07,0.21,0.05]
[0.65,0.24,0.07,0.21,0.05,0.09]
[0.24,0.07,0.21,0.05,0.09,NaN]
[0.07,0.21,0.05,0.09,NaN,NaN]

CodePudding user response:

Using the data you provided, I came up with this solution:

import pandas as pd
import numpy as np
df = pd.DataFrame({'col1': [0.74,0.77,0.72,0.65,0.24,0.07,0.21,0.05,0.09]})
df["col2"] = ""
for i in range(len(df)):
    # Take up to six values starting at row i
    lst = df["col1"].iloc[i:i + 6].to_list()
    length = len(lst)
    # Only if you need the list to be the same length
    while length < 6:
        lst.append(np.nan)
        length += 1
    print(lst)
    df.at[i, 'col2'] = lst

I'm not sure whether there is a faster way of doing this, for example with a list comprehension.
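For what it's worth, here is a minimal sketch of a list-comprehension variant. It assumes the same `df` and a window of six as in the question; the names `window`, `values` and `padded` are only illustrative. It pads the values with NaN once up front instead of appending inside a loop, and I have not benchmarked it against the loop above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09]})
window = 6  # window length taken from the question

values = df["col1"].tolist()
# Pad the tail with NaN once so that every slice has exactly `window` elements.
padded = values + [np.nan] * (window - 1)
# One slice per original row, each shifted by one position.
df["col2"] = [padded[i:i + window] for i in range(len(values))]
```

Like the loop, this produces one list per row of `col1`, so the last rows are mostly NaN.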

CodePudding user response:

I would use numpy's sliding_window_view (available in NumPy 1.20 and later):

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

window = 6
extra  = window-1

df['col2'] = swv(np.pad(df['col1'], (0, extra), constant_values=np.nan),
                 window).tolist()

output:

   col1                                  col2
0  0.74  [0.74, 0.77, 0.72, 0.65, 0.24, 0.07]
1  0.77  [0.77, 0.72, 0.65, 0.24, 0.07, 0.21]
2  0.72  [0.72, 0.65, 0.24, 0.07, 0.21, 0.05]
3  0.65  [0.65, 0.24, 0.07, 0.21, 0.05, 0.09]
4  0.24  [0.24, 0.07, 0.21, 0.05, 0.09,  nan]
5  0.07  [0.07, 0.21, 0.05, 0.09,  nan,  nan]
6  0.21  [0.21, 0.05, 0.09,  nan,  nan,  nan]
7  0.05  [0.05, 0.09,  nan,  nan,  nan,  nan]
8  0.09  [0.09,  nan,  nan,  nan,  nan,  nan]

CodePudding user response:

import pandas as pd

s = pd.Series([0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09])

arr = []
index = 6
for i in range(index):
    # Slice a window of up to six values starting at position i
    arr.append(s.values[i:index + i])
print(arr)

output:

[array([0.74, 0.77, 0.72, 0.65, 0.24, 0.07]), array([0.77, 0.72, 0.65, 0.24, 0.07, 0.21]), array([0.72, 0.65, 0.24, 0.07, 0.21, 0.05]), array([0.65, 0.24, 0.07, 0.21, 0.05, 0.09]), array([0.24, 0.07, 0.21, 0.05, 0.09]), array([0.07, 0.21, 0.05, 0.09])]
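Note that the last two arrays in this output are shorter than six, whereas the question asks for NaN in the missing positions. A possible sketch of the same loop with padding follows; it assumes six rows are wanted, as shown in the question, and the names `window`, `n_rows`, `rows` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([0.74, 0.77, 0.72, 0.65, 0.24, 0.07, 0.21, 0.05, 0.09])
window = 6  # elements per row
n_rows = 6  # assumption: the question shows six output rows

rows = []
for i in range(n_rows):
    chunk = s.to_numpy()[i:i + window]
    # Fill the missing trailing positions with NaN so every row has `window` values.
    rows.append(np.pad(chunk, (0, window - len(chunk)), constant_values=np.nan))

result = np.vstack(rows)  # 2-D float array of shape (n_rows, window)
```

`result.tolist()` gives a list of lists that could be assigned to a column of a six-row DataFrame.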