Create new columns where row values are NaN

Time:06-02

I have a column of data in which some rows are NaN (see the sample below). I want to split the column at the NaN rows and start a new column wherever a value appears after a NaN. For instance, I want a new column that starts at row 7 and holds the rows that follow the NaN values. I have tried this, but it crams the data together.

    Col1
0   Start
1   65
2   oft
3   23:59:02
4   12-Feb-99
5   NaN
6   NaN
7   17
8   Sparkle
9   10

I used the code below to break the rows into groups:

df['groups'] = df['Col1'].isnull().cumsum()

    Col1       groups
0   Start      0
1   65         0
2   oft        0
3   23:59:02   0
4   12-Feb-99  0
5   NaN        1
6   NaN        2
7   17         2
8   Sparkle    2
9   10         2
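The grouping step can be reproduced on the sample data with the minimal sketch below. It assumes the column is named Col1 and that the missing entries are real NaN values rather than the string "NaN"; the frame is rebuilt by hand for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the NaN entries are assumed to be real missing values
df = pd.DataFrame({"Col1": ["Start", 65, "oft", "23:59:02", "12-Feb-99",
                            np.nan, np.nan, 17, "Sparkle", 10]})

# Each NaN increments the counter, so every row is labelled
# with the number of NaNs seen up to and including that row
df["groups"] = df["Col1"].isna().cumsum()

print(df)
```

Because the counter increases on the NaN row itself, each NaN becomes the first row of its group, which is why row 6 carries the same group number as rows 7 to 9.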

I now want to stack the data into separate columns based on the group numbers:

    Col1       Col2   Col3      ...   ColN
0   Start      NaN    NaN       ...
1   65                17        ...
2   oft               Sparkle   ...
3   23:59:02          10        ...
4   12-Feb-99

CodePudding user response:

I suggest slicing the pandas DataFrame manually instead of using NumPy to slice it.

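import numpy as np
import pandas as pd

# Get the index labels of the NaN rows (assumes the default 0..n-1 index);
# "col" is the name of the data column (Col1 in the question)
index = df.index[df["col"].isna()].to_list()

# Each block of values starts just after a NaN and ends just before the next one
starting_index = [0] + [i + 1 for i in index]
ending_index = [i - 1 for i in index] + [len(df) - 1]

n = 0

for i, j in zip(starting_index, ending_index):
    if i <= j:  # skip empty blocks, e.g. between consecutive NaNs
        n += 1
        # Start from an all-NaN object column so mixed values can be stored
        df[f"col{n}"] = pd.Series(np.nan, index=df.index, dtype=object)
        df.loc[: j - i, f"col{n}"] = df.loc[i:j, "col"].values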
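As an alternative sketch (not the slicing method above), the same layout can be built from the group numbers computed in the question: drop the NaN rows, collect each group's values into its own Series, and concatenate the Series side by side. The sample frame and the names `values`, `blocks` and `result` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand
df = pd.DataFrame({"Col1": ["Start", 65, "oft", "23:59:02", "12-Feb-99",
                            np.nan, np.nan, 17, "Sparkle", 10]})

# Label rows by the number of NaNs seen so far, as in the question
groups = df["Col1"].isna().cumsum()

# Keep only the real values, then give each group a fresh 0-based index
values = df["Col1"].dropna()
blocks = [block.reset_index(drop=True)
          for _, block in values.groupby(groups[values.index])]

# Concatenating side by side pads the shorter blocks with NaN
result = pd.concat(blocks, axis=1,
                   keys=[f"Col{k}" for k in range(1, len(blocks) + 1)])
print(result)
```

Like the loop above, this produces no column for a group that contains only NaN, so the sample data yields two columns rather than three. Unlike the loop, it returns a new, shorter frame instead of adding columns to `df`.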