Shift rows containing NaNs into their own column

Time:08-30

I am parsing a lot of netstat data, and so far I have handled this by removing the row and referencing it manually: if I see that Proto is NaN, I parse that whole row separately. But I am unable to append the row back to the rest of the DataFrame because the sizes do not match.

I was wondering whether it would be possible to take each row with empty columns and move it up to the preceding row, as a new column.

E.g., this is what my DataFrame looks like right now:

Proto LocalAddress ForeignAdress State PID
TCP [0.0.0.0:7] 0.0.0.0:0 LISTENING 4112
[tcpsvcs.exe]
TCP 0.0.0.0:111 0.0.0.0:0 LISTENING 4
Can not obtain ownership information

which will hopefully turn into:

Proto LocalAddress ForeignAdress State PID Process_name
TCP [0.0.0.0:7] 0.0.0.0:0 LISTENING 4112 tcpsvcs.exe
TCP 0.0.0.0:111 0.0.0.0:0 LISTENING 4 Can not obtain ownership information

Basically, I want to create a new column for the process names and move each name up onto the prior line.

CodePudding user response:

Try this:

You said the name is always in the next row, so we just need a Series taken from Proto that holds values only on the rows where all the other columns are NaN. Then we shift it up by one row and assign it as a new column.

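cols = ['LocalAddress', 'ForeignAdress', 'State', 'PID']
# Rows where every other column is NaN hold the process name in 'Proto'.
is_name_row = df[cols].isna().all(axis=1)
# Keep 'Proto' only on those rows, then shift up so each name sits on the row before it.
df['process_name'] = df['Proto'].where(is_name_row).shift(-1)
df = df.dropna(subset=cols)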

Output:

  Proto LocalAddress ForeignAdress      State     PID                          process_name
0   TCP  [0.0.0.0:7]     0.0.0.0:0  LISTENING  4112.0                         [tcpsvcs.exe]
2   TCP  0.0.0.0:111     0.0.0.0:0  LISTENING     4.0  Can not obtain ownership information
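
For anyone who wants to run this end to end, here is a minimal sketch of the same shift-based idea on a hand-built copy of the sample. The frame layout (process-name lines landing in Proto with NaN everywhere else) is an assumption about how the netstat output was read in, and the bracket stripping and integer cast at the end are optional extras to get closer to the output the question asks for.

```python
import numpy as np
import pandas as pd

# Hand-built sample mirroring the question. Assumption: name lines were parsed
# into 'Proto' and every other column is NaN on those rows.
df = pd.DataFrame({
    'Proto': ['TCP', '[tcpsvcs.exe]', 'TCP', 'Can not obtain ownership information'],
    'LocalAddress': ['[0.0.0.0:7]', np.nan, '0.0.0.0:111', np.nan],
    'ForeignAdress': ['0.0.0.0:0', np.nan, '0.0.0.0:0', np.nan],
    'State': ['LISTENING', np.nan, 'LISTENING', np.nan],
    'PID': [4112, np.nan, 4, np.nan],
})

cols = ['LocalAddress', 'ForeignAdress', 'State', 'PID']
is_name_row = df[cols].isna().all(axis=1)

# Pull each name up onto the row before it, then drop the name-only rows.
df['Process_name'] = df['Proto'].where(is_name_row).shift(-1)
result = df.dropna(subset=cols).reset_index(drop=True)

# Optional tidy-up: remove the surrounding [ ] and restore integer PIDs
# (the NaN rows had forced the PID column to float).
result['Process_name'] = result['Process_name'].str.strip('[]')
result['PID'] = result['PID'].astype(int)
print(result)
```

Because the shift works row by row, a connection row that is not followed by a name row simply gets NaN in Process_name instead of breaking the assignment.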

CodePudding user response:

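Something like this should work, as long as every connection row is followed by exactly one process-name row (otherwise the lengths will not match):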

mask = df['LocalAddress'].isna()
missing_vals = df.loc[mask, 'Proto'].values
df = df[~mask].copy()
df['Process_name'] = missing_vals
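
A minimal sketch of the same mask-based approach wrapped in a function, with a length check added so a mismatch fails with a clear message. The helper name attach_process_names and the sample frame are hypothetical, made up for illustration; the frame assumes the name lines were parsed into Proto with NaN in the other columns.

```python
import numpy as np
import pandas as pd

def attach_process_names(df):
    """Hypothetical helper: move name-only rows into a 'Process_name' column."""
    mask = df['LocalAddress'].isna()
    names = df.loc[mask, 'Proto'].to_numpy()
    out = df[~mask].copy()
    # The assignment below is positional, so it needs one name per connection
    # row. Equal counts are a necessary sanity check, not a proof of pairing.
    if len(names) != len(out):
        raise ValueError('expected exactly one name row per connection row')
    out['Process_name'] = names
    return out

# Hand-built sample mirroring the question (layout is an assumption).
sample = pd.DataFrame({
    'Proto': ['TCP', '[tcpsvcs.exe]', 'TCP', 'Can not obtain ownership information'],
    'LocalAddress': ['[0.0.0.0:7]', np.nan, '0.0.0.0:111', np.nan],
    'ForeignAdress': ['0.0.0.0:0', np.nan, '0.0.0.0:0', np.nan],
    'State': ['LISTENING', np.nan, 'LISTENING', np.nan],
    'PID': [4112, np.nan, 4, np.nan],
})

out = attach_process_names(sample)
print(out)
```

Note that the original index labels (0 and 2 here) are kept; call reset_index(drop=True) on the result if consecutive labels are wanted.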

CodePudding user response:

First, start by adding the column "Process_name" and use [image]
