Home > other >  Is there a method to save not missing values in another data frame?
Is there a method to save not missing values in another data frame?

Time:06-06

I have a data frame by 20441 rows and 158 columns. in each row, there are a lot of "NA" values. so I want to convert it to sth like this:

If a value is not NA, I save it's row name , column name and value in another data frame. for example my first data frame is :

row and column name c1 c2 c3 c4 c5
r1 NA NA NA 5 6
r2 1 3 NA NA NA
row name c1 c2
r1 c4 5
r1 c5 6
r2 c1 1
r2 c2 3

CodePudding user response:

Use melt:

out = (df.melt('row and column name', var_name='C1', value_name='C2')
         .dropna().astype({'C2': int}))

Output:

row and column name C1 C2
r2 c1 1
r2 c2 3
r1 c4 5
r1 c5 6

CodePudding user response:

The answer using df.melt works but is slower than the below code. I used %%timeit to measure the time of each code.

df.melt takes 2.47 ms ± 93.5 and below code takes 314 µs ± 10.4

if performance is not important then df.melt is better since it is a one line code.

row = []
cols = []
val = []
for col in data.columns[1:]: 
    for i,e in enumerate(data[col]): 
        if pd.isna(e) == False:
            row.append(data['row&col name'][i])
            cols.append(col)
            val.append(e)  
new_data = pd.DataFrame(list(zip(row,cols,val)),columns=['row','col','val'])

  • Related