Home > database >  How can I find a value in a row that is not a NaN value?
How can I find a value in a row that is not a NaN value?

Time:01-04

I know the annual price for four different apples.

Unfortunately, there are years when some apples are missing in price.

I would like to know the unit price for the first year in which the unit price was entered except for the missing year.

The code is as below.

import pandas as pd
import numpy as np

a = {'Price_Y19':[np.nan,np.nan,np.nan,10],
     'Price_Y20':[np.nan,np.nan,10,9],
     'Price_Y21':[np.nan,10,9,8],
     'Price_Y22':[10,9,8,7]}

index_name = ['yellow apple','red apple','white apple','gray apple']

df = pd.DataFrame(data = a,
                  index = index_name)
df

I would like to get below DataFrame

b = {'Price_Y19':[np.nan,np.nan,np.nan,10],
     'Price_Y20':[np.nan,np.nan,10,9],
     'Price_Y21':[np.nan,10,9,8],
     'Price_Y22':[10,9,8,7],
     'Price_Initial':[10,10,10,10],
     'Price_Final':[10,9,8,7],
     'Price_Gap':[0,1,2,3]}

df1 = pd.DataFrame(data = b,
                  index = index_name)
df1

I always wait your answer. Thanks

I don't have any ideas how to make code.

df['Price_Initial] = some mehode

CodePudding user response:

You can backward or forward filling missing values with select first or last column by position for Initial/Final prices, last subtract new columns:

df = df.assign(Price_Initial = df.bfill(axis=1).iloc[:, 0],
               Price_Final = df.ffill(axis=1).iloc[:, -1],
               Price_Gap = lambda x: x['Price_Initial'].sub(x['Price_Final']))

print (df)
              Price_Y19  Price_Y20  Price_Y21  Price_Y22  Price_Initial  \
yellow apple        NaN        NaN        NaN         10           10.0   
red apple           NaN        NaN       10.0          9           10.0   
white apple         NaN       10.0        9.0          8           10.0   
gray apple         10.0        9.0        8.0          7           10.0   

              Price_Final  Price_Gap  
yellow apple         10.0        0.0  
red apple             9.0        1.0  
white apple           8.0        2.0  
gray apple            7.0        3.0 

If multiple columns and need filter only Price_YY columns:

df1 = df.filter(regex='Price_Y\d{2}')
df = df.assign(Price_Initial = df1.bfill(axis=1).iloc[:, 0],
               Price_Final = df1.ffill(axis=1).iloc[:, -1],
               Price_Gap = lambda x: x['Price_Initial'].sub(x['Price_Final']))

print (df)
              Price_Y19  Price_Y20  Price_Y21  Price_Y22  Price_Initial  \
yellow apple        NaN        NaN        NaN         10           10.0   
red apple           NaN        NaN       10.0          9           10.0   
white apple         NaN       10.0        9.0          8           10.0   
gray apple         10.0        9.0        8.0          7           10.0   

              Price_Final  Price_Gap  
yellow apple         10.0        0.0  
red apple             9.0        1.0  
white apple           8.0        2.0  
gray apple            7.0        3.0  

CodePudding user response:

With pandas.Series.first_valid_index and pandas.Series.last_valid_index functions:

df['Price_Initial'], df['Price_Final'] = df.apply(lambda x: (x[x.first_valid_index()], x[x.last_valid_index()])).values
df['Price_Gap'] = df['Price_Initial'] - df['Price_Final']
print(df)

              Price_Y19  Price_Y20  Price_Y21  Price_Y22  Price_Initial  \
yellow apple        NaN        NaN        NaN         10           10.0   
red apple           NaN        NaN       10.0          9           10.0   
white apple         NaN       10.0        9.0          8           10.0   
gray apple         10.0        9.0        8.0          7           10.0   

              Price_Final  Price_Gap  
yellow apple         10.0        0.0  
red apple             9.0        1.0  
white apple           8.0        2.0  
gray apple            7.0        3.0  
  • Related