If a row is missing, use 0; if it is not missing, use the row's value

Time:09-21

def compute(tick):
    df = pd.read_csv(f'{tick}.csv')
    a = df.loc['a'].sum()
    b = df.loc['b'].sum()
    c = df.loc['c'].sum()
    d = (a + b) / c
    return d

In some DataFrames there is no row 'b', so the function raises a KeyError. I then tried the following code, but it doesn't work. Can anyone help me with a solution to this problem?

def compute(tick):
    df = pd.read_csv(f'{tick}.csv')
    a = df.loc['a'].sum()
    if df.loc['b'].isnull():
        b == 0
    else:
        b = df.loc['b'].sum()
    c = df.loc['c'].sum()
    d = (a + b) / c
    return d

CodePudding user response:

Try this:

def compute(tick):
    df = pd.read_csv(f'{tick}.csv')
    
    if 'b' in df.index:  # Check whether row 'b' exists.
        b = df.loc['b'].sum()
    else:
        b = 0

    a = df.loc['a'].sum()
    c = df.loc['c'].sum()
    d = (a + b) / c
    return d
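
The membership check above can be tried without any CSV file. This is only a minimal sketch: the helper name `ratio_with_optional_b` and the sample data are hypothetical, and it assumes the labels 'a', 'b', 'c' live in the row index (for a CSV that usually means reading it with something like `index_col=0`).

```python
import pandas as pd

def ratio_with_optional_b(df: pd.DataFrame) -> float:
    """Return (a + b) / c, treating a missing row 'b' as 0."""
    a = df.loc['a'].sum()
    c = df.loc['c'].sum()
    # Test membership against the row index, not the columns.
    b = df.loc['b'].sum() if 'b' in df.index else 0
    return (a + b) / c

# Hypothetical sample data standing in for the CSV contents.
full = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]}, index=['a', 'b', 'c'])
no_b = full.drop(index='b')

print(ratio_with_optional_b(full))  # (5 + 7) / 9
print(ratio_with_optional_b(no_b))  # 5 / 9
```

Note that the attempt in the question fails for two separate reasons: `df.loc['b']` raises the KeyError before `.isnull()` is ever reached, and `b == 0` is a comparison rather than an assignment.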

CodePudding user response:

Use DataFrame.reindex to add any of the rows a, b, or c that do not exist, filled with 0. After reindexing, df also contains only those three rows:

def compute(tick):
    df = pd.read_csv(f'{tick}.csv').reindex(['a','b','c'], fill_value=0)
    a = df.loc['a'].sum()
    b = df.loc['b'].sum()
    c = df.loc['c'].sum()
    d = (a + b) / c
    return d

Since the same function is called three times, it is also possible to sum once per row with axis=1:

def compute(tick):
    df = pd.read_csv(f'{tick}.csv').reindex(['a','b','c'], fill_value=0)
    abc = df.sum(axis=1)

    a = abc.loc['a']
    b = abc.loc['b']
    c = abc.loc['c']
    d = (a + b) / c
    return d
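
To see what the reindex step does, here is a small sketch on an in-memory DataFrame. The data is hypothetical: row 'b' is absent and there is an unrelated row 'z'. It again assumes the labels are in the row index.

```python
import pandas as pd

# Hypothetical data: no row 'b', plus an extra row 'z' that should be ignored.
df = pd.DataFrame({'x': [1, 3, 10], 'y': [4, 6, 20]}, index=['a', 'c', 'z'])

# reindex keeps only the requested labels and creates missing ones filled with 0.
fixed = df.reindex(['a', 'b', 'c'], fill_value=0)
row_sums = fixed.sum(axis=1)

result = (row_sums['a'] + row_sums['b']) / row_sums['c']
print(list(fixed.index))  # ['a', 'b', 'c']
print(result)             # (5 + 0) / 9
```

One caveat: reindex expects the existing row labels to be unique and raises an error if the index contains duplicates.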

CodePudding user response:

How about:

def compute(tick):
    df = pd.read_csv(f'{tick}.csv')
    a = df.loc['a'].sum()
    c = df.loc['c'].sum()
    try:
        return (a + df.loc['b'].sum()) / c
    except KeyError:
        # Row 'b' is missing, so treat it as 0.
        return a / c
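
For an end-to-end check of the try/except approach, the sketch below reads CSV text from memory instead of a file. Several things here are assumptions rather than part of the question: the `source` parameter, the column names in the sample text, and `index_col=0`, which makes the first CSV column the row labels so that `df.loc['a']` can find them.

```python
import io
import pandas as pd

def compute(source) -> float:
    # Assumption: the first CSV column holds the row labels ('a', 'b', 'c').
    df = pd.read_csv(source, index_col=0)
    a = df.loc['a'].sum()
    c = df.loc['c'].sum()
    try:
        b = df.loc['b'].sum()
    except KeyError:
        # Row 'b' is missing, so treat it as 0.
        b = 0
    return (a + b) / c

# Hypothetical CSV contents, with and without row 'b'.
csv_without_b = "label,x,y\na,1,4\nc,3,6\n"
csv_with_b = "label,x,y\na,1,4\nb,2,5\nc,3,6\n"

print(compute(io.StringIO(csv_without_b)))  # 5 / 9
print(compute(io.StringIO(csv_with_b)))     # (5 + 7) / 9
```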