How to fill in the blanks for one (or more) columns by calculating based on multiple conditions?

Time:05-15

I collect and process hundreds of jumbled tables every day, and some of the data requires attention.

sample data

(Sorry for my English and typography skills)

Items with [LEVEL] "S" appear rarely. When one does (rows 3 and 6) and its ["CHECK"] column is "Go" (row 6), the blanks in the target columns need to be filled in. (Cells that are not empty keep their original value.)

  1. The blank in the ["FillNA"] column needs to be filled with "DANGER".

  2. (If possible) The blank in the ["Detail"] column needs to be filled with the row number of the dataframe plus the action indicated in the ["CHECK"] column.

import pandas as pd

N = pd.NA
df = pd.DataFrame({ "LEVEL":["E","D","C","S","A","B","S","A","B","C"],
                    "CHECK":["HELLO",N ,N,"Keep",N,N,"Go",N,N,N],
                    "FillNA":["WORLD",N ,N,"Data",N,N,N,N,N,N],
                    "Detail":[N,N,N,N,N,N,N,N,N,N] })
print(df)

CodePudding user response:

IIUC,

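# rows where LEVEL is "S" and CHECK is "Go"
m = df['LEVEL'].eq('S') & df['CHECK'].eq('Go')
# fill only cells that are still blank, so existing values are retained
df['FillNA'] = df['FillNA'].mask(m & df['FillNA'].isna(), 'DANGER')
df['Detail'] = df['Detail'].mask(m & df['Detail'].isna(), df.index.astype(str) + ' ' + df['CHECK'])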

# or

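cols = ['FillNA', 'Detail']
# assign the result back: fillna(inplace=True) on a .loc selection would only change a temporary copy
df.loc[m, cols] = df.loc[m, cols].fillna({'FillNA': 'DANGER', 'Detail': df.index.astype(str) + ' ' + df['CHECK']})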
print(df)

  LEVEL  CHECK  FillNA Detail
0     E  HELLO   WORLD   <NA>
1     D   <NA>    <NA>   <NA>
2     C   <NA>    <NA>   <NA>
3     S   Keep    Data   <NA>
4     A   <NA>    <NA>   <NA>
5     B   <NA>    <NA>   <NA>
6     S     Go  DANGER   6 Go
7     A   <NA>    <NA>   <NA>
8     B   <NA>    <NA>   <NA>
9     C   <NA>    <NA>   <NA>
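
Since the question mentions processing many tables a day, the same masking idea can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name `fill_flagged` and its parameters are made up for illustration, and it assumes every table has the LEVEL, CHECK, FillNA and Detail columns shown above.

```python
import pandas as pd


def fill_flagged(df, level="S", check="Go", label="DANGER"):
    """Hypothetical helper: return a copy of df with blanks filled on flagged rows."""
    out = df.copy()
    # Rows where both conditions hold (missing values compare as False here).
    flagged = out["LEVEL"].eq(level) & out["CHECK"].eq(check)

    # Only blank cells are touched, so existing values are retained.
    blank_fill = flagged & out["FillNA"].isna()
    out.loc[blank_fill, "FillNA"] = label

    # "<row label> <CHECK value>" for every row; only the flagged blanks are used.
    text = pd.Series(out.index.astype(str), index=out.index) + " " + out["CHECK"]
    blank_detail = flagged & out["Detail"].isna()
    out.loc[blank_detail, "Detail"] = text[blank_detail]
    return out


# Usage with the sample data from the question.
N = pd.NA
sample = pd.DataFrame({
    "LEVEL": ["E", "D", "C", "S", "A", "B", "S", "A", "B", "C"],
    "CHECK": ["HELLO", N, N, "Keep", N, N, "Go", N, N, N],
    "FillNA": ["WORLD", N, N, "Data", N, N, N, N, N, N],
    "Detail": [N] * 10,
})
result = fill_flagged(sample)
print(result)
```

Returning a copy instead of modifying the input is a design choice here, so the original table stays available for comparison; drop the `copy()` if in-place updates are preferred.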