How to count number of columns which are not empty

I have a data frame that looks like the following one:

I need to add columns that will include the following:

Number of job{i}_start_year columns which are not empty at the end of each row
Number of edu{i}_start_year columns which are not empty at the end of each row

Any help would be greatly appreciated.

CodePudding user response：

Use DataFrame.assign with the columns selected by name via DataFrame.filter. If necessary, replace empty strings with missing values, forward fill the missing values along each row with ffill, and select the last column by position with DataFrame.iloc. This yields the last non-empty value in each row:

import numpy as np

df = df.assign(job=df.filter(regex=r'job\d').replace('', np.nan).ffill(axis=1).iloc[:, -1],
               edu=df.filter(regex=r'edu\d').replace('', np.nan).ffill(axis=1).iloc[:, -1])
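
To make the effect of this chain concrete, here is a minimal sketch on a small made-up frame. The sample values and the helper name last_non_empty are assumptions for illustration, not part of the question. It shows that the approach returns the last non-empty value per row, not a count:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column-name pattern comes from the question.
df = pd.DataFrame({
    "job1_start_year": ["2001", "1999", ""],
    "job2_start_year": ["2005", "", ""],
    "edu1_start_year": ["1995", "1990", "2010"],
    "edu2_start_year": ["1998", "", "2014"],
})

def last_non_empty(frame, pattern):
    """Return, per row, the last non-empty value among columns matching `pattern`."""
    # Turn empty strings into NaN so that ffill can skip over them.
    subset = frame.filter(regex=pattern).replace("", np.nan)
    # Forward fill along each row; the last column then holds the last known value.
    return subset.ffill(axis=1).iloc[:, -1]

out = df.assign(
    job=last_non_empty(df, r"job\d"),
    edu=last_non_empty(df, r"edu\d"),
)
print(out[["job", "edu"]])
```

A row in which every matching column is empty ends up with NaN in the new column.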

CodePudding user response：

As I understand it, you want extra columns at the end of each row holding the number of non-empty years in the job and edu categories. Try:

job_columns = ['job1_start_year', 'job2_start_year', 'job3_start_year']
edu_columns = ['edu1_start_year', 'edu2_start_year', 'edu3_start_year']

df['job_count'] = df[job_columns].ne('').sum(axis=1)
df['edu_count'] = df[edu_columns].ne('').sum(axis=1)
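
Two caveats apply to the snippet above: the column lists are hard-coded, and ne('') treats NaN cells as non-empty. The following is a rough sketch of a variant that selects columns by pattern and counts a cell only if it is neither NaN nor an empty string. The sample data and the helper name count_non_empty are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mixing empty strings and NaN as "empty" markers.
df = pd.DataFrame({
    "job1_start_year": ["2001", "1999", ""],
    "job2_start_year": ["2005", "", ""],
    "job3_start_year": ["", "", ""],
    "edu1_start_year": ["1995", "1990", "2010"],
    "edu2_start_year": ["1998", np.nan, "2014"],
})

def count_non_empty(frame, pattern):
    """Count, per row, the columns matching `pattern` that hold a real value."""
    subset = frame.filter(regex=pattern)
    # A cell counts only if it is not NaN and not an empty string.
    has_value = subset.notna() & subset.ne("")
    return has_value.sum(axis=1)

result = df.assign(
    job_count=count_non_empty(df, r"job\d+_start_year"),
    edu_count=count_non_empty(df, r"edu\d+_start_year"),
)
print(result[["job_count", "edu_count"]])
```

Because the columns are matched by regex, the same call keeps working if the frame has more than three job or edu columns.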