I have a data frame and a dictionary like this:
df:
ID Science Social
1 12 24
2 NaN 13
3 26 NaN
4 23 35
count_dict = {Science:30, Social: 40}
For every course column in the data frame, I want to create 2 new columns such that:
Col-1(Course_Count): If the course column is not null, then the new column gets the value from the dictionary, else it will remain Null.
Col-2(Course_%): Course/Course_Count
The output looks like this:
df:
ID Science Science_Count Science_% Social Social_Count Social_%
1 12 30 12/30 24 40 24/40
2 NaN 13 40 13/40
3 26 30 26/30 NaN
4 23 30 23/30 35 40 35/40
Can anyone help me with this?
CodePudding user response:
try this:
column_name=list(df.columns)
for column in column_name:
df[f"{column}_Count"]=df.apply(lambda x:count_dict[column] if x==None else None,axis=1)
CodePudding user response:
If not any column in your dataframe is a course column, you can specify only the course column names in the courses
list. Now I am just skipping the first column there ('ID'):
courses = df.columns[1:]
order = ['ID'] [col for course in courses for col in (course, course '_Count', course '_%')]
for course in courses:
df[course '_Count'] = count_dict[course]
df.loc[df[course].isna(), course '_Count'] = np.nan
df[course '_%'] = df[course] / df[course '_Count']
df = df[order] # reorder the columns
Result:
ID Science Science_Count Science_% Social Social_Count Social_%
0 1 12.0 30.0 0.400000 24.0 40.0 0.600
1 2 NaN NaN NaN 13.0 40.0 0.325
2 3 26.0 30.0 0.866667 NaN NaN NaN
3 4 23.0 30.0 0.766667 35.0 40.0 0.875