I am trying to add a new column "profile_type" to a dataframe "df_new". It should contain the string "Decision Maker" if "job_title" has any one of the following words: (Head or VP or COO or CEO or CMO or CLO or Chief or Partner or Founder or Owner or CIO or CTO or President or Leaders),
"Key Influencer" if "job_title" has any one of the following words: (Senior or Consultant or Manager or Learning or Training or Talent or HR or Human Resources or L&D or Lead), and
"Influencer" for all other values of "job_title".
For example, if 'job_title' contains the value "Learning and Development Specialist", the code should match on the word 'Learning' and classify that row as 'Key Influencer' under 'profile_type'.
CodePudding user response:
I would try something like this:
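import re
import numpy as np
dm_titles = ['Head', 'VP', 'COO']  # ...plus the remaining Decision Maker words
ki_titles = ['Senior', 'Consultant', 'Manager']  # ...plus the remaining Key Influencer words
# One regex per group; \b matches whole words only, so 'VP' does not match inside 'SVP'
dm_pattern = r'\b(?:' + '|'.join(map(re.escape, dm_titles)) + r')\b'
ki_pattern = r'\b(?:' + '|'.join(map(re.escape, ki_titles)) + r')\b'
conditions = [
    # `word in series` tests the index, not the text, so use str.contains instead
    df_new['job_title'].str.contains(dm_pattern, na=False),
    df_new['job_title'].str.contains(ki_pattern, na=False),
]
values = ["Decision Maker", "Key Influencer"]
# The first matching condition wins; rows matching neither get the default
df_new['profile_type'] = np.select(conditions, values, default="Influencer")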
Let me know if you need any clarification!
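In case it helps, here is a minimal end-to-end sketch of the same idea using `str.contains` and `np.select`. The word lists are copied from the question; the sample rows and the `word_pattern` helper are made up for illustration, and matching is assumed to be case-sensitive and on whole words only.

```python
import re

import numpy as np
import pandas as pd

# Word lists copied from the question.
dm_words = ["Head", "VP", "COO", "CEO", "CMO", "CLO", "Chief", "Partner",
            "Founder", "Owner", "CIO", "CTO", "President", "Leaders"]
ki_words = ["Senior", "Consultant", "Manager", "Learning", "Training",
            "Talent", "HR", "Human Resources", "L&D", "Lead"]


def word_pattern(words):
    """Build a regex that matches any of the words as a whole word (hypothetical helper)."""
    return r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"


# Hypothetical sample data, for illustration only.
df_new = pd.DataFrame({"job_title": [
    "Learning and Development Specialist",
    "VP of Sales",
    "Software Engineer",
    "Senior VP, Marketing",
    None,
]})

titles = df_new["job_title"]
conditions = [
    titles.str.contains(word_pattern(dm_words), na=False),  # checked first
    titles.str.contains(word_pattern(ki_words), na=False),
]
# Rows matching neither condition (including missing titles) get the default.
df_new["profile_type"] = np.select(
    conditions, ["Decision Maker", "Key Influencer"], default="Influencer"
)
print(df_new)
```

Because `np.select` takes the first condition that is true, a title such as "Senior VP, Marketing" ends up as "Decision Maker" even though it also contains a "Key Influencer" word. If you would rather ignore case, passing `case=False` to `str.contains` should do it, though short words like "HR" may then need extra care.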
CodePudding user response:
First, define a function that acts on a row of the dataframe and returns what you want: in your case, 'Decision Maker' if the job_title contains any of the words in your list.
def is_key_worker(row):
    if "CTO" in row["job_title"] or "Founder" in row["job_title"]:  # add more here
        return "Decision Maker"
    return "Influencer"
Next, apply the function to your dataframe, along axis 1.
df_new["profile_type"] = df_new.apply(is_key_worker, axis=1)
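To cover all three categories with the row-wise approach, something along these lines might work. It is only a sketch: the word lists come from the question, while the function name `get_profile_type`, the compiled patterns, and the sample rows are assumptions made for illustration.

```python
import re

import pandas as pd

# Word lists copied from the question.
DM_WORDS = ["Head", "VP", "COO", "CEO", "CMO", "CLO", "Chief", "Partner",
            "Founder", "Owner", "CIO", "CTO", "President", "Leaders"]
KI_WORDS = ["Senior", "Consultant", "Manager", "Learning", "Training",
            "Talent", "HR", "Human Resources", "L&D", "Lead"]

# Compile once so the patterns are not rebuilt for every row.
DM_RE = re.compile(r"\b(?:" + "|".join(map(re.escape, DM_WORDS)) + r")\b")
KI_RE = re.compile(r"\b(?:" + "|".join(map(re.escape, KI_WORDS)) + r")\b")


def get_profile_type(row):
    """Return the profile type for one row (hypothetical helper name)."""
    title = row["job_title"]
    if not isinstance(title, str):
        return "Influencer"  # missing titles fall into the catch-all group
    if DM_RE.search(title):
        return "Decision Maker"
    if KI_RE.search(title):
        return "Key Influencer"
    return "Influencer"


# Hypothetical sample data, for illustration only.
df_new = pd.DataFrame(
    {"job_title": ["Chief Technology Officer", "HR Manager", "Data Analyst"]}
)
df_new["profile_type"] = df_new.apply(get_profile_type, axis=1)
print(df_new)
```

`apply` with `axis=1` calls the function once per row, so it tends to be slower than a vectorized `str.contains` on large dataframes, but the logic is easy to read and extend.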