I am building a dataframe with a new column 'Categories' that should hold a text category whenever the 'Study Title' column contains specific text. For example, a title containing 'Child Care' should get the category 'Child Care', a title containing 'Head Start' should get 'School Readiness', and so on. How can I check each title against a list of strings and return the value I assign to each one?
This is an example of what I would like the output to look like where the column 'Categories' returns the value 'Child Care' if the title contains the words 'Child Care'.
I have more conditions I want to include for other title considerations like 'School Readiness' for 'Head Start'. So far what I have only returns a true/false value based on the one condition for 'Child Care'.
This is the code I have so far:
df['Categories'] = df['Study Title'].str.contains(r'Child Care', na=True)
df
Copy-Paste CSV Sample
Study Title,URL,Funding Agency,Category
"American Indian and Alaska Native Head Start Family and Child Experiences Survey, 2015",https://www.icpsr.umich.edu/web/ICPSR/studies/36804,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
American Indian and Alaska Native Head Start Family and Child Experiences Survey 2019 (AIAN FACES 2019),https://www.icpsr.umich.edu/web/ICPSR/studies/38028,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Carolina Abecedarian Project (ABC) and the Carolina Approach to Responsive Education (CARE), Age 21 Follow Up Study, 1993 - 2003",https://www.icpsr.umich.edu/web/ICPSR/studies/32262,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Child Care and Development Fund (CCDF) Policies Database, 2009",https://www.icpsr.umich.edu/web/ICPSR/studies/32261,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Child Care and Development Fund (CCDF) Policies Database, 2011",https://www.icpsr.umich.edu/web/ICPSR/studies/34390,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Child Care and Development Fund (CCDF) Policies Database, 2012",https://www.icpsr.umich.edu/web/ICPSR/studies/34902,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Child Care and Development Fund (CCDF) Policies Database, 2013",https://www.icpsr.umich.edu/web/ICPSR/studies/35482,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Child Care and Development Fund (CCDF) Policies Database, 2014",https://www.icpsr.umich.edu/web/ICPSR/studies/36276,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
"Child Care and Development Fund (CCDF) Policies Database, 2015",https://www.icpsr.umich.edu/web/ICPSR/studies/36581,United States Department of Health and Human Services. Administration for Children and Families. Office of Planning Research and Evaluation,
Answer:
What you're describing is a mapping from title text to a category. A plain membership test such as str.contains can only tell you True or False, so it is not enough on its own. Define a mapping function and apply it to the column:
In [8]: def determine_category(title):
   ...:     # Checks run in order, so the first matching keyword wins.
   ...:     if "Child Care" in title:
   ...:         return "Child Care"
   ...:     elif "Head Start" in title:
   ...:         return "School Readiness"
   ...:     return "no matching category"
   ...:
In [9]: df["Study Title"].apply(determine_category)
Out[9]:
0        School Readiness
1        School Readiness
2    no matching category
3              Child Care
4              Child Care
5              Child Care
6              Child Care
7              Child Care
8              Child Care
Name: Study Title, dtype: object
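If the list of conditions keeps growing, one option is to move the keyword/category pairs out of the if/elif chain and into a dict, so adding a condition means adding one entry. The following is only a sketch under that assumption: the names CATEGORY_RULES and categorize_title are made up for illustration, and the small dataframe reuses three titles from the sample plus a missing value to show how non-string titles could be handled.

```python
import pandas as pd

# Hypothetical rule table: keyword to look for -> category to assign.
# Dicts keep insertion order, so the first keyword found in a title wins.
CATEGORY_RULES = {
    "Child Care": "Child Care",
    "Head Start": "School Readiness",
}


def categorize_title(title, rules=CATEGORY_RULES, default="no matching category"):
    """Return the category of the first keyword contained in the title."""
    # Missing titles (NaN/None) are not strings, so they get the default.
    if not isinstance(title, str):
        return default
    for keyword, category in rules.items():
        if keyword in title:
            return category
    return default


df = pd.DataFrame(
    {
        "Study Title": [
            "American Indian and Alaska Native Head Start Family and Child Experiences Survey, 2015",
            "Carolina Abecedarian Project (ABC) and the Carolina Approach to Responsive Education (CARE), Age 21 Follow Up Study, 1993 - 2003",
            "Child Care and Development Fund (CCDF) Policies Database, 2009",
            None,
        ]
    }
)

# Store the result in the new column instead of only displaying it.
df["Categories"] = df["Study Title"].apply(categorize_title)
print(df["Categories"].tolist())
```

The matching here is case-sensitive and literal, the same as the `in` checks above. If a title can match more than one keyword, the order of the entries in the dict decides which category it gets.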