Calculate percentage of interview participants who had education background

Time:12-19

I am really sorry if this question has already been asked. I searched through existing answers but haven't found one that matches my case.

I have a large dataframe with data that looks like this:

import pandas as pd
  
# Initialise the data as a dict of lists.
data = {'interview_key':['00-60-62-69', '00-80-63-65', '00-81-80-59', '00-87-72-75'],
        'any_education':['YES', 'YES', 'NO', 'NAN']}
  
# Create DataFrame
df = pd.DataFrame(data)
  
# Print the output.
df

This data represents a group of people who were interviewed. Each person either reported having some education (YES) or having no education at all (NO).

I want to do a simple task: find the percentage of people who had any form of education, i.e. those who answered YES.

How can this be done?

CodePudding user response:

I think it should look like this:

import pandas as pd

data = {'interview_key':['00-60-62-69', '00-80-63-65', '00-81-80-59', '00-87-72-75'],
        'any_education':['YES', 'YES', 'NO', 'NAN']}

df = pd.DataFrame(data)

# Count the number of "YES" and "NO" values in the any_education column
counts = df['any_education'].value_counts()

# Calculate the percentage of people who had any form of education
percentage = (counts['YES'] / (counts['YES'] + counts['NO'])) * 100

print(f'Percentage of people with any form of education: {percentage:.2f}%')
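
Note that the calculation above divides only by the rows that answered YES or NO, so the 'NAN' row is left out of the denominator. A minimal sketch of the two possible denominators, using the sample data from the question (the variable names here are illustrative, not from the original post):

```python
import pandas as pd

df = pd.DataFrame({
    'interview_key': ['00-60-62-69', '00-80-63-65', '00-81-80-59', '00-87-72-75'],
    'any_education': ['YES', 'YES', 'NO', 'NAN'],
})

# Number of respondents who answered YES.
yes_count = (df['any_education'] == 'YES').sum()

# Denominator 1: only rows with a real answer ('NAN' is a plain string here).
answered_count = df['any_education'].isin(['YES', 'NO']).sum()

# Denominator 2: every row in the frame, including the 'NAN' row.
total_count = len(df)

pct_of_answered = yes_count / answered_count * 100
pct_of_all = yes_count / total_count * 100

print(f'Of those who answered: {pct_of_answered:.2f}%')
print(f'Of all interviewed:    {pct_of_all:.2f}%')
```

Which of the two figures is the right one depends on whether unanswered interviews should count toward the total.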

CodePudding user response:

df['any_education'].value_counts(normalize=True)

YES    0.50
NO     0.25
NAN    0.25
Name: any_education, dtype: float64
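
If the 'NAN' placeholder is turned into a real missing value first, value_counts drops it by default and the fractions change. A rough sketch, assuming that 'YES' and 'NO' are the only valid answers (the names `answers` and `cleaned` are made up for this example):

```python
import pandas as pd

answers = pd.Series(['YES', 'YES', 'NO', 'NAN'], name='any_education')

# Keep valid answers; everything else (here the 'NAN' string) becomes missing.
cleaned = answers.where(answers.isin(['YES', 'NO']))

# Default dropna=True: missing answers are left out of the denominator.
pct_excluding_missing = cleaned.value_counts(normalize=True).mul(100)

# dropna=False: missing answers count toward the denominator.
pct_including_missing = cleaned.value_counts(normalize=True, dropna=False).mul(100)

print(pct_excluding_missing.get('YES', 0.0))
print(pct_including_missing.get('YES', 0.0))
```

Multiplying by 100 converts the fractions returned by normalize=True into percentages.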

CodePudding user response:

Try this

import pandas as pd

# initialise data of lists.
data = {'interview_key': ['00-60-62-69', '00-80-63-65', '00-81-80-59', '00-87-72-75'],
        'any_education': ['YES', 'YES', 'NO', 'NAN']}

# Create DataFrame
df = pd.DataFrame(data)

# Calculate percentage
total_yes = df['any_education'].value_counts()['YES']
total_rows = len(df)
percentage = total_yes / total_rows * 100

# print the output
print(f"{percentage = }%")

Output:

percentage = 50.0%
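
One caveat with indexing value_counts() by 'YES': it raises a KeyError when no row contains YES. A possible alternative is to take the mean of a boolean mask, which gives the same result over all rows and returns 0 in that case. `percent_yes` below is a hypothetical helper written for this sketch, not a pandas function:

```python
import pandas as pd

def percent_yes(answers: pd.Series) -> float:
    """Return the percentage of entries equal to 'YES' across all rows."""
    if answers.empty:
        return float('nan')
    # eq('YES') yields booleans; their mean is the fraction of YES rows.
    return float(answers.eq('YES').mean() * 100)

print(percent_yes(pd.Series(['YES', 'YES', 'NO', 'NAN'])))  # 50.0
print(percent_yes(pd.Series(['NO', 'NO'])))                 # 0.0, no KeyError
```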