I have a column that looks like this:
group
A
A
A
B
B
C
The value C appears in the column sometimes, but not always. The code below works fine when C is present. However, if C does not occur in the column, it raises a KeyError.
value_counts = df.group.value_counts()
new_df["C"] = value_counts.C
I want to check whether C has a count. If it does not, I want to assign new_df["C"] a value of 0. I tried this, but I still get a KeyError. What else can I try?
value_counts = df.group.value_counts()
new_df["C"] = value_counts.C
if (df.group.value_counts()['consents']):
    new_df["C"] = value_counts.consents
else:
    new_df["C"] = 0
CodePudding user response:
One way of doing it is to convert the Series into a dictionary and look up the key with a default value, which is returned when the key is not found (0 in your case):
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'B', 'B', 'D']})
new_df = {}
character = "C"
# dict.get returns the default (0) instead of raising when the key is missing
new_df[character] = df.group.value_counts().to_dict().get(character, 0)
Output of new_df:
{'C': 0}
However, I am not sure what new_df should be. It seems to be a dictionary, or is it a new DataFrame object?
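If the dictionary conversion feels unnecessary, a Series also has its own get method that takes a default, so the lookup can probably be done directly on the result of value_counts(). A minimal sketch, assuming the same sample data as above (the names counts, c_count and a_count are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'B', 'B', 'D']})

# value_counts() returns a Series indexed by the distinct values
counts = df['group'].value_counts()

# Series.get returns the default instead of raising KeyError
c_count = counts.get('C', 0)  # 'C' is absent, so the default is used
a_count = counts.get('A', 0)  # 'A' is present, so its count is returned

print(c_count, a_count)
```

This should work the same way whether new_df is a dictionary or a DataFrame, since only the right-hand side of the assignment changes.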
CodePudding user response:
One way could be to convert the group column to a Categorical type with specified categories, e.g.:
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'A', 'B', 'B']})
print(df)
#   group
# 0     A
# 1     A
# 2     A
# 3     B
# 4     B
categories = ['A', 'B', 'C']
df['group'] = pd.Categorical(df['group'], categories=categories)
df['group'].value_counts()
[out]
A    3
B    2
C    0
Name: group, dtype: int64
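If you would rather not change the dtype of the column, a similar effect can likely be had by reindexing the counts against the labels you expect and filling the missing ones with 0. This is only a sketch under the assumption that the full set of labels is known in advance; expected_labels is a made-up name:

```python
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'A', 'B', 'B']})

# Assumption: the complete list of labels is known up front
expected_labels = ['A', 'B', 'C']

# reindex adds the missing labels and fills their counts with 0
counts = df['group'].value_counts().reindex(expected_labels, fill_value=0)

# Plain key access no longer raises, because 'C' is now in the index
c_count = counts['C']
print(counts)
```

Compared with the Categorical approach, this leaves df['group'] untouched and only affects the counts Series.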