Hoping someone can help me here - i believe i am close to the solution.
I have a dataframe, of which i have am using .count() in order to return a series of all column names of my dataframe, and each of their respective non-NAN value counts.
Example dataframe:
feature_1 | feature_2 |
---|---|
1 | 1 |
2 | NaN |
3 | 2 |
4 | NaN |
5 | 3 |
Example result for .count() here would output a series that looks like:
feature_1 5
feature_2 3
I am now trying to get this data into a dataframe, with the column names "Feature" and "Count". To have the expected output look like this:
Feature | Count |
---|---|
feature_1 | 5 |
feature_2 | 3 |
I am using .to_frame() to push the series to a dataframe in order to add column names. Full code:
df = data.count()
df = df.to_frame()
df.columns = ['Feature', 'Count']
However receiving this error message - "ValueError: Length mismatch: Expected axis has 1 elements, new values have 2 elements", as if though it is not recognising the actual column names (Feature) as a column with values.
How can i get it to recognise both Feature and Count columns to be able to add column names to them?
CodePudding user response:
Add Series.reset_index
instead Series.to_frame
for 2 columns DataFrame - first column from index, second from values of Series:
df = data.count().reset_index()
df.columns = ['Feature', 'Count']
print (df)
Feature Count
0 feature_1 5
1 feature_2 3
Another solution with name
parameter and Series.rename_axis
or with DataFrame.set_axis
:
df = data.count().rename_axis('Feature').reset_index(name='Count')
#alternative
df = data.count().reset_index().set_axis(['Feature', 'Count'], axis=1)
print (df)
Feature Count
0 feature_1 5
1 feature_2 3
CodePudding user response:
This happens because your new dataframe has only one column (the column name is taken as series index, then translated into dataframe index with the func to_frame()
). In order to assign a 2 elements list to df.columns
you have to reset the index first:
df = data.count()
df = df.to_frame().reset_index()
df.columns = ['Feature', 'Count']