Returning Value Frequency from Multiple Columns in Pandas Dataframe

I'm working with a pandas dataframe that has several columns populated with values from the same group, similar to this:

Name	First Car	Second Car	Third Car	Fourth Car
Tom	VW	Ford	Honda	Audi
Tim	BMW	Honda	Audi	Ford
Sam	Audi	Honda	Honda	Audi
Bill	Ford	Ford	null	Audi
Mark	VW	Ford	Honda	null

and I need to turn it into this:

Make	First Car	Second Car	Third Car	Fourth Car
VW	2	0	0	0
Ford	1	3	0	1
Honda	0	2	3	0
Audi	1	0	1	3
BMW	1	0	0	0

It seems like this might be possible with a multi-column groupby or with crosstab, but I can't quite figure out how. Is there a pandas idiom that does this without looping through each column? (I'm just getting started with pandas.)

Some further context in case it impacts the solution - once I have the information restructured I need to plot it as a stacked bar chart with matplotlib so I can save the visual programmatically using matplotlib's savefig() function.

CodePudding user response:

Select the columns you want and then apply value_counts to each of them, e.g.:

df.filter(regex='Car$').apply(pd.Series.value_counts)

This'll give you:

       First Car  Second Car  Third Car  Fourth Car
Audi         1.0         NaN        1.0         3.0
BMW          1.0         NaN        NaN         NaN
Ford         1.0         3.0        NaN         1.0
Honda        NaN         2.0        3.0         NaN
VW           2.0         NaN        NaN         NaN
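
The NaN entries above are just makes that never appear in a given column, so to match the layout asked for in the question you would probably want to fill them with 0 and cast back to integers. Here is a minimal sketch of that, plus the stacked bar chart mentioned in the question. It rebuilds the sample data by hand and assumes the "null" cells are real missing values (NaN); the output filename car_counts.png is only a placeholder.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script can run without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data from the question; "null" is assumed to mean a missing value (NaN).
df = pd.DataFrame({
    "Name": ["Tom", "Tim", "Sam", "Bill", "Mark"],
    "First Car": ["VW", "BMW", "Audi", "Ford", "VW"],
    "Second Car": ["Ford", "Honda", "Honda", "Ford", "Ford"],
    "Third Car": ["Honda", "Audi", "Honda", np.nan, "Honda"],
    "Fourth Car": ["Audi", "Ford", "Audi", "Audi", np.nan],
})

# Count each make per column, then replace "never seen" (NaN) with 0.
counts = (
    df.filter(regex="Car$")
      .apply(pd.Series.value_counts)
      .fillna(0)
      .astype(int)
)
counts.index.name = "Make"
print(counts)

# One bar per make, stacked by car position; saved to a placeholder filename.
ax = counts.plot(kind="bar", stacked=True)
ax.set_ylabel("Count")
ax.figure.tight_layout()
ax.figure.savefig("car_counts.png")
plt.close(ax.figure)
```

If you would rather have one bar per column (First Car, Second Car, ...) stacked by make, plotting counts.T instead should give that orientation.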