Home > Blockchain >  How to auto create a new column after value_counts in Panda?
How to auto create a new column after value_counts in Panda?

Time:10-26

This is the data

Price Rating Food Rating Service Rating
1 1 2
4 5 5
3 4 3
5 5 1
1 1 5

The ratings only ranges from 1-5. Is there a way to do a value count of the ratings table and auto create a new column to group the counts?

Adding a new column manually after the value_counts is a possibility but there has to be a more graceful way?

In other words, how to change the table above to the table below?:

Price Rating Count Food Rating Count Service Rating Count Rating
2 2 1 1
0 0 1 2
1 0 1 3
1 1 0 4
1 2 2 5

CodePudding user response:

melt then crosstab

s = df.melt()
out = pd.crosstab(s.value,s.variable)
out
Out[228]: 
variable  Food Rating  Price Rating  Service Rating
value                                              
1                   2             2               1
2                   0             0               1
3                   0             1               1
4                   1             1               0
5                   2             1               2

CodePudding user response:

Beny's answer is great, and using SQLite can be another option, basically, the union all part is melt, the case when part is crosstab`:

    df2 = pd.DataFrame(
        {
            "Price Rating": [1, 4, 3, 5, 1],
            "Food Rating": [1, 4, 4, 5, 1],
            "Service Rating": [2, 5, 3, 5, 1]
        }
    )
    con = sqlite3.connect('data.sqlite')
    cur = con.cursor()
    df2.to_sql("rating", con, if_exists="replace")
    con.execute('''create table result as SELECT count(CASE WHEN variable = \'Price Rating\' THEN variable END) "Price Rating Count",
       count(CASE WHEN variable = \'Food Rating\' THEN variable END) "Food Rating Count",
       count(CASE WHEN variable = \'Service Rating\' THEN variable END) "Service Rating Count",
       val "Rating"
  FROM (
           SELECT \'Price Rating\' AS variable,
                  "Price Rating" AS val
             FROM rating
           UNION ALL
           SELECT \'Food Rating\' AS variable,
                  "Food Rating" AS val
             FROM rating
           UNION ALL
           SELECT \'Service Rating\' AS variable,
                  "Service Rating" AS val
             FROM rating
       )
 GROUP BY val;''')
    output_df = pd.read_sql_query("SELECT * from result", con)
    print (output_df)
    con.commit()
    con.close()
   Price Rating Count  Food Rating Count  Service Rating Count  Rating
0                   2                  2                     1       1
1                   0                  0                     1       2
2                   1                  0                     1       3
3                   1                  2                     0       4
4                   1                  1                     2       5
  • Related