Make a grouped boxplot with the recurring/grouped rows in a dataframe in R

Time:11-04

I would like to make a grouped boxplot like the one in "Grouped Boxplot GGPLOT2", using a data frame with three columns: an ID, an issuer_name and a risk score for every transaction.

My starting data frame looks like this:

ID issuer_name risk_score
1 POSTEPAY 0
2 BANCA SELLA 10
3 BANCA SELLA 10
4 BANCA SELLA 10
5 BANCA SELLA 33
6 POSTEPAY 0
7 POSTEPAY 0
8 POSTEPAY 10
9 CASSA CENTRALE BANCA - CREDITO 0
10 BANCA NAZIONALE DEL LAVORO SP 0
11 BANCO BPM SOCIETA PER AZIONI 0
12 NEXI PAYMENTS 10
13 UNICREDIT 10
14 CREDITO EMILIANO 0
15 POSTEPAY 0
16 POSTEPAY 10
17 POSTEPAY 0
18 N26 GMBH 10
19 BANCO BPM SOCIETA PER AZIONI 25
20 NEXI PAYMENTS 10
21 POSTEPAY 10
22 INTESA SANPAOLO 0
23 CREDIT AGRICOLE CARIPARMA 10
24 CREDIT AGRICOLE CARIPARMA 10
25 FINECOBANK 0
26 UNICREDIT 44
27 BANCA MONTE DEI PASCHI DI SIENA 0
28 UNICREDIT 0
29 FINECOBANK 25
30 CASSA CENTRALE BANCA - CREDITO 10
31 NEXI PAYMENTS 10
32 INTESA SANPAOLO 25
33 INTESA SANPAOLO 0
34 POSTEPAY 10
35 INTESA SANPAOLO 10
36 INTESA SANPAOLO 20
37 POSTEPAY 10
38 INTESA SANPAOLO 10
39 POSTEPAY 0
40 POSTEPAY 0
41 UNICREDIT 10
42 POSTEPAY 10
43 BIBANCA 0
44 POSTEPAY 0
45 INTESA SANPAOLO 10
46 FINDOMESTIC BANCA 10
47 POSTEPAY 0
48 POSTEPAY 10

I would like to group by issuer_name, make a boxplot of the risk score for each issuer, and show them all in a single chart.
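To make the grouping concrete, here is a minimal sketch of the per-issuer numbers a boxplot is built from (count, quartiles, median). It assumes dplyr is installed and uses only a dozen rows copied from the table above, not the full data set; the name risk_summary is made up for illustration.

```r
library(dplyr)

# A few rows copied from the sample data above (subset, for illustration only).
df_risk_score <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6, 7, 8, 13, 26, 28, 41),
  issuer_name = c("POSTEPAY", "BANCA SELLA", "BANCA SELLA", "BANCA SELLA",
                  "BANCA SELLA", "POSTEPAY", "POSTEPAY", "POSTEPAY",
                  "UNICREDIT", "UNICREDIT", "UNICREDIT", "UNICREDIT"),
  risk_score = c(0, 10, 10, 10, 33, 0, 0, 10, 10, 44, 0, 10)
)

# One row per issuer: group first, then summarise the risk score.
risk_summary <- df_risk_score %>%
  group_by(issuer_name) %>%
  summarise(
    n      = n(),
    q1     = unname(quantile(risk_score, 0.25)),
    median = median(risk_score),
    q3     = unname(quantile(risk_score, 0.75))
  )

print(risk_summary)
```

Issuers with only one or two transactions will give degenerate boxes, so a table like this can help decide which issuers are worth plotting.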

The manual process I have used so far is the following:

library(dplyr)
library(ggplot2)
df_risk_score_single <- df_risk_score %>% filter(issuer_name == "POSTEPAY")
ggplot(data = df_risk_score_single, aes(x = issuer_name, y = risk_score)) + geom_boxplot()

which returns a single boxplot.

Any suggestions?

CodePudding user response:

I think you are looking for something like the following snippet, which draws one boxplot per issuer in a single chart.

I used the sample data from your question, with one value edited. Of course, you will have to adapt this code to your requirements and run it on your complete data.

library(ggplot2)
# One box per issuer; the x-axis labels are hidden and the legend identifies each issuer.
ggplot(data = df_risk_score, aes(x = issuer_name, y = risk_score, group = issuer_name, fill = issuer_name)) +
       geom_boxplot() +
       theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
       xlab("Issuer") +
       ylab("Risk score") +
       guides(fill = guide_legend(title = "Legend Title: Issuer"))
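As a possible variation, here is a self-contained sketch that keeps the issuer names on the axis instead of in a legend, which may be easier to read when there are many issuers. It assumes ggplot2 is installed and uses only a few rows copied from the question, so the boxes will not match the full data.

```r
library(ggplot2)

# A few rows copied from the question's sample data (subset, for illustration only).
df_risk_score <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6, 7, 8, 13, 26, 28, 41),
  issuer_name = c("POSTEPAY", "BANCA SELLA", "BANCA SELLA", "BANCA SELLA",
                  "BANCA SELLA", "POSTEPAY", "POSTEPAY", "POSTEPAY",
                  "UNICREDIT", "UNICREDIT", "UNICREDIT", "UNICREDIT"),
  risk_score = c(0, 10, 10, 10, 33, 0, 0, 10, 10, 44, 0, 10)
)

# Mapping issuer_name to x gives one box per issuer without any filtering.
p <- ggplot(df_risk_score, aes(x = issuer_name, y = risk_score, fill = issuer_name)) +
  geom_boxplot() +
  coord_flip() +                          # horizontal boxes keep long bank names readable
  labs(x = "Issuer", y = "Risk score") +
  theme(legend.position = "none")         # the axis labels already name each issuer

print(p)
```

Flipping the coordinates is a design choice, not a requirement; dropping coord_flip() gives the usual vertical boxes.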
