How to use boxplot to plot data like KEY / VALUE / COUNT-OBSERVATIONS?

Time:02-01

I've got a large file that looks like this:

SAMPLE1 10
SAMPLE1 10
SAMPLE1 10
SAMPLE1 2
SAMPLE2 10
SAMPLE2 10
SAMPLE2 2
SAMPLE2 2

The file is huge (several gigabytes), and R is killed when I try to read it and then use boxplot. So my idea is to run sort | uniq -c on the file and work with a much smaller file that would look like this (with a third column containing the number of observations):

SAMPLE1 10 3
SAMPLE1 2 1
SAMPLE2 10 2
SAMPLE2 2 2

Is there a way to use base R's boxplot to plot such data?

CodePudding user response:

The ENmisc package has a function wtd.boxplot for weighted boxplots: https://www.rdocumentation.org/packages/ENmisc/versions/1.2-7/topics/wtd.boxplot

Alternatively, calculate the weighted quartiles and then draw the boxplot using those values.
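
A minimal sketch of that second option, using the counts table from the question. The helper name wtd_quantile is made up for illustration, and it uses a simple inverse-CDF rule (the smallest value whose cumulative count reaches p times the total), so the quartiles can differ slightly from the Tukey hinges that boxplot computes on raw data. Whiskers here run to the min and max, and no outliers are drawn.

```r
# Counts table from the question: key, value, number of observations.
counts <- read.table(text = "
SAMPLE1 10 3
SAMPLE1 2 1
SAMPLE2 10 2
SAMPLE2 2 2
", col.names = c("key", "value", "n"))

# Hypothetical helper: weighted quantile via cumulative counts,
# so the raw observations never have to be expanded in memory.
wtd_quantile <- function(value, n, probs) {
  ord <- order(value)
  value <- value[ord]
  cum <- cumsum(n[ord])
  sapply(probs, function(p) value[which(cum >= p * sum(n))[1]])
}

# One column per sample: min, lower quartile, median, upper quartile, max.
groups <- split(counts, counts$key)
stats <- sapply(groups, function(g)
  wtd_quantile(g$value, g$n, c(0, 0.25, 0.5, 0.75, 1)))

# bxp() draws boxes from precomputed statistics (it is what boxplot() calls internally).
bxp(list(stats = stats,
         n     = sapply(groups, function(g) sum(g$n)),
         names = colnames(stats),
         out   = numeric(0),
         group = numeric(0)))
```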

CodePudding user response:

We can pre-compute five numbers per sample (min, lower quartile, median, upper quartile, max) in bash. The result is small enough to import into R, where we can draw the boxplot from the summary data.
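
The R side of this could look roughly like the sketch below. The summary layout (key, count, then the five numbers) is an assumption about what the bash step would write, and the values shown are only illustrative numbers for the small sample in the question.

```r
# Assumed layout of the precomputed summary: key, n, min, low, mid, upper, max.
summ <- read.table(text = "
SAMPLE1 4 2 2 10 10 10
SAMPLE2 4 2 2 2 10 10
", col.names = c("key", "n", "min", "low", "mid", "upper", "max"))

# bxp() wants a matrix with one column per box and rows ordered min .. max.
stats <- t(as.matrix(summ[, c("min", "low", "mid", "upper", "max")]))

bxp(list(stats = stats,
         n     = summ$n,
         names = as.character(summ$key),
         out   = numeric(0),   # no outliers available from a five-number summary
         group = numeric(0)))
```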
