Combining 2 columns in R

Time:12-24

I have a dataset which has 2 columns which I need to combine:

butterfly <- read.csv("Butterfly_Data_All.csv")

Red_Admiral_data <- butterfly[butterfly$Species == 'Red Admiral',]

pop_RA <- Red_Admiral_data$SINDEX  # SINDEX is the population index

summer_RA <- Red_Admiral_data$Average_Temp_May_June_July

winter_RA <- Red_Admiral_data$Average_Temp_Nov_Dec_Jan

summer_RA consists of the temperatures over 3 months in the summer, per observation of the butterfly species 'Red Admiral':

13.39 12.24 12.32 12.11 12.41 12.21 12.28 11.83 11.88 11.73 11.99
11.75 14.91 14.83 14.43 14.91 13.46 14.99 14.56 15.04 13.70 11.10
16.04 14.34 15.02 14.30 15.17 14.55 12.82 14.34 13.32 15.32 13.97
14.64 10.27 15.26 14.94 14.22 14.82 14.82 15.15 14.88 14.77 12.64

and winter_RA consists of the temperature over 3 months in the winter.

5.25  5.33  5.31  5.00  5.34  5.24  4.70  7.04  7.03  6.72  7.06
6.30  5.29  5.24  5.82  5.22  5.76  6.56  5.08  5.33  7.15  4.67
5.77  6.58  4.84  6.80  5.06  5.14  6.49  5.80  6.86  5.20  5.54
4.85  3.27  6.29  5.32  5.47  4.78  4.78  5.19  5.05  5.12  4.89
5.25  5.33  5.31  5.00  5.34  5.24  4.70  7.04  7.03  6.72  7.06

The dataset is huge (21540 entries omitted), and when I use the merge() function, even with increased memory, it crashes the software:

merge(summer_RA,winter_RA)

I want to plot the population index for this species (pop_RA) against the summer and winter temperatures combined, but so far I can only create a separate plot for each. I hope this makes sense.

dput(summer_RA) output is:

13.49, 12.25, 13.41, 13.18, 13.67, 13.12, 13.72, 13.16, 13.53, 14.01, 13.02, 9.91, 11.51, 13.05, 12.9, 12.32, 13.3, 13.3, 13.03, 13.03, 9.75, 13.13, 13.23, 13.03, 13.45, 10.52, 10.52, 13.64, 
8.26, 12.88, 13.79, 12.46, 9.31, 13.3, 13.98, 13.11, 13.3, 12.85

dput(winter_RA) output is:

4.67, 4.16, 4.51, 4.55, 4.68, 5.13, 4.3, 4.39, 4.16, 5.02, 4.29, 4.17, 4.61, 5.18, 6.15, 6.15, 4.34, 6.15, 5.39, 5.39, 6.01, 5.18, 4.78, 5.39, 4.39, 5.02, 5.02, 4.48, 5.04, 4.13, 3.73, 3.16, 3.36, 4.13, 4.13, 4.13, 4.13, 3.29, 3.39, 3.29, 3.79, 3.79, 3.79, 4.43

pop_RA sample data:

 10   1   2   0   5   0   2   0   0   0   4   0  31   1  27  22 17   3   2  21  33  17  11  21   1  20   4   8  11  13  53   6  51   3  41  43  40   7   7   0   0   8  11  15  22   9   1   0  33   4   0   5   3  15   5   0   1   6   0   0   1   6   1   1

CodePudding user response:

I think there is a simple solution for this:

Let's say you want a "season" data frame combining summer and winter. You can use data.frame() to combine the two vectors, which works when both have the same length, like this:

season <- data.frame(summer = summer_RA, winter = winter_RA)
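
To see why merge() struggles here while data.frame() does not, a minimal sketch with made-up toy vectors (not the real data) may help. Given two plain vectors, merge() has no shared column to join on, so it returns every pairing of the two inputs. With roughly 21,540 values in each vector, that would be on the order of 460 million rows, which is a likely reason for the crash.

```r
# Toy vectors standing in for summer_RA and winter_RA (values are made up)
summer <- c(13.4, 12.2, 12.3)
winter <- c(5.3, 5.3, 5.0)

# merge() finds no common column, so it builds every combination (a cross join)
crossed <- merge(summer, winter)
nrow(crossed)  # length(summer) * length(winter)

# data.frame() pairs the values position by position instead
season <- data.frame(summer = summer, winter = winter)
nrow(season)   # same as length(summer)
```

The row-by-row pairing only makes sense if the i-th summer value and the i-th winter value belong to the same observation, which is the case when both come from the same rows of Red_Admiral_data.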

CodePudding user response:

I'm going to guess that by "combine" you mean "plot together". It's just a guess.

I cannot use the sample data you provided because the vectors all have different lengths (which does not work for a data.frame), so I'll generate random data in the same "shape".

Red_Admiral_data <- data.frame(pop_RA = c(10,1,2,0,5,0,2,0,0,0,4,0,31,1,27,22,17,3,2,21,33,17,11,21,1,20,4,8,11,13,53,6,51,3,41,43,40,7,7,0,0,8,11,15,22,9,1,0,33,4,0,5,3,15,5,0,1,6,0,0,1,6,1,1))
nrow(Red_Admiral_data)
# [1] 64
set.seed(42)
Red_Admiral_data$summer_RA <- rnorm(64, 12.6, 1.39)
Red_Admiral_data$winter_RA <- rnorm(64, 4.53, 0.78)
head(Red_Admiral_data)
#   pop_RA summer_RA winter_RA
# 1     10  14.50563  3.962712
# 2      1  11.81507  5.545983
# 3      2  13.10475  4.791962
# 4      0  13.47968  5.340035
# 5      5  13.16193  5.248168
# 6      0  12.45249  5.092285

base R

plot(summer_RA ~ pop_RA, data = Red_Admiral_data, type = "p", pch = 16, col = "blue", ylim = c(0, 16))
points(winter_RA ~ pop_RA, data = Red_Admiral_data, pch = 16, col = "red")
legend("bottomright", c("summer_RA", "winter_RA"), col = c("blue", "red"), pch = 16, bg = "white")

base graphics scatterplot

ggplot2

ggplot2 really prefers data in a long format, so we'll first reshape. (Reshaping/pivoting is not always trivial; look for other questions related to using the reshape2 package or tidyr::pivot_*.)

library(ggplot2)
ggplot(reshape2::melt(Red_Admiral_data, "pop_RA", variable.name = "season"), 
       aes(pop_RA, value)) +
  geom_point(aes(color = season))

ggplot2 scatterplot, same data
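
For readers unfamiliar with the term, here is a rough base R sketch of what the long format looks like, built by hand on a small made-up data frame. The names `wide` and `long` and the values are illustrative assumptions, not part of the original data. reshape2::melt() as used above should give an equivalent layout: one row per combination of pop_RA and season.

```r
# Small made-up data frame in the same shape as Red_Admiral_data
wide <- data.frame(
  pop_RA    = c(10, 1, 2),
  summer_RA = c(14.5, 11.8, 13.1),
  winter_RA = c(4.0, 5.5, 4.8)
)

# Stack the two temperature columns into one "value" column and
# record which column each value came from in "season"
long <- data.frame(
  pop_RA = rep(wide$pop_RA, times = 2),
  season = rep(c("summer_RA", "winter_RA"), each = nrow(wide)),
  value  = c(wide$summer_RA, wide$winter_RA)
)

long
```

Each observation now appears twice, once per season, which is what lets a single geom_point() layer color the points by season.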
