ggplot multiple columns (some columns with lots of NaN) with line and points in one plot in r

Time:10-07

This is what my data looks like:

dput(head(dfndvi,40))
structure(list(Date = structure(c(16831, 16832, 16833, 16834, 
16835, 16836, 16837, 16838, 16839, 16840, 16841, 16842, 16843, 
16844, 16845, 16846, 16847, 16848, 16849, 16850, 16851, 16852, 
16853, 16854, 16855, 16856, 16857, 16858, 16859, 16860, 16861, 
16862, 16863, 16864, 16865, 16866, 16867, 16868, 16869, 16870
), class = "Date"), meanNDVIU = c(0.502544405962944, 0.466256152892111, 
0.485749936586605, 0.484970598038591, 0.507275825345815, 0.496315118413844, 
NaN, 0.491017758477861, 0.476710534631678, 0.483511980029393, 
NaN, 0.567598408120424, NaN, NaN, NaN, NaN, 0.558695594290863, 
NaN, 0.553513878235882, 0.561027303167483, 0.542377933374519, 
0.525247093862379, 0.525165485789629, 0.494407000216638, NaN, 
0.507730697700615, NaN, NaN, 0.518554179974028, 0.520132928126137, 
0.510478184465593, 0.498813919672763, 0.489482807510878, 0.486741556185618, 
0.54348112575644, 0.492114023452322, 0.487480778562881, 0.481185556976138, 
0.496910310308397, 0.491002252041978), meanNDVIN = c(0.624320155056735, 
0.606096869013536, 0.613930151747554, 0.632728796482024, 0.656657565582441, 
0.648144161096337, NaN, 0.657236231251206, 0.652550004606334, 
0.669463867745522, NaN, 0.682480792008571, NaN, NaN, NaN, NaN, 
0.607992356542685, NaN, 0.610810820063882, 0.592069819101036, 
0.586041169840559, 0.569066342968534, 0.569951894432553, 0.592324925517102, 
NaN, 0.613678505916875, NaN, NaN, 0.583768302989277, 0.580797932718667, 
0.576118007293509, 0.569094248031576, 0.574556169304909, 0.610140182938907, 
0.628893600721365, 0.583113291980683, 0.593022220290993, 0.575916546870008, 
0.588553885295149, 0.58277978069801), meanNDVIh = c(NaN, NaN, 
NaN, 0.632082969388472, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 
NaN, NaN, 0.51861586348004, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 
NaN, NaN, NaN, NaN, NaN, 0.502459, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 
NaN, NaN), NDVIL = c(NaN, NaN, NaN, NaN, NaN, 0.677403074, NaN, 
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 
NaN, NaN, 0.576714352, NaN, NaN, NaN, NaN, NaN, NaN, 0.634330472, 
0.671945979, NaN, NaN, NaN, NaN, NaN, NaN, 0.657019867, NaN, 
NaN), NDVIM = c(0.657244197, 0.657075217, 0.660955735, 0.660264045, 
0.666078633, 0.665633253, 0.665328286, 0.656959658, 0.653123241, 
0.676091122, 0.682183486, 0.671482632, 0.669793144, 0.673095324, 
0.661495599, 0.660252492, 0.658850497, 0.657205194, 0.655492781, 
0.65396421, 0.652262689, 0.651273521, 0.648992533, 0.64265213, 
0.638398763, 0.635521954, 0.633858586, 0.62670145, 0.625618812, 
0.617843491, 0.635370809, 0.634577687, 0.630754469, 0.635186296, 
0.629314421, 0.625436647, 0.623103318, 0.621268182, 0.621354269, 
0.619649314)), row.names = 188:227, class = "data.frame")

Note: I manually added two more numeric values to meanNDVIh.

I want to plot several columns in one plot with ggplot2, always with x = Date. Each column should be shown with both points (geom_point) and a line (geom_line). The problem is that for a column such as meanNDVIh, which contains many NaN values, only the points are drawn and not the lines. I would also like to connect the points across the NaN gaps to see the trend.

This is what I've tried so far:

First, I tried basic plotting with ggplot2:

ggplot(NDVI_sat_h_COR, aes(x = Date)) +
  geom_line(aes(y = meanNDVIU), colour = "blue") +
  geom_point(aes(y = meanNDVIU), colour = "blue") +
  geom_line(aes(y = meanNDVIh), colour = "red") +
  geom_point(aes(y = meanNDVIh), colour = "red")

Problem: numeric points that have NaN values between them are not connected.

Then I tried:

ggplot(NDVI_sat_h_COR, aes(Date, meanNDVIh)) +
  geom_point() +
  geom_line(data = NDVI_sat_h_COR[!is.nan(NDVI_sat_h_COR$meanNDVIh), ]) +
  scale_y_continuous(name = "NDVI", limits = c(0.2, 0.8))

ggplot(NDVI_sat_h_COR, aes(Date, meanNDVIU)) +
  geom_point() +
  geom_line(data = NDVI_sat_h_COR[!is.nan(NDVI_sat_h_COR$meanNDVIU), ]) +
  scale_y_continuous(name = "NDVI", limits = c(0.2, 0.8))

This solved the connection problem, but now I can't plot the columns together.

Sorry for the long question. Any help will be much appreciated.

CodePudding user response:

I am not entirely sure this is what you want, but I gave it a shot:

The key to making plotting easier is to bring the data into a tidy structure (one observation per row). Therefore, I selected only the columns actually used in your example, made the data tidy with pivot_longer(), and removed the rows containing NaN (they wouldn't be plotted anyway, and removing them lets geom_line() connect the remaining points).
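To make the reshaping step concrete, here is a minimal sketch on a small toy data frame. The name `toy` and its values are made up for illustration and are not taken from the question's data; only the column names mirror it.

```r
library(tidyr)

# Toy data (hypothetical values) with the same layout as the question's data frame
toy <- data.frame(
  Date      = as.Date("2016-02-01") + 0:3,
  meanNDVIU = c(0.50, NaN, 0.49, 0.48),
  meanNDVIh = c(NaN, 0.63, NaN, 0.52)
)

# Wide -> long: one row per (Date, column) pair.
# By default the new columns are called "name" and "value".
long <- pivot_longer(toy, cols = c("meanNDVIU", "meanNDVIh"))
nrow(long)           # 4 dates x 2 columns = 8 rows

# drop_na() removes rows with missing values; NaN counts as missing
long_complete <- drop_na(long)
nrow(long_complete)  # 5 rows: 3 from meanNDVIU, 2 from meanNDVIh
```

After this step each series contains only its observed values, so a line drawn per series joins consecutive observations directly instead of breaking at the gaps.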

Once the data is tidy, you can specify all aesthetics in the ggplot() call and don't need to repeat the geoms for each column.

library(tidyverse)  # provides %>%, select(), pivot_longer(), drop_na() and ggplot()

NDVI_sat_h_COR %>%
  select(Date, meanNDVIU, meanNDVIh) %>%
  pivot_longer(cols = c("meanNDVIU", "meanNDVIh")) %>%
  drop_na() %>%
  ggplot(aes(x = Date, y = value, group = name, color = name)) +
  geom_line() +
  geom_point()
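If more than two columns are needed, the same idea can probably be extended by pivoting every column except Date. The following is only a sketch under assumptions: the data frame `toy` and its values are hypothetical stand-ins, and the names `series` and `NDVI` are chosen here for readability. The y-axis limits are the ones used in the question.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for the question's data frame
toy <- data.frame(
  Date      = as.Date("2016-02-01") + 0:5,
  meanNDVIU = c(0.50, 0.47, NaN, 0.49, 0.48, 0.48),
  meanNDVIh = c(NaN, 0.63, NaN, NaN, 0.52, NaN),
  NDVIL     = c(NaN, NaN, 0.68, NaN, NaN, 0.58)
)

# Pivot all columns except Date, then drop the rows whose value is NaN
long <- drop_na(
  pivot_longer(toy, cols = -Date, names_to = "series", values_to = "NDVI")
)

# Mapping colour to the series name also groups the data,
# so each series gets its own line and points
p <- ggplot(long, aes(x = Date, y = NDVI, colour = series)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(name = "NDVI", limits = c(0.2, 0.8))

print(p)
```

Because the NaN rows are removed before plotting, sparse series such as meanNDVIh are drawn as straight segments between their few observed dates, which is a visual aid for the trend rather than an interpolation of the missing values.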
