How to convert a long lat data dataframe to a xyz dataframe using R?-CodePudding

I want to convert the data frame below (dataset) into a xyz format, getting the longitude (x variable, the first 36 values in each column of the first row), latitude (y variable, the first 30 values in each row of the first column) and the current value of each x-y ordered pair (z variable). Is it possible to do it using r?

The data comprises spatially distributed rainfall values (most values are equal or close to zero). Longitude ranges from -47.8 to -36.2, and latitude ranges from -21.2 to -7.2.

> dataset <- read.csv("rainfall_data.csv", header = FALSE, sep = ",")
> headTail(dataset)

    V1    V2    V3  V4    V5    V6    V7    V8  V9   V10   V11   V12   V13 V14   V15   V16   V17   V18 V19   V20
1    <NA> -47.8 -47.4 -47 -46.6 -46.2 -45.8 -45.4 -45 -44.6 -44.2 -43.8 -43.4 -43 -42.6 -42.2 -41.8 -41.4 -41 -40.6
2   -21.2     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
3   -20.8     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
4   -20.4     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
...   ...   ...   ... ...   ...   ...   ...   ... ...   ...   ...   ...   ... ...   ...   ...   ...   ... ...   ...
34   -8.4     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
35     -8     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
36   -7.6     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
37   -7.2     0     0   0     0     0     0     0   0     0     0     0     0   0     0     0     0     0   0     0
      V21   V22   V23 V24   V25   V26   V27   V28 V29   V30   V31
1   -40.2 -39.8 -39.4 -39 -38.6 -38.2 -37.8 -37.4 -37 -36.6 -36.2
2       0     0     0   0     0     0     0     0   0     0     0
3       0     0     0   0     0     0     0     0   0     0     0
4       0     0     0   0     0     0     0     0   0     0     0
...   ...   ...   ... ...   ...   ...   ...   ... ...   ...   ...
34      0     0     0   0     0     0     0     0   0     0     0
35      0     0     0   0     0     0     0     0   0     0     0
36      0     0     0   0     0     0     0     0   0     0     0
37      0     0     0   0     0     0     0     0   0     0     0

Possible outcome:

     x     y     z
1    -47.8 -21.1 0
2    -47.4 -21.1 0
3    -47.0 -21.1 0
4    -46.6 -21.1 0
...   ...  ...  ...
1077 -37.4 -7.2  0
1078 -37.0 -7.2  0
1079 -36.6 -7.2  0
1080 -36.2 -7.2  0

Data are available here: https://drive.google.com/file/d/1IQqoV1WsrEXQoorZmuE4DEA8wzJkI3GL/view?usp=sharing

CodePudding user response：

One option would be to convert the V1 to row names and row 1 to the column names. Then, I convert to matrix so that I could convert to table, then convert into a data frame.

library(tidyverse)
library(janitor)

results <- df %>%
  column_to_rownames("V1") %>%
  row_to_names(1) %>%
  as.matrix() %>%
  as.table() %>%
  as.data.frame()

Or in base R:

row.names(df) <- df[,1]
df <- df[,-1]
colnames(df) <- df[1,]
df <- df[-1,]

as.data.frame(as.table(as.matrix(df)))

Output

 head(results)

   Var1  Var2 Freq
1 -21.2 -47.8    0
2 -20.8 -47.8    0
3 -20.4 -47.8    0
4 -21.2 -47.4    0
5 -20.8 -47.4    0
6 -20.4 -47.4    0

Data

df <- structure(list(V1 = c("<NA>", "-21.2", "-20.8", "-20.4"), V2 = c(-47.8, 
0, 0, 0), V3 = c(-47.4, 0, 0, 0), V4 = c(-47L, 0L, 0L, 0L), V5 = c(-46.6, 
0, 0, 0), V6 = c(-46.2, 0, 0, 0), V7 = c(-45.8, 0, 0, 0), V8 = c(-45.4, 
0, 0, 0), V9 = c(-45L, 0L, 0L, 0L), V10 = c(-44.6, 0, 0, 0), 
    V11 = c(-44.2, 0, 0, 0), V12 = c(-43.8, 0, 0, 0), V13 = c(-43.4, 
    0, 0, 0), V14 = c(-43L, 0L, 0L, 0L), V15 = c(-42.6, 0, 0, 
    0), V16 = c(-42.2, 0, 0, 0), V17 = c(-41.8, 0, 0, 0), V18 = c(-41.4, 
    0, 0, 0), V19 = c(-41L, 0L, 0L, 0L), V20 = c(-40.6, 0, 0, 
    0)), class = "data.frame", row.names = c(NA, -4L))

CodePudding user response：

Please check this.

rainfall_file <- "path/to/csv/file"
dataset <- read.csv(rainfall_file)
library(tidyverse)
results <- dataset %>%
    pivot_longer(cols = !X,names_to = "Variable", values_to = "Count", names_prefix = "X.", names_transform = list(Variable = as.double), values_drop_na = TRUE)