Home > Blockchain >  Pivot_wider in tidyr creates list cols even when there is no duplicate or missing data
Pivot_wider in tidyr creates list cols even when there is no duplicate or missing data

Time:12-31

This is my code:


# reading input file
library(readxl)
df_testing <- read_excel("Testing_Data.xlsx")


# Renaming the 1st column name for ease of use
colnames(df_testing)[1] = "Tag_No"
View(df_testing)

# creating a new data frame with columns from the row values
library(tidyr)
df_output = pivot_wider(df_testing, names_from = Tag_No, values_from = Reading)
# the below output is as expected, yet coming in list cols
View(df_output)

# this below code is an attempt to fix but replaces last row values with NA
# df_output = lapply(df_output, unlist)
# df_output = data.frame(lapply(df_output, `length<-`, max(lengths(df_output))))

# level count should be equal to no of columns created
length(levels(df_testing$Tag_No)) == ncol(df_output) - 3

# save output to the file. Since, it is in list cols, I can't save the data to the file
write.csv(df_output, file = "Output File.csv")

This is the input data file link 1

This is the sample of expected output data file link 2

Any changes for the code to work correctly without loosing data or a complete solution is welcome. Thanks in advance. If I misunderstood the concept of pivot_wider usage, kindly give some tips.

CodePudding user response:

The issue is because of NA values. There is around 59 rows with NA in them.

library(readxl)
library(tidyr)

df_testing <- read_excel("Testing_Data.xlsx")
df_testing %>% filter(is.na(`Tag No.`))

# A tibble: 59 x 4
#   `Tag No.` Reading Date                Time               
#   <chr>       <dbl> <dttm>              <dttm>             
# 1 NA             NA NA                  NA                 
# 2 NA             NA NA                  NA                 
# 3 NA             NA NA                  NA                 
# 4 NA             NA NA                  NA                 
# 5 NA             NA NA                  NA                 
# 6 NA             NA NA                  NA                 
# 7 NA             NA NA                  NA                 
# 8 NA             NA NA                  NA                 
# 9 NA             NA NA                  NA                 
#10 NA             NA NA                  NA                 
# … with 49 more rows

Dropping the NA rows doesn't give list columns.

df_output <- pivot_wider(na.omit(df_testing), names_from = `Tag No.`, values_from = Reading)
df_output
  • Related