I have a very long list. I would like to compute the clearance percentage for violent crime in each record (clearance percentage = cleared cases / actual cases).
dput(crime_data)
list(results = list(list(ori = "NY0010000", data_year = 2008L,
offense = "violent-crime", state_abbr = "NY", cleared = 33L,
actual = 33L, data_range = NULL), list(ori = "NY0010000",
data_year = 2009L, offense = "violent-crime", state_abbr = "NY",
cleared = 22L, actual = 24L, data_range = NULL), list(ori = "NY0010100",
data_year = 2008L, offense = "violent-crime", state_abbr = "NY",
cleared = 333L, actual = 1033L, data_range = NULL), list(
ori = "NY0010100", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 372L, actual = 1007L, data_range = NULL),
list(ori = "NY0010200", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 0L, actual = 61L, data_range = NULL),
list(ori = "NY0010200", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 34L, actual = 51L, data_range = NULL),
list(ori = "NY0010300", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 20L, actual = 32L, data_range = NULL),
list(ori = "NY0010300", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 32L, actual = 40L, data_range = NULL),
list(ori = "NY0012000", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 1L, actual = 4L, data_range = NULL),
list(ori = "NY0012000", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 5L, actual = 8L, data_range = NULL),
list(ori = "NY0012100", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 0L, actual = 0L, data_range = NULL),
list(ori = "NY0012100", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 0L, actual = 0L, data_range = NULL),
list(ori = "NY0012500", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 5L, actual = 10L, data_range = NULL),
list(ori = "NY0012500", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 7L, actual = 9L, data_range = NULL),
list(ori = "NY0015100", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 17L, actual = 20L, data_range = NULL),
list(ori = "NY0015100", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 22L, actual = 30L, data_range = NULL),
list(ori = "NY0015200", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 0L, actual = 24L, data_range = NULL),
list(ori = "NY0015200", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 0L, actual = 19L, data_range = NULL),
list(ori = "NY0015300", data_year = 2008L, offense = "violent-crime",
state_abbr = "NY", cleared = 38L, actual = 54L, data_range = NULL),
list(ori = "NY0015300", data_year = 2009L, offense = "violent-crime",
state_abbr = "NY", cleared = 40L, actual = 61L, data_range = NULL)),
pagination = list(count = 1188L, page = 0L, pages = 60L,
per_page = 20L))
I tried to convert it into a dataframe using the standard methods, but I always get "Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0"
Or if I use "df <- data.frame(matrix(unlist(crime_data), nrow=length(crime_data), byrow=TRUE))" I get this
How can I do this better, or at all? Is it easier to work with the data as a list or to convert it into a data frame?
Answer:
The problem is that crime_data is a list with two parts. All of the crime data is in crime_data[[1]], with crime_data$pagination attached to the end. To get what you want, just extract the first part:
crime.data <- do.call(rbind, crime_data[[1]])
colnames <- unlist(dimnames(crime.data))
cols <- sapply(colnames, function(x) unlist(crime.data[, x]))
crime.df <- as.data.frame(do.call(cbind, cols))
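The first line extracts the crime data from the list and binds the records into a matrix. The next three lines pull out each column and combine the columns into a data frame. Note that cbind builds a character matrix here, so every column of crime.df ends up as character (or factor in R versions before 4.0); convert cleared and actual with as.numeric(as.character(...)) before dividing. There may be an easier way to do this.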
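As a possible alternative, here is a minimal sketch that keeps the numeric columns numeric and then computes the clearance percentage. It assumes every record has the same fields and that data_range is always NULL, as in the dput output above; the NULL field is most likely what triggers the "differing number of rows: 1, 0" error, so it is dropped first. The three-record crime_data below is just a subset of the question's data so the example runs on its own, and pct_cleared is a column name made up for illustration.

```r
# Small subset of the question's data, same structure as the dput output
crime_data <- list(
  results = list(
    list(ori = "NY0010000", data_year = 2008L, offense = "violent-crime",
         state_abbr = "NY", cleared = 33L, actual = 33L, data_range = NULL),
    list(ori = "NY0010000", data_year = 2009L, offense = "violent-crime",
         state_abbr = "NY", cleared = 22L, actual = 24L, data_range = NULL),
    list(ori = "NY0012100", data_year = 2008L, offense = "violent-crime",
         state_abbr = "NY", cleared = 0L, actual = 0L, data_range = NULL)
  ),
  pagination = list(count = 1188L, page = 0L, pages = 60L, per_page = 20L)
)

# Drop NULL fields from each record, then turn the record into a
# one-row data frame; column types (character, integer) are preserved.
rows <- lapply(crime_data$results, function(rec) {
  rec <- rec[!vapply(rec, is.null, logical(1))]
  as.data.frame(rec, stringsAsFactors = FALSE)
})

# Stack the one-row data frames into a single data frame
crime.df <- do.call(rbind, rows)

# Clearance percentage; records with actual == 0 give NaN (0 / 0)
crime.df$pct_cleared <- 100 * crime.df$cleared / crime.df$actual

# The same percentages straight from the list, without a data frame
pct <- vapply(crime_data$results,
              function(rec) 100 * rec$cleared / rec$actual,
              numeric(1))
```

If only the percentages are needed, the vapply call at the end is enough. The data frame is more convenient if you also want to sort, filter, or group by ori or data_year afterwards. Records with zero actual cases come out as NaN, which you may want to replace with NA or filter out.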