Split vectors in each row in a dataframe in half

Time:10-31

I have a dataframe in R with one column. Each row of that column holds a vector with a varying number of coordinates, as shown below:

       Column
1      c(66.3010025028633, 66.5439987180665, 86.6060028079833, 86.6549987795701)
2      c(66.5439987180665, 62.463001250907, 61.5060005190088, 58.0180015560271, 
         54.5610008237291, 50.770000457655, 47.522998810147, 46.0629997251572, 
         86.6549987795701, 86.7549972532207, 86.8050003054198, 86.7870025636175, 
         86.7460021972437, 86.7060012816339, 86.6409988399246, 86.597999572922)
3      c(46.0629997251572, 46.313999176321, 86.597999572922, 86.5609970096044)
4      c(70.0894851683059, 66.3010025028633, 86.4039611818631, 86.6060028079833)

Exact dataframe data is:

dput(BoundaryCoordinates2[1:4,])
list(c(66.3010025028633, 66.5439987180665, 86.6060028079833, 86.6549987795701),
     c(66.5439987180665, 62.463001250907, 61.5060005190088, 58.0180015560271,
       54.5610008237291, 50.770000457655, 47.522998810147, 46.0629997251572,
       86.6549987795701, 86.7549972532207, 86.8050003054198, 86.7870025636175,
       86.7460021972437, 86.7060012816339, 86.6409988399246, 86.597999572922),
     c(46.0629997251572, 46.313999176321, 86.597999572922, 86.5609970096044),
     c(70.0894851683059, 66.3010025028633, 86.4039611818631, 86.6060028079833))

My end goal is to match up the coordinates to create location points.

Unfortunately, each vector lists coordinates by all longitude, then all latitude, and each vector has a different number of coordinates. I want to split each vector in half to form two different vectors that I can work with more easily to match up coordinate pairs.

Example: from the vector in row 1, I want to pair the first longitude (66.30...) with the first latitude (86.60...), and then the second longitude with the second latitude.

My next step is to split the column into two columns, preserving the rows. The first column would contain the first half of the original vector (longitudes) and the second column the second half (latitudes).

I cannot make the original vectors into strings to split, because each vector has a different number of coordinates.

When I use split() as:

newDF <- split(oldDF, 2)

It returns a list with one column of all of the values and one column with "Type" that lists out how many values are in the Value column.

If I use

newDF <- split(oldDF$column, 2)

It returns an error that says the first argument must be a vector.

Unnesting the column with unnest_wider() also will not help the current step, as then I cannot easily match up the coordinates. I need to divide in half before I unnest.

Thanks for any insight!

CodePudding user response:

Your data.frame has a list-column. These are a little tricky to maintain in base R, but if you use I(), it holds together.
Because the column is a list, you need lapply to iterate through it.
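
To illustrate why I() matters, here is a small sketch with made-up values (the names bad, ok and row_lengths are only for illustration, not from the question):

```r
# Without I(), data.frame() tries to turn each list element into its own
# column and fails because the elements have different lengths.
bad <- try(data.frame(Column = list(c(1, 2), c(1, 2, 3))), silent = TRUE)
inherits(bad, "try-error")

# With I(), the list is kept as a single column, one vector per row.
ok <- data.frame(Column = I(list(c(1, 2), c(1, 2, 3))))
nrow(ok)

# sapply/lapply visit each row's vector in turn.
row_lengths <- sapply(ok$Column, length)
row_lengths
```
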
I made up my own data:

set.seed(42)
df <- data.frame(Column = I(list(rnorm(4), rnorm(16), rnorm(2), rnorm(8))))
df$Column
[[1]]
[1]  1.3709584 -0.5646982  0.3631284  0.6328626

[[2]]
 [1]  0.40426832 -0.10612452  1.51152200 -0.09465904  2.01842371 -0.06271410  1.30486965  2.28664539 -1.38886070 -0.27878877
[11] -0.13332134  0.63595040 -0.28425292 -2.65645542 -2.44046693  1.32011335

[[3]]
[1] -0.3066386 -1.7813084

[[4]]
[1] -0.1719174  1.2146747  1.8951935 -0.4304691 -0.2572694 -1.7631631  0.4600974 -0.6399949

Here's the splitting:

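# For each row's vector, put the first half in lngs and the second half in lats
split_df <- lapply(df$Column, \(x) data.frame(
  lngs = I(list(x[1:(length(x) / 2)])),
  lats = I(list(x[(length(x) / 2 + 1):length(x)]))))
# Stack the one-row data frames back into a single data frame
split_df <- do.call('rbind', split_df)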
split_df
          lngs         lats
1 1.370958.... 0.363128....
2 0.404268.... -1.38886....
3 -0.30663.... -1.78130....
4 -0.17191.... -0.25726....

This is just the splitting part you talked about. If you want the matching-up, you'll need to be specific about what the result should look like.
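
As one possible reading of the matching-up step, assuming the goal is one longitude/latitude pair per point (first longitude with first latitude, and so on, as in the question's example), a rough sketch could look like this. pair_coords is a made-up helper name, and the even-length check is my own assumption about the data:

```r
# Hypothetical helper: split one vector into its longitude half and latitude
# half, then pair them position by position.
pair_coords <- function(x) {
  n <- length(x)
  stopifnot(n %% 2 == 0)  # assumption: every vector has an even length
  half <- n / 2
  data.frame(lng = x[seq_len(half)], lat = x[half + seq_len(half)])
}

# Rows 1 and 3 from the question's dput() output
coords <- list(
  c(66.3010025028633, 66.5439987180665, 86.6060028079833, 86.6549987795701),
  c(46.0629997251572, 46.313999176321, 86.597999572922, 86.5609970096044)
)

# One small data frame of points per original row, stacked into a long table
# with a "row" column recording which original row each point came from.
points_long <- do.call(rbind, lapply(seq_along(coords), \(i) {
  cbind(row = i, pair_coords(coords[[i]]))
}))
points_long
```

This gives one row per point instead of two list-columns, which may or may not be the shape you are after.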
