Home > Enterprise >  Categorizing data based on group and according to a separate column
Categorizing data based on group and according to a separate column

Time:07-20

What I'm asking here feels pretty complicated so bear with me as I try my best to explain it.

This spreadsheet is data on bird nests. Each nest has a specific identifier, ID_NIU. For this identifier, there can be anywhere from 1-3 rows which each represent a specific egg. The amount of rows corresponds to the count, which just tells me how many eggs are in each nest. Don't worry really about what PVC is, it shouldn't matter.

Egg_order is exactly what it sounds like; it's the order the eggs were laid in. If the value is 9, that means that we don't know the order. I need two things to happen in the most efficient way possible.

  1. I want a column made that shows one of three things. Let's call the column Status. I want Status to either be the word non, complete, or estimated. a) if all of the egg_order in a ID_NIU is 9, give it the value: none b) if all of the egg_order shows a number, like 1, 2, or 3 (so basically a complete set), give it the value: complete c) there can be an instance where only the THIRD egg is known. if that happens, give the value: estimated (this happens for ID_NIU 426 for example)

  2. For the estimated ID_NIU (based on the categorization above), I want the two 9 values to be replaced by 1 or 2. Basically, I just want to assign them a random order.

Here is my data

> dput(nests1994)
structure(list(ID_NIU = c(421L, 421L, 421L, 422L, 422L, 423L, 
423L, 423L, 424L, 424L, 425L, 425L, 425L, 426L, 426L, 426L, 427L, 
427L, 428L, 428L, 428L, 429L, 429L, 429L, 430L, 430L, 431L, 431L, 
431L, 432L, 432L, 432L, 433L, 433L, 433L, 434L, 434L, 434L, 435L, 
435L, 435L, 436L, 436L, 437L, 437L, 437L, 438L, 438L, 438L, 439L, 
439L, 439L, 440L, 440L, 440L, 441L, 441L, 441L, 442L, 442L, 442L, 
443L, 443L, 443L, 444L, 445L, 445L, 445L, 446L, 446L, 446L, 447L, 
447L, 447L, 448L, 448L, 448L, 449L, 449L, 449L, 450L, 450L, 451L, 
451L, 451L, 452L, 452L, 452L, 453L, 453L, 453L, 454L, 454L, 454L, 
456L, 456L, 456L, 457L, 457L, 457L, 458L, 458L, 458L, 459L, 460L, 
460L, 460L, 461L, 461L, 461L, 462L, 462L, 462L, 463L, 463L, 463L, 
464L, 464L, 464L, 465L, 465L, 465L, 466L, 466L, 466L, 467L, 467L, 
467L, 468L, 468L, 468L, 469L, 469L, 469L, 470L, 470L, 470L, 471L, 
471L, 471L, 472L, 472L, 472L, 473L, 473L, 473L, 474L, 474L, 474L, 
475L, 475L, 475L, 476L, 476L, 476L, 477L, 477L, 477L, 478L, 478L, 
478L, 479L, 479L, 479L, 480L, 480L, 480L, 481L, 481L, 481L, 482L, 
482L, 483L, 484L, 484L, 484L, 485L, 485L, 485L, 486L, 486L, 486L, 
487L, 488L, 488L, 488L, 489L, 489L, 489L, 490L, 490L, 490L, 491L, 
492L, 492L, 492L, 493L, 493L, 493L, 494L, 494L, 494L, 495L, 495L, 
496L, 497L, 497L, 497L, 498L, 498L, 555L, 555L, 555L, 556L, 556L, 
556L, 557L, 557L, 557L, 558L, 558L, 558L, 559L, 559L, 559L, 560L, 
560L, 560L, 561L, 561L, 561L, 562L, 562L, 562L, 563L, 563L, 563L, 
564L, 564L, 564L, 565L, 565L, 565L, 566L, 566L, 567L, 567L, 567L, 
572L, 572L, 572L, 573L, 573L, 573L, 574L, 574L, 574L, 575L, 575L, 
575L, 576L, 576L, 576L, 577L, 577L, 577L, 578L, 578L, 578L, 579L, 
579L, 579L, 594L, 594L, 594L, 595L, 595L, 597L, 597L, 597L, 598L, 
598L, 599L, 599L, 612L, 612L, 612L, 613L, 614L, 614L, 615L, 615L, 
615L, 616L, 616L, 617L, 617L, 618L, 618L, 618L, 620L, 620L, 620L, 
622L, 622L, 622L, 624L, 624L, 624L, 625L, 625L, 625L, 626L, 626L, 
626L, 627L, 627L, 627L, 628L, 628L, 628L, 629L, 629L, 629L, 633L, 
633L, 633L, 661L, 661L, 662L, 662L, 662L, 663L, 663L, 663L, 664L, 
664L, 664L, 665L, 665L, 665L, 666L, 666L, 666L, 667L, 667L, 667L, 
668L, 668L, 668L, 669L, 669L, 669L, 670L, 673L, 673L, 674L, 674L, 
674L, 675L, 675L, 675L, 676L, 676L, 676L, 677L, 677L, 677L, 679L, 
679L, 679L, 680L, 680L, 681L, 681L, 682L, 682L, 682L, 683L, 683L, 
683L, 684L, 684L, 684L, 685L, 685L, 685L, 686L, 686L, 686L, 687L, 
687L, 687L, 688L, 688L, 688L, 689L, 689L, 689L, 690L, 690L, 690L, 
691L, 691L, 691L, 692L, 692L, 692L, 791L, 793L, 793L, 793L, 794L, 
794L, 794L), Egg_order = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 9L, 
9L, 1L, 2L, 3L, 9L, 9L, 3L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 9L, 
9L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 9L, 9L, 3L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 1L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 9L, 
9L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 9L, 9L, 3L, 9L, 9L, 
3L, 1L, 2L, 1L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 1L, 1L, 2L, 3L, 1L, 2L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 1L, 2L, 3L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 9L, 9L, 
9L, 9L, 9L, 9L, 1L, 2L, 3L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L), PVC = c(NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, "VNRB", "VNRB", "VNRB", "VRNB", "VJNZ", "VJNZ", 
"VBRN", "VBRN", "VBRN", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "VRJA", "VRJA", 
"VRJA", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA), volume = c(61.32180296484, 59.19889832633, 61.23059740728, 
56.63892668014, 51.205468528125, 56.00474267058, 57.351199948, 
55.54398620844, 55.70776780426, 59.72462114472, 59.37718398586, 
60.587478792225, 56.43498962184, 62.26391957376, 65.6381047725, 
60.0863309088, 64.85091746762, 60.069900631215, 59.4870511833, 
57.0005821184, 54.02542528228, 64.73294631594, 66.3508141931, 
62.32245834917, 59.02536867768, 66.01653978486, 60.083474171985, 
59.684503147035, 60.074662267125, 66.157670868, 64.14003788994, 
64.72516250826, 65.43246736143, 64.76946298056, 59.525282639625, 
64.83564670912, 63.10304890812, 59.162532552, 58.56437083104, 
58.51350718936, 60.47152094199, 59.46823641825, 56.529403558675, 
62.59313732544, 62.86182498192, 57.4115839709, 67.488455384, 
70.67781620074, 69.066635590395, 61.4648739568, 61.793295785165, 
57.141971818575, 57.86003450088, 66.8841551616, 54.2789360685, 
57.372537480805, 54.7688704779, 52.69147903188, 65.717390741625, 
62.0012941321, 65.46154452084, 67.54235220404, 68.589406082885, 
61.3698407482, 63.22307904, 73.52276685129, 67.7605920462, 64.9948112, 
67.39168007286, 59.96388195, 53.75359412352, 61.2940024137, 62.371936324995, 
58.27667400405, 61.00711558812, 60.68288712456, 57.22440302628, 
63.7912474123, 60.71915094122, 56.17566934848, 61.03055081916, 
54.48216125544, 66.59971341696, 62.914325579595, 60.06312844672, 
63.18295147084, 64.13649043698, 54.61024964874, 58.69608972288, 
60.093902811735, 55.80205524, 66.35080327284, 64.56965497461, 
61.61846995898, 58.962280801605, 54.114679926855, 51.32463260472, 
66.763949034375, 66.60433698016, 57.9960342297, 65.65621056552, 
63.50880453696, 59.93124997824, 57.28414633702, 68.946491941355, 
69.53883617025, 60.591651012, 66.51464159232, 62.83111160808, 
59.31893121786, 66.012559966525, 65.39124391634, 61.63458903486, 
64.527880462335, 65.626394311215, 62.28965754018, 70.66749192426, 
63.43547903014, 64.813554775305, 63.30184211559, 70.84510569583, 
65.07164575845, 62.649526678335, 62.07063053623, 62.97169774328, 
58.15387242406, 53.775184290885, 56.15800014276, 63.84106319499, 
58.0287792375, 55.39166100811, 67.06270568448, 64.243137862495, 
61.79094929648, 59.278741074, 57.54892103096, 54.4025191125, 
68.63881395584, 65.99016072992, 61.71771694458, 56.032400073, 
59.67369896548, 55.55819890452, 59.74163316768, 64.678925498375, 
59.632117506, 58.352016509955, 60.27802279574, 58.15564421248, 
54.715232384505, 50.918826189105, 47.311723228, 60.24653229088, 
62.9644540122, 58.28735951144, 67.67941018152, 69.600416585815, 
63.089844045435, 65.63935543424, 65.14029179418, 56.9031766875, 
63.86502197092, 68.50758708486, 66.796303493015, 62.558885508, 
62.68239015272, 59.33579351744, 55.87331866844, 57.779988655095, 
57.4011795375, 68.10503423888, 62.19287447292, 62.29393628778, 
59.54119382656, 62.45667881766, 55.88073186822, 57.19354986962, 
60.62398562056, 53.222328389025, 64.17951754275, 63.46964125062, 
54.25258541904, 53.85227447392, 62.667268102495, 65.9379849576, 
57.10713872136, 57.16095228242, 62.99889140907, 56.223702755685, 
60.56377100622, 62.599569594775, 60.431980928505, 64.63537220204, 
56.8203240696, 59.08446543104, 58.51081609896, 64.82126705664, 
66.14591297517, 58.5345118041, 69.089203199115, 70.813911929975, 
63.129352317125, 62.864854839, 61.69528927, 62.66206973912, 60.8490630645, 
62.4062419139, 57.63741735136, 60.30546945984, 56.07545557164, 
59.79910741368, 63.51030603488, 64.244819090745, 63.75469472867, 
63.14277341512, 62.42052636864, 55.59848049998, 59.7551696205, 
55.20343376232, 60.1519184288, 63.59769165096, 55.71145827984, 
61.30230509475, 56.592753090795, 58.582661261135, 62.62695713592, 
57.883156922925, 63.30930401034, 70.73778793872, 63.12040753536, 
67.936748153665, 62.77154643396, 56.5502664375, 56.676188799945, 
58.8897719179, 65.12593779432, 59.877119390625, 57.5611109065, 
60.50238159481, 51.421804965475, 60.702761718885, 57.857797190545, 
51.76334378277, 54.79479091296, 58.036980255275, 64.08885388248, 
58.9371852075, 48.065806272, 58.67924387256, 60.84988863732, 
55.348644357, 60.809342925895, 56.891942571875, 55.16386692882, 
64.568934378585, 64.46865534354, 58.50186242472, 63.15022238365, 
57.54034615239, 55.58259809634, 58.5784229184, 60.76172518406, 
48.88389467288, 62.93741556969, 65.16504696727, 58.68311406489, 
69.20272240048, 67.510966264695, 63.60102202762, 58.3678358505, 
62.78118931648, 50.330023359885, 62.36847890792, 61.459224858675, 
58.101076579725, 59.849752446945, 60.89355981824, 70.38978991616, 
64.78264300106, 63.851029018155, 56.76600004895, 56.36582838225, 
57.656775733025, 57.07565196264, 62.78661705624, 58.80025808496, 
62.473741510995, 52.27790378878, 61.65878807979, 69.9430447975, 
62.6025867725, 70.091062468875, 55.31320436328, 57.091324193955, 
59.88525953314, 62.81526020564, 58.817878683495, 62.600767032615, 
58.88047597916, 52.271088452865, 56.4597812485, 56.49555113631, 
58.59997224246, 60.56042671782, 63.62604866394, 54.689279244925, 
64.08885388248, 58.9371852075, 48.065806272, 58.67924387256, 
60.84988863732, 55.348644357, 60.809342925895, 56.891942571875, 
55.16386692882, 64.568934378585, 64.46865534354, 58.50186242472, 
58.5784229184, 60.76172518406, 48.88389467288, 58.3678358505, 
62.78118931648, 50.330023359885, 59.36245744586, 54.203960064, 
46.80082663263, 53.7591748284, 56.45379150432, 58.18760585216, 
51.57368944, 61.17471364797, 55.666216654855, 63.638574522745, 
62.859327856115, 55.71589759736, 62.194594993445, 60.1319744625, 
63.2669776704, 52.67023014204, 53.95264575354, 52.19798075712, 
64.12727076352, 58.722978252375, 61.129745915095, 60.35801243148, 
53.570630477, 60.42489898518, 58.83334580928, 58.579377003125, 
53.63492716368, 62.198328240205, 59.16052306314, 63.03268283832, 
53.4038938596, 59.3680844275, 64.01347375464, 53.247602726, 58.656773926835, 
62.406351950215, 60.794167405875, 57.26072539408, 58.41489611248, 
59.89059546628, 58.27766415192, 55.752501203565, 52.67023014204, 
53.95264575354, 52.19798075712, 68.84950003552, 72.3285771465, 
69.97203251224, 58.01968982825, 59.19871381875, 51.0058844818375, 
55.871632176, 65.005463693925, 61.884602267345, 59.36399202272, 
55.61013283256, 63.119620683, 65.19505402688, 65.256547524625, 
60.287946825, 63.546970479, 55.83345829936, 57.83224146336, 62.76964734468, 
63.3625642155, 61.089794891755, 67.23069090625, 59.524399629225, 
61.86069234304, 58.52247185176, 60.14631038544, 55.03887100416, 
57.43869855744, 60.64346718786, 56.9845087625, 60.2844366375, 
57.7861645641, 54.31168897016, 52.9873440048, 53.708649730785, 
65.219353008, 57.88846182528, 54.873862054245, 53.072090597655, 
78.369980834265, 57.516393056295, 61.3480561875, 53.67391420878, 
57.29314740352, 54.1968033555, 52.85290720594), Year = c("1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994", "1994", "1994", 
"1994", "1994", "1994", "1994", "1994", "1994"), count = c(3L, 
3L, 3L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 1L, 3L, 3L, 3L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 2L, 
2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -415L), groups = structure(list(
    ID_NIU = c(421L, 422L, 423L, 424L, 425L, 426L, 427L, 428L, 
    429L, 430L, 431L, 432L, 433L, 434L, 435L, 436L, 437L, 438L, 
    439L, 440L, 441L, 442L, 443L, 444L, 445L, 446L, 447L, 448L, 
    449L, 450L, 451L, 452L, 453L, 454L, 456L, 457L, 458L, 459L, 
    460L, 461L, 462L, 463L, 464L, 465L, 466L, 467L, 468L, 469L, 
    470L, 471L, 472L, 473L, 474L, 475L, 476L, 477L, 478L, 479L, 
    480L, 481L, 482L, 483L, 484L, 485L, 486L, 487L, 488L, 489L, 
    490L, 491L, 492L, 493L, 494L, 495L, 496L, 497L, 498L, 555L, 
    556L, 557L, 558L, 559L, 560L, 561L, 562L, 563L, 564L, 565L, 
    566L, 567L, 572L, 573L, 574L, 575L, 576L, 577L, 578L, 579L, 
    594L, 595L, 597L, 598L, 599L, 612L, 613L, 614L, 615L, 616L, 
    617L, 618L, 620L, 622L, 624L, 625L, 626L, 627L, 628L, 629L, 
    633L, 661L, 662L, 663L, 664L, 665L, 666L, 667L, 668L, 669L, 
    670L, 673L, 674L, 675L, 676L, 677L, 679L, 680L, 681L, 682L, 
    683L, 684L, 685L, 686L, 687L, 688L, 689L, 690L, 691L, 692L, 
    791L, 793L, 794L), .rows = structure(list(1:3, 4:5, 6:8, 
        9:10, 11:13, 14:16, 17:18, 19:21, 22:24, 25:26, 27:29, 
        30:32, 33:35, 36:38, 39:41, 42:43, 44:46, 47:49, 50:52, 
        53:55, 56:58, 59:61, 62:64, 65L, 66:68, 69:71, 72:74, 
        75:77, 78:80, 81:82, 83:85, 86:88, 89:91, 92:94, 95:97, 
        98:100, 101:103, 104L, 105:107, 108:110, 111:113, 114:116, 
        117:119, 120:122, 123:125, 126:128, 129:131, 132:134, 
        135:137, 138:140, 141:143, 144:146, 147:149, 150:152, 
        153:155, 156:158, 159:161, 162:164, 165:167, 168:170, 
        171:172, 173L, 174:176, 177:179, 180:182, 183L, 184:186, 
        187:189, 190:192, 193L, 194:196, 197:199, 200:202, 203:204, 
        205L, 206:208, 209:210, 211:213, 214:216, 217:219, 220:222, 
        223:225, 226:228, 229:231, 232:234, 235:237, 238:240, 
        241:243, 244:245, 246:248, 249:251, 252:254, 255:257, 
        258:260, 261:263, 264:266, 267:269, 270:272, 273:275, 
        276:277, 278:280, 281:282, 283:284, 285:287, 288L, 289:290, 
        291:293, 294:295, 296:297, 298:300, 301:303, 304:306, 
        307:309, 310:312, 313:315, 316:318, 319:321, 322:324, 
        325:327, 328:329, 330:332, 333:335, 336:338, 339:341, 
        342:344, 345:347, 348:350, 351:353, 354L, 355:356, 357:359, 
        360:362, 363:365, 366:368, 369:371, 372:373, 374:375, 
        376:378, 379:381, 382:384, 385:387, 388:390, 391:393, 
        394:396, 397:399, 400:402, 403:405, 406:408, 409L, 410:412, 
        413:415), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -151L), .drop = TRUE))

If you have questions please ask.

CodePudding user response:

How about this egg_status (ad 1), estimated_Egg_order (ad 2):

nest1994 |> 
  group_by(ID_NIU) |> 
  mutate(egg_status = case_when(all(Egg_order == 9) ~ "none",
                                any(Egg_order == 9) ~ "estimated",
                                TRUE ~ "complete")
         ) |> 
  mutate(estimated_Egg_order = if_else(egg_status == "estimated" & Egg_order == 9, row_number(), Egg_order)) |>
  ungroup()

Output:

# A tibble: 30 × 8
   ID_NIU Egg_order PVC   volume Year  count egg_status estimated_Egg_order
    <int>     <int> <chr>  <dbl> <chr> <int> <chr>                    <int>
 1    421         1 NA      61.3 1994      3 complete                     1
 2    421         2 NA      59.2 1994      3 complete                     2
 3    421         3 NA      61.2 1994      3 complete                     3
 4    422         1 NA      56.6 1994      2 complete                     1
 5    422         2 NA      51.2 1994      2 complete                     2
 6    423         1 NA      56.0 1994      3 complete                     1
 7    423         2 NA      57.4 1994      3 complete                     2
 8    423         3 NA      55.5 1994      3 complete                     3
 9    424         9 NA      55.7 1994      2 none                         9
10    424         9 NA      59.7 1994      2 none                         9
11    425         1 NA      59.4 1994      3 complete                     1
12    425         2 NA      60.6 1994      3 complete                     2
13    425         3 NA      56.4 1994      3 complete                     3
14    426         9 NA      62.3 1994      3 estimated                    1
15    426         9 NA      65.6 1994      3 estimated                    2
16    426         3 NA      60.1 1994      3 estimated                    3
17    427         1 NA      64.9 1994      2 complete                     1
18    427         2 NA      60.1 1994      2 complete                     2
19    428         1 NA      59.5 1994      3 complete                     1
20    428         2 NA      57.0 1994      3 complete                     2

CodePudding user response:

We could use if/else or case_when to create the column 'Status' based on the conditions

library(dplyr)
nests1994 <- nests1994 %>% 
 group_by(ID_NIU) %>%
 mutate(Status = case_when(all(Egg_order %in% 9) ~ 'none', 
                    all(Egg_order %in% c(1, 2, 3)) ~ 'complete', 
                    all(setdiff(Egg_order, 9) == 3) ~ 'estimated' )) %>%    
 mutate(estimated_Egg_order = replace(Egg_order,
      Egg_order == 9, seq_len(sum(Egg_order == 9)))) %>%
 ungroup

-output

> nests1994 %>% filter(Egg_order == 9)
# A tibble: 195 × 8
   ID_NIU Egg_order PVC   volume Year  count Status    estimated_Egg_order
    <int>     <int> <chr>  <dbl> <chr> <int> <chr>                   <int>
 1    424         9 <NA>    55.7 1994      2 none                        1
 2    424         9 <NA>    59.7 1994      2 none                        2
 3    426         9 <NA>    62.3 1994      3 estimated                   1
 4    426         9 <NA>    65.6 1994      3 estimated                   2
 5    430         9 <NA>    59.0 1994      2 none                        1
 6    430         9 <NA>    66.0 1994      2 none                        2
 7    448         9 <NA>    61.0 1994      3 estimated                   1
 8    448         9 <NA>    60.7 1994      3 estimated                   2
 9    476         9 <NA>    60.2 1994      3 estimated                   1
10    476         9 <NA>    63.0 1994      3 estimated                   2
# … with 185 more rows

CodePudding user response:

My solution first groups by ID_NIU, then uses some logic to determine the rest. It also replaces the earlier values in estimated columns, and includes some extra row_number and rev(row_number()) to catch potential values outside of 1, 2, and 3:

nests1994 %>%
  group_by(ID_NIU) %>%
  mutate(status = case_when(
    all(Egg_order == 9) ~ "None",
    Egg_order == row_number() | Egg_order == rev(row_number()) | Egg_order %in% c(1, 2, 3) ~ "Complete",
    first(Egg_order) == 9 & last(Egg_order) != 9 ~ "Estimated",
    TRUE ~ "Miscellaneous"
  ),
  Egg_order = if_else(status == "Estimated" & Egg_order == 9, row_number(), Egg_order))
  •  Tags:  
  • r
  • Related