Home > Enterprise >  Get the latest updated value
Get the latest updated value

Time:10-05

Following this topic Get the latest updated value in R.

I would like to add a new problem appeared when I ran the solution from @TarJae

I made a new example

df <- data.frame(name1 = c("Acacia pinnata", "Acer laurinum", "Acmella paniculata", "Aglaia lawii", NA, NA),
                 name2 = c(NA, NA, NA, NA, "Acer laurinum Hassk.", "Aglaia lawii (Wight)"),
                 name3 = c("Senegalia rugata (Lam.) Britton & Rose", "Acer laurinum", "Acmella paniculata", "Aglaia lawii", "Acer laurinum Hassk.", "Aglaia lawii (Wight)"))
name1                               name2                                  name3
1     Acacia pinnata                 <NA> Senegalia rugata (Lam.) Britton & Rose
2      Acer laurinum                 <NA>                          Acer laurinum
3 Acmella paniculata                 <NA>                     Acmella paniculata
4       Aglaia lawii                 <NA>                           Aglaia lawii
5               <NA> Acer laurinum Hassk.                   Acer laurinum Hassk.
6               <NA> Aglaia lawii (Wight)                   Aglaia lawii (Wight)

Now using this code from @TarJae

we gonna have the output like this (seem to be the rows being raised)

df %>% 
  mutate(id = row_number()) %>% 
  pivot_longer(
    cols = -id
  ) %>% 
  mutate(helper= word(value, 1)) %>% 
  group_by(helper) %>% 
  mutate(value= last(value)) %>% 
  pivot_wider(
    names_from = name,
    values_from = value
  ) %>% 
  ungroup() %>% 
  select(-id, -helper) %>% 
  filter(if_any(everything(), ~ !is.na(.)))
  name1                name2                name3                                 
  <chr>                <chr>                <chr>                                 
1 Acacia pinnata       NA                   NA                                    
2 NA                   NA                   Senegalia rugata (Lam.) Britton & Rose
3 Acer laurinum Hassk. NA                   Acer laurinum Hassk.                  
4 Acmella paniculata   NA                   Acmella paniculata                    
5 Aglaia lawii (Wight) NA                   Aglaia lawii (Wight)                  
6 NA                   Acer laurinum Hassk. Acer laurinum Hassk.                  
7 NA                   Aglaia lawii (Wight) Aglaia lawii (Wight)       

We will see our first row already splited to two row. I am thinking now is to merge 2 this row into one. However, what if I honestly have many cases like that.

Any suggestions for this?

Updated code for example* Since another problem raised I would like to make another table here.

df <- data.frame(name1 = c("Acacia pinnata", "Acer laurinum", "Acmella paniculata", "Aglaia lawii", NA, NA, "Alangium javanicum", "Alangium longiflorum", NA),
                 name2 = c(NA, NA, NA, NA, "Acer laurinum Hassk.", "Aglaia lawii (Wight)", NA,NA, "Alangium javanicum (Blume) Wangerin"),
                 name3 = c("Senegalia rugata (Lam.) Britton & Rose", "Acer laurinum", "Acmella paniculata", "Aglaia lawii", "Acer laurinum Hassk.", "Aglaia lawii (Wight)",
                           "Alangium javanicum", "Celtis cf. rigescens (Miq.) Planch.", "Alangium javanicum (Blume) Wangerin"))

Using the code from @TarJae, we gonna see now Celtis cf. rigescens (Miq.) Planch. is matched with Alangium javanicum (Blume) Wangerin``not Alangium longiflorum`.

Desired output

 name1                               name2                              name3                                                                                                                        
1 Acacia pinnata                      NA                                  Senegalia rugata (Lam.) Britton & ~
2 Acer laurinum Hassk.                NA                                  Acer laurinum Hassk.               
3 Acmella paniculata                  NA                                  Acmella paniculata                 
4 Aglaia lawii (Wight)                NA                                  Aglaia lawii (Wight)               
5 NA                                  Acer laurinum Hassk.                Acer laurinum Hassk.               
6 NA                                  Aglaia lawii (Wight)                Aglaia lawii (Wight)               
7 Alangium javanicum (Blume) Wangerin NA                                  Alangium javanicum (Blume) Wangerin
8 Alangium longiflorum                NA                                  Celtis cf. rigescens (Miq.) Planch.
9 NA                                  Alangium javanicum (Blume) Wangerin Alangium javanicum (Blume) Wangerin

CodePudding user response:

update: The key is here: mutate(helper= word(value, 1,2)) We group by the first two words and then fill the group with the last entry. In case this is not enough you could group by the first 3 words etc...

df %>% 
  pivot_longer(
    cols = everything(),
  ) %>% 
  mutate(helper= word(value, 1,2)) %>% 
  group_by(helper) %>% 
  mutate(value= last(value)) %>% 
  ungroup() %>% 
  select(-helper) %>%
  pivot_wider(
    names_from = name,
    values_from = value,
  ) %>% 
  unnest(cols = c(name1, name2, name3))
  name1                               name2                               name3                              
  <chr>                               <chr>                               <chr>                              
1 Acacia pinnata                      NA                                  Senegalia rugata (Lam.) Britton & ~
2 Acer laurinum Hassk.                NA                                  Acer laurinum Hassk.               
3 Acmella paniculata                  NA                                  Acmella paniculata                 
4 Aglaia lawii (Wight)                NA                                  Aglaia lawii (Wight)               
5 NA                                  Acer laurinum Hassk.                Acer laurinum Hassk.               
6 NA                                  Aglaia lawii (Wight)                Aglaia lawii (Wight)               
7 Alangium javanicum (Blume) Wangerin NA                                  Alangium javanicum (Blume) Wangerin
8 Alangium longiflorum                NA                                  Celtis cf. rigescens (Miq.) Planch.
9 NA                                  Alangium javanicum (Blume) Wangerin Alangium javanicum (Blume) Wangerin

  1. We don't need id
  2. We have to ungroup() after mutate(value= last(value)) and
  3. remove helper before pivot_wider to keep the rows constant.
library(dplyr)
library(tidyr)
library(stringr)

df %>% 
  pivot_longer(
    cols = everything(),
  ) %>% 
  mutate(helper= word(value, 1)) %>% 
  group_by(helper) %>% 
  mutate(value= last(value)) %>% 
  ungroup() %>% 
  select(-helper) %>% 
  pivot_wider(
    names_from = name,
    values_from = value,
  ) %>% 
  unnest(cols = c(name1, name2, name3))

output:

  name1                name2                name3                                 
  <chr>                <chr>                <chr>                                 
1 Acacia pinnata       NA                   Senegalia rugata (Lam.) Britton & Rose
2 Acer laurinum Hassk. NA                   Acer laurinum Hassk.                  
3 Acmella paniculata   NA                   Acmella paniculata                    
4 Aglaia lawii (Wight) NA                   Aglaia lawii (Wight)                  
5 NA                   Acer laurinum Hassk. Acer laurinum Hassk.                  
6 NA                   Aglaia lawii (Wight) Aglaia lawii (Wight) 
  • Related