Get one of every multiple values in a column

Time:12-04

I have a data frame with a column that contains repeated values (some indicators), and I need to get one of each distinct value. For example, take this tibble from the Gapminder dataset:

# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.

How do I get a list of countries?

Afghanistan
Albania
Algeria
Andorra
Angola

...and so on. Or for continents:

Africa
Antarctica
Asia
Australia
Europe
North America
South America

CodePudding user response:

    library(dplyr)

    df <- read.table(
      header = TRUE, text="
    Row  country    continent  year   lifeExp      pop  gdpPercap
    1   Afghanistan Asia       1952    28.8   8425333      779.
    2   Afghanistan Asia       1957    30.3   9240934      821.
    3   Afghanistan Asia       1962    32.0   10267083      853.
    4   Afghanistan Asia       1967    34.0   11537966      836.
    5   Afghanistan Asia       1972    36.1   13079460      740.
    6   Afghanistan Asia       1977    38.4   14880372      786.
    7   Afghanistan Asia       1982    39.9   12881816      978.
    8   Afghanistan Asia       1987    40.8   13867957      852.
    9   Afghanistan Asia       1992    41.7   16317921      649.
    10  Afghanistan Asia       1997    41.8   22227415      635.")


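    # Count the rows for each country. The column name is passed unquoted,
    # because dplyr's count("country") would count the literal string instead.
    df %>%
      distinct() %>%
      count(country)

          country  n
    1 Afghanistan 10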
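
If only the distinct values are needed rather than their counts, a minimal sketch along the following lines may be closer to what the question asks for. It assumes dplyr 1.0.0 or later (for `slice_sample()`), and `toy` is a small made-up data frame standing in for the Gapminder tibble, so the names `toy`, `countries_df`, `countries`, `continents`, `first_rows` and `random_rows` are illustrative only.

```r
library(dplyr)

# Toy stand-in for the Gapminder tibble (hypothetical, two columns only).
toy <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Albania", "Albania", "Algeria"),
  continent = c("Asia", "Asia", "Europe", "Europe", "Africa"),
  stringsAsFactors = TRUE
)

# One row per distinct country, returned as a data frame.
countries_df <- toy %>% distinct(country)

# The same values as a plain character vector (base R).
countries <- unique(as.character(toy$country))

# For a factor column, levels() lists every category, sorted,
# including any level that no longer occurs in the data.
continents <- levels(toy$continent)

# One full row per country: the first matching row, other columns kept.
first_rows <- toy %>% distinct(country, .keep_all = TRUE)

# One randomly chosen row per country.
random_rows <- toy %>%
  group_by(country) %>%
  slice_sample(n = 1) %>%
  ungroup()
```

On the real data the same calls should work with `country` or `continent` swapped in as needed, for example `distinct(continent)` for the list of continents.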
