iterate over columns to count words in a sentence and put it in a new column

Time:04-08

I have some columns titled essay0 through essay9. I want to iterate over them, count the words in each, and make a new column with the number of words. So essay0 would get a column essay0_num containing 5 if that is how many words it has.

So far I have cupid <- cupid %>% mutate(essay9_num = sapply(strsplit(essay9, " "), length)) to count the words and add a column, but I don't want to do it one by one for all 10.

I tried a for loop:

for (i in 0:31) {
   cupid <- cupid %>% mutate(xxx_num = sapply(strsplit(xxx, " "), length))
}

But I am not sure how to iterate over the columns in a for loop in R. I thought maybe I could pull out the columns I need, put them into a new df, and use sapply somehow that way, but I still run into the problem of iterating over the columns.

dput:

dput(head(cupid))
structure(list(age = c(22L, 35L, 38L, 23L, 29L, 29L), status = c("single", 
"single", "available", "single", "single", "single"), sex = c("m",
"m", "m", "m", "m", "m"), orientation = c("straight", "straight",
"straight", "straight", "straight", "straight"), body_type = c("a little extra",
"average", "thin", "thin", "athletic", "average"), diet = c("strictly anything",
"mostly other", "anything", "vegetarian", "", "mostly anything"
), drinks = c("socially", "often", "socially", "socially", "socially", 
"socially"), drugs = c("never", "sometimes", "", "", "never",
""), education = c("working on college/university", "working on space camp",
"graduated from masters program", "working on college/university",
"graduated from college/university", "graduated from college/university"
), ethnicity = c("asian, white", "white", "", "white", "asian, black, other",
"white"), height = c(75, 70, 68, 71, 66, 67), income = c(-1L, 
80000L, -1L, 20000L, -1L, -1L), job = c("transportation", "hospitality / travel",
"", "student", "artistic / musical / writer", "computer / hardware / software"
), last_online = c("2012-06-28-20-30", "2012-06-29-21-41", "2012-06-27-09-10",
"2012-06-28-14-22", "2012-06-27-21-26", "2012-06-29-19-18"),
    location = c("south san francisco, california", "oakland, california",
    "san francisco, california", "berkeley, california", "san francisco, california", 
    "san francisco, california"), offspring = c("doesn't have kids, but might want them",
    "doesn't have kids, but might want them", "", "doesn't want kids",
    "", "doesn't have kids, but might want them"), pets = c("likes dogs and likes cats",
    "likes dogs and likes cats", "has cats", "likes cats", "likes dogs and likes cats",
    "likes cats"), religion = c("agnosticism and very serious about it",
    "agnosticism but not too serious about it", "", "", "", "atheism"
    ), sign = c("gemini", "cancer", "pisces but it doesn&rsquo;t matter",
    "pisces", "aquarius", "taurus"), smokes = c("sometimes",
    "no", "no", "no", "no", "no"), speaks = c("english", "english (fluently), spanish (poorly), french (poorly)",
    "english, french, c++", "english, german (poorly)", "english",
    "english (fluently), chinese (okay)"), essay0 = c("about me:  i would love to think that i was some some kind of intellectual: either the dumbest smart guy, or the smartest dumb guy. can't say i can tell the difference. i love to talk about ideas and concepts. i forge odd metaphors instead of reciting cliches. like the simularities between a friend of mine's house and an underwater salt mine. my favorite word is salt by the way (weird choice i know). to me most things in life are better as metaphors. i seek to 
make myself a little better everyday, in some productively lazy way. got tired of tying my shoes. considered hiring a five year old, but would probably have to tie both of our shoes... decided to only wear leather shoes dress shoes.  about you:  you love to have really serious, really deep conversations about really silly stuff. you have to be willing to snap me out of a light hearted rant with a kiss. you don't have to be funny, but you have to be able to make me laugh. you should be able to bend spoons with your 
mind, and telepathically make me smile while i am still at work. you should love life, and be cool with just letting the wind blow. extra points for reading all this and guessing my favorite video game (no hints given yet). and lastly you have a good attention span.",
    "i am a chef: this is what that means. 1. i am a workaholic. 2. i love to cook regardless of whether i am at work. 3. i love to drink and eat foods that are probably really bad for me. 4. i love being around people that resemble line 1-3. i love the outdoors and i am an avid skier. if its snowing i will be in tahoe at the very least. i am a very confident and friendly. i'm not interested in acting or being a typical guy. i have no time or patience for rediculous acts of territorial pissing. overall i am a very 
likable easygoing individual. i am very adventurous and always looking forward to doing new things and hopefully sharing it with the right person.",
    "i'm not ashamed of much, but writing public text on an online dating site makes me pleasantly uncomfortable. i'll try to be as earnest as possible in the noble endeavor of standing naked before the world.  i've lived in san francisco for 15 years, and both love it and find myself frustrated with its deficits. lots of great friends and acquaintances (which increases my apprehension to put anything on this site), but i'm feeling like meeting some new people that aren't just friends of friends. it's okay if you are a friend of a friend too. chances are, if you make it through the complex filtering process of multiple choice questions, lifestyle statistics, photo scanning, and these indulgent blurbs of text without moving quickly on to another search result, you are probably already a cultural peer and at most 2 people removed. at first, i thought i should say as little as possible here to avoid 
you, but that seems silly.  as far as culture goes, i'm definitely more on the weird side of the spectrum, but i don't exactly wear it on my sleeve. once you get me talking, it will probably become increasingly apparent that while i'd like to think of myself as just like everybody else (and by some definition i certainly am), most people don't see me that way. that's fine with me. most of the people i find myself gravitating towards are pretty weird themselves. you probably are too.",
    "i work in a library and go to school. . .", "hey how's it going? currently vague on the profile i know, more to come soon. looking to meet new folks outside of my circle of friends. i'm pretty responsive on the reply tip, feel free to drop a line. cheers.",
    "i'm an australian living in san francisco, but don't hold that against me. i spend most of my days trying to build cool stuff for my company. i speak mandarin and have been known to bust out chinese songs at karaoke. i'm pretty cheeky. someone asked me if that meant something about my arse, which i find really funny.  i'm a little oddball. i have a wild imagination; i like to think 
of the most improbable reasons people are doing things just for fun. i love to laugh and look for reasons to do so. occasionally this gets me in trouble because people think i'm laughing at them. sometimes i am, but more often i'm only laughing at myself.  i'm an entrepreneur (like everyone else in sf, it seems) and i love what i do. i enjoy parties and downtime in equal measure. intelligence really turns me on and i love people who can teach me new things."
    ), essay1 = c("currently working as an international agent for a freight forwarding company. import, export, domestic you know the works. online classes and trying to better myself in my free time. perhaps a hours worth of a good book or a video game on a 
lazy sunday.",
    "dedicating everyday to being an unbelievable badass.", "i make nerdy software for musicians, artists, and experimenters to indulge in their own weirdness, but i like to spend time away from the computer when working on my artwork (which is typically more 
concerned with group dynamics and communication, than with visual form, objects, or technology). i also record and deejay dance, noise, pop, and experimental music (most of which electronic or at least studio based). besides these relatively ego driven activities, i've been enjoying things like meditation and tai chi to try and gently flirt with ego death.",
    "reading things written by old dead people", "work work work work + play",
    "building awesome stuff. figuring out what's important. having adventures. looking for treasure."
    ), essay2 = c("making people laugh. ranting about a good salting. finding simplicity in complexity, and complexity in simplicity.",
    "being silly. having ridiculous amonts of fun wherever. being a smart ass. ohh and i can cook. ;)",
    "improvising in different contexts. alternating between being present and decidedly outside of a moment, or trying to hold both at once. rambling intellectual conversations that hold said conversations in contempt while seeking to find something that transcends them. being critical while remaining generous. listening to and using body language--often performed in caricature or large 
gestures, if not outright interpretive dance. dry, dark, and raunchy humor.", 
    "playing synthesizers and organizing books according to the library of congress classification system",
    "creating imagery to look at: http://bagsbrown.blogspot.com/ http://stayruly.blogspot.com/",
    "imagining random shit. laughing at aforementioned random shit. being goofy. articulating what i think and feel. convincing people i'm right. admitting when i'm wrong.  i'm also pretty good at helping people think through problems; my friends say i give good advice. and when i don't have a clue how to help, i will say: i give pretty good hug."
    ), essay3 = c("the way i look. i am a six foot half asian, half caucasian mutt. it makes it tough not to notice me, and for me to blend in.",
    "", "my large jaw and large glasses are the physical things people comment on the most. when sufficiently stimulated, i have an unmistakable cackle of a laugh. after that, it goes in more directions than i care to describe right now. maybe i'll come back to this.",
    "socially awkward but i do my best", "i smile a lot and my inquisitive nature", 
    "i have a big smile. i also get asked if i'm wearing blue-coloured contacts (no)."
    ), essay4 = c("books: absurdistan, the republic, of mice and men (only book that made me want to cry), catcher in the rye, the prince.  movies: gladiator, operation valkyrie, the producers, down periscope.  shows: the borgia, arrested development, game of 
thrones, monty python  music: aesop rock, hail mary mallon, george thorogood and the delaware destroyers, felt  food: i'm down for anything.",
    "i am die hard christopher moore fan. i don't really watch a lot of tv unless there is humor involved. i am kind of stuck on 90's alternative music. i am pretty much a fan of everything though... i do need to draw a line at most types of electronica.",    
    "okay this is where the cultural matrix gets so specific, it's like being in the crosshairs.  for what it's worth, i find myself reading more non-fiction than fiction. it's usually some kind of philosophy, art, or science text by silly authors such as ranciere, de certeau, bataille, baudrillard, butler, stein, arendt, nietzche, zizek, etc. i'll often throw in some weird new age or pop-psychology book in the mix as well. as for fiction, i enjoy what little i've read of eco, perec, wallace, bolao, dick, vonnegut, atwood, delilo, etc. when i was young, i was a rabid asimov reader.  directors i find myself drawn to are makavejev, kuchar, jodorowsky, herzog, hara, klein, waters, verhoeven, ackerman, hitchcock, lang, gorin, goddard, miike, ohbayashi, tarkovsky, sokurov, warhol, etc. but i also like a good amount of \"trashy\" stuff. too much to name.  i definitely enjoy the character development that happens in long form episodic television over the course of 10-100 episodes, which a 1-2hr movie usually can't compete with. some of my recent tv favorites are: breaking bad, the wire, dexter, true blood, the prisoner, lost, fringe.  a smattered sampling of 
the vast field of music i like and deejay: art ensemble, sun ra, evan parker, lil wayne, dj funk, mr. fingers, maurizio, rob hood, dan bell, james blake, nonesuch recordings, omar souleyman, ethiopiques, fela kuti, john cage, meredith monk, robert ashley, terry riley, yoko ono, merzbow, tom tom club, jit, juke, bounce, hyphy, snap, crunk, b'more, kuduro, pop, noise, jazz, techno, house, 
acid, new/no wave, (post)punk, etc.  a few of the famous art/dance/theater folk that might locate my sensibility: andy warhol, bruce nauman, yayoi kusama, louise bourgeois, tino sehgal, george kuchar, michel duchamp, marina abramovic, gelatin, carolee schneeman, gustav metzger, mike kelly, mike smith, andrea fraser, gordon matta-clark, jerzy grotowski, samuel beckett, antonin artaud, tadeusz kantor, anna halperin, merce cunningham, etc. i'm clearly leaving out a younger generation of contemporary artists, many of whom are friends.  local food regulars: sushi zone, chow, ppq, pagolac, lers ros, burma superstar, minako, shalimar, delfina pizza, rosamunde, arinells, suppenkuche, cha-ya, blue plate, golden era, etc.",
    "bataille, celine, beckett. . . lynch, jarmusch, r.w. fassbender. . . twin peaks & fishing w/ john joy division, throbbing gristle, cabaret voltaire. . . vegetarian pho and coffee",
    "music: bands, rappers, musicians at the moment: thee oh sees. forever: wu-tang books: artbooks for days audiobooks: my collection, thick (thanks audible) shows: live ones food: with stellar friends whenever movies > tv podcast: radiolab, this american life, the moth, joe rogan, the champs",
    "books: to kill a mockingbird, lord of the rings, 1984, the farseer trilogy.  music: the beatles, frank sinatra, john mayer, jason mraz, deadmau5, andrew bayer, everything on anjunadeep records, bach, satie.  tv shows: how i met your mother, scrubs, the west wing, breaking bad.  movies: star wars, the godfather pt ii, 500 days of summer, napoleon dynamite, american beauty, lotr  food: thai, vietnamese, shanghai dumplings, pizza!"
    ), essay5 = c("food. water. cell phone. shelter.", "delicious porkness in all of its glories. my big ass doughboy's sinking into 15 new inches. my overly resilient liver. a good sharp knife. my ps3... it plays blurays too. ;) my over the top energy and my 
outlook on life... just give me a bag of lemons and see what happens. ;)",
    "movement conversation creation contemplation touch humor",
    "", "", "like everyone else, i love my friends and family, and need hugs, human contact, water and sunshine. let's take that as given.  1. something to build 2. something to sing 3. something to play on (my guitar would be first choice) 4. something to write/draw on 5. a big goal worth dreaming about 6. something to laugh at"
    ), essay6 = c("duality and humorous things", "", "", "cats and german philosophy",
    "", "what my contribution to the world is going to be and/or should be. and what's for breakfast. i love breakfast."
    ), essay7 = c("trying to find someone to hang out with. i am down for anything except a club.", 
    "", "viewing. listening. dancing. talking. drinking. performing.",
    "", "", "out with my friends!"), essay8 = c("i am new to california and looking for someone to wisper my secrets to.",        
    "i am very open and will share just about anything.", "when i was five years old, i was known as \"the boogerman\".",
    "", "", "i cried on my first day at school because a bird shat on my head. true story."
    ), essay9 = c("you want to be swept off your feet! you are tired of the norm. you want to catch a coffee or a bite. or if you 
want to talk philosophy.",
    "", "you are bright, open, intense, silly, ironic, critical, caring, generous, looking for an exploration, rather than finding \"a match\" of some predetermined qualities.  i'm currently in a fabulous and open relationship, so you should be comfortable with that.",
    "you feel so inclined.", "", "you're awesome.")), row.names = c(NA,
6L), class = "data.frame")

Answer:

Use across() to apply the same function to multiple columns:

library(dplyr)

cupid %>% 
  mutate(across(starts_with("essay"), \(x) stringr::str_count(x, " +") + 1,
                .names = "{.col}_num"))
# ...other column...
#  essay0_num essay1_num essay2_num essay3_num essay4_num essay5_num essay6_num essay7_num
# 1        237         45         16         28         62          5          4         16
# 2        130          7         18          1         50         53          1          1
# 3        246         90         65         46        355          6          1          6
# 4         11          7         13          7         29          1          4          1
# 5         40          6          7          8         44          1          1          1
# 6        160         12         60         15         70         59         20          4
#   essay8_num essay9_num
# 1         14         30
# 2         10          1
# 3         12         39
# 4          1          4
# 5          1          1
# 6         17          2

I simplified your word-counting logic: for non-empty text, splitting on spaces and taking the length is the same as counting the spaces and adding 1. Using " +" as a regex pattern means consecutive spaces are lumped together.
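
If you do want the explicit for loop from the question, here is a minimal base R sketch. It assumes the essay columns are named essay0, essay1, and so on, and it uses a small made-up data frame, cupid_demo, in place of the real cupid data:

```r
# Toy stand-in for the cupid data frame (hypothetical values, two essay columns only)
cupid_demo <- data.frame(
  essay0 = c("i work in a library", "", "hello world"),
  essay1 = c("one", "two words", "three short words"),
  stringsAsFactors = FALSE
)

# Build the column names to loop over; use 0:9 for the real data
essay_cols <- paste0("essay", 0:1)

for (col in essay_cols) {
  # [[ ]] accepts a column name stored in a variable, which $ does not
  words <- strsplit(cupid_demo[[col]], " +")
  # lengths() gives the number of pieces per row
  cupid_demo[[paste0(col, "_num")]] <- lengths(words)
}

cupid_demo$essay0_num
# [1] 5 0 2
cupid_demo$essay1_num
# [1] 1 2 3
```

Note that strsplit() returns zero pieces for an empty string, so empty essays get 0 here, while the str_count() + 1 version above reports 1 for them.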
