Home > Net >  How do I average the success metric of each word in message for over a hundred messages
How do I average the success metric of each word in message for over a hundred messages

Time:04-29

I'm hoping to use google sheets to determine the profitability of each word in around 100 10-25 word messages.

Right now I'm starting with something along the lines of this:

Message Success Word Count
Cats are gross 150% 3
I love Dogs 250% 3
I love Dogs and Cats are gross 450% 6

To query the word usage I'm using the formula:

=ArrayFormula(query(Flatten(split(trim(REGEXREPLACE(LOWER(Message1:Message), "[!?,.:;-]"," "))," ")), "SELECT Col1, COUNT(Col1) Group By Col1",0))

This gets me to here:

Word Count
cats 2
are 2
gross 2
i 2
love 2
dogs 2
and 1

And I'm using this formula to assign the Success% to each word using the word count as its basis for splitting the words and copying down:

=IFERROR(IF(COUNTIF(Value$1:Value2,Value2)<=VLOOKUP(Value2,$Success$2:$Word_Count$100,6,FALSE),Value2,IF(OFFSET($Success$2,MATCH(Value2,$Success$2:$Success$100,0),0)="","",OFFSET($Success$2,MATCH(Value2,$Success$2:$Success$100,0),0))),"")

This gets me here:

Word Value
cats 150%
are 150%
gross 150%
I 250%
love 250%
dogs 250%
I 450%
love 450%
dogs 450%
and 450%
cats 450%
are 450%
gross 450%

What formula could I use to combine these data sets to get me to here:

Word Count Success AVG
cats 2 300%
are 2 300%
gross 2 300%
I 2 350%
love 2 350%
dogs 2 350%
and 1 450%

CodePudding user response:

use:

=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(SPLIT(REGEXREPLACE(LOWER(A2:A), 
 "[!?,.:;-]", " "), " ")&"×"&B2:B), "×"), 
 "select Col1,count(Col1),avg(Col2)
  where Col2 is not null 
  group by Col1 
  label Col1'Word',count(Col1)'Count',avg(Col2)'Success AVG'"))

enter image description here


enter image description here

  • Related