I'm hoping to use google sheets to determine the profitability of each word in around 100 10-25 word messages.
Right now I'm starting with something along the lines of this:
Message | Success | Word Count |
---|---|---|
Cats are gross | 150% | 3 |
I love Dogs | 250% | 3 |
I love Dogs and Cats are gross | 450% | 6 |
To query the word usage I'm using the formula:
=ArrayFormula(query(Flatten(split(trim(REGEXREPLACE(LOWER(Message1:Message), "[!?,.:;-]"," "))," ")), "SELECT Col1, COUNT(Col1) Group By Col1",0))
This gets me to here:
Word | Count |
---|---|
cats | 2 |
are | 2 |
gross | 2 |
i | 2 |
love | 2 |
dogs | 2 |
and | 1 |
And I'm using this formula to assign the Success% to each word using the word count as its basis for splitting the words and copying down:
=IFERROR(IF(COUNTIF(Value$1:Value2,Value2)<=VLOOKUP(Value2,$Success$2:$Word_Count$100,6,FALSE),Value2,IF(OFFSET($Success$2,MATCH(Value2,$Success$2:$Success$100,0),0)="","",OFFSET($Success$2,MATCH(Value2,$Success$2:$Success$100,0),0))),"")
This gets me here:
Word | Value |
---|---|
cats | 150% |
are | 150% |
gross | 150% |
I | 250% |
love | 250% |
dogs | 250% |
I | 450% |
love | 450% |
dogs | 450% |
and | 450% |
cats | 450% |
are | 450% |
gross | 450% |
What formula could I use to combine these data sets to get me to here:
Word | Count | Success AVG |
---|---|---|
cats | 2 | 300% |
are | 2 | 300% |
gross | 2 | 300% |
I | 2 | 350% |
love | 2 | 350% |
dogs | 2 | 350% |
and | 1 | 450% |
CodePudding user response:
use:
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(SPLIT(REGEXREPLACE(LOWER(A2:A),
"[!?,.:;-]", " "), " ")&"×"&B2:B), "×"),
"select Col1,count(Col1),avg(Col2)
where Col2 is not null
group by Col1
label Col1'Word',count(Col1)'Count',avg(Col2)'Success AVG'"))