Home > Software design >  Combine a SPLIT formula with a formula that chooses N unique words from the SPLIT outcome
Combine a SPLIT formula with a formula that chooses N unique words from the SPLIT outcome

Time:07-19

I got a sentence which I SPLIT into words without the punctuation. Next I want to choose three random, but unique words from that split. I use the formula as seen in cell I2. Is it possible to combine both the SPLIT formula and the other formula into one (big) formula?

enter image description here

SPLIT formula:

=ARRAYFORMULA(REGEXREPLACE(SPLIT(A2," "),"[,.?!]",""))

Formula to choose three random unique words:

=ARRAYFORMULA(ARRAY_CONSTRAIN(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN( 
ROW(B2:G2)&"×"&RANDARRAY(ROWS(B2:G2), COLUMNS(B2:G2))&"×"&B2:G2), "×"),
"select max(Col3) group by Col2 pivot Col1"),
 "offset 1", 0),,9^9)), " "), 9^9, 3))

CodePudding user response:

I understand that you want to get 3 random unique words from a string.
in what follows i am going to demonstrate how get truly random words when the sheet is modified plus handling exceptions, ponctuation and more, like this take a look at Top n unique random words

Paste this formula in B2.

=ArrayFormula(IF(A2="",,JOIN(" ,",TRANSPOSE(QUERY(SORTN({RANDARRAY(COUNTA(UNIQUE(SPLIT(TRIM(REGEXREPLACE(A2,"[[:punct:]]",""))," ")))),TRANSPOSE(UNIQUE(SPLIT(TRIM(REGEXREPLACE(A2,"[[:punct:]]",""))," ")))},3,,1,RANDBETWEEN(0,1))," Select Col2 ")))))

Explanation: Pending...

1 - We need UNIQUE(SPLIT(TRIM(REGEXREPLACE(A2,"[[:punct:]]",""))," ")) to rplace punctuation with nothing "" and TRIM spaces in start, tailing and additional spaces, SPLIT the string with " " as a delimiter, and then get the UNIQUE columns resulted from SPLIT, which is
He|is|cunning|as|a|fox and TRANSPOSE the output like this TRANSPOSE(UNIQUE([Output])
to join it with random numbers column later.

2 - we need an Array {} that contain He|is|cunning|as|a|fox and column with random numbers , like this { RANDARRAY , He|is|cunning|as|a|fox }.

To get the column with random numbers: RANDARRAY(COUNTA(UNIQUE(SPLIT(TRIM(REGEXREPLACE(A2,"[[:punct:]]",""))," "))))

RANDARRAY takes [columns] set to 1 and [rows] set to COUNTA(UNIQUE(SPLIT(TRIM(REGEXREPLACE(A2,"[[:punct:]]",""))," "))) which is the COUNTA( He|is|cunning|as|a|fox )

3 - Now we have to SORTN the output with [n] set to 3 meaning 3 words in this case
"to get N unique random words" just replace [n] with a cell refrence.
[sort_column] set to 1 the column of random number and [is_ascending] set to RANDBETWEEN(0,1) to get either 0 or 1, [is_ascending] 0 means Flase it sort's Descending , 1 means True sort ascending.

4 - QUERY " Select Col2 ", the randomized column of words.

5 - TRANSPOSE the column.

6 - JOIN with " ,"

CodePudding user response:

After researching for a while I came across the use of array_constrain to pick a fixed number of results and sort with randarray to randomize the outcome.

=ARRAY_CONSTRAIN(
transpose(SORT(transpose(ARRAYFORMULA(REGEXREPLACE(SPLIT(A2," "),"[,.?!]",""))),
randarray(COUNTA(ARRAYFORMULA(REGEXREPLACE(SPLIT(A2," "),"[,.?!]","")))),true))
,1,3)

If anyone happens to have a better solution to this, I would gladly see a response.

  • Related