How to get the count of the given word in a column? using postgresql-CodePudding

I am new to PostgreSql, I am having a table like below,

I need to get output as count of the given word (test) in text_col like below table using ts_vector function, Please give me some suggestion, Thanks !.

CodePudding user response：

What about counting spaces?

The following takes the length of the original string, calculates the length of the string removing the spaces and then sums 1

with first_sel as (
    select 'Ciao io sono Ugo' mytext union all
    select 'Ciao io non sono Ugo' 
)
select mytext, length(mytext) - length(replace(mytext, ' ',''))   1 nr_words from first_sel;

result

        mytext        | nr_words
---------------------- ----------
 Ciao io sono Ugo     |        4
 Ciao io non sono Ugo |        5
(2 rows)

If it's not only about spaces you can try removing the needed characters with regex_replace

CodePudding user response：

SELECT filename, cardinality(string_to_array(text_col,'test'))-1 
FROM documents;

dbfiddle

CodePudding user response：

Using tsvector is like (globally)

select * from ts_stat('select to_tsvector(text_col) from document_table')
where word='test';

where ndoc is number of rows (documents) and nentry number of occurrences of the word

Or by row

select file_name, text_col, to_tsvector(text_col), 
substring(to_tsvector(text_col)::text, '''test'':(\d (,\d )*)') positions_in_text,
length(regexp_replace(substring(to_tsvector(text_col)::text, '''test'':(\d (,\d )*)'), '[^,]', '', 'g')) 1 occurrences 
from document_table
where text_col @@ to_tsquery('test');