Home > Software design >  How to get the count of the given word in a column? using postgresql
How to get the count of the given word in a column? using postgresql

Time:09-06

I am new to PostgreSql, I am having a table like below,

enter image description here

I need to get output as count of the given word (test) in text_col like below table using ts_vector function, Please give me some suggestion, Thanks !.

enter image description here

CodePudding user response:

What about counting spaces?

The following takes the length of the original string, calculates the length of the string removing the spaces and then sums 1

with first_sel as (
    select 'Ciao io sono Ugo' mytext union all
    select 'Ciao io non sono Ugo' 
)
select mytext, length(mytext) - length(replace(mytext, ' ',''))   1 nr_words from first_sel;

result

        mytext        | nr_words
---------------------- ----------
 Ciao io sono Ugo     |        4
 Ciao io non sono Ugo |        5
(2 rows)

If it's not only about spaces you can try removing the needed characters with regex_replace

CodePudding user response:

SELECT filename, cardinality(string_to_array(text_col,'test'))-1 
FROM documents;

dbfiddle

CodePudding user response:

Using tsvector is like (globally)

select * from ts_stat('select to_tsvector(text_col) from document_table')
where word='test';

where ndoc is number of rows (documents) and nentry number of occurrences of the word

Or by row

select file_name, text_col, to_tsvector(text_col), 
substring(to_tsvector(text_col)::text, '''test'':(\d (,\d )*)') positions_in_text,
length(regexp_replace(substring(to_tsvector(text_col)::text, '''test'':(\d (,\d )*)'), '[^,]', '', 'g')) 1 occurrences 
from document_table
where text_col @@ to_tsquery('test');
  • Related