I am new to PostgreSql, I am having a table like below,
I need to get output as count of the given word (test) in text_col like below table using ts_vector function, Please give me some suggestion, Thanks !.
CodePudding user response:
What about counting spaces?
The following takes the length of the original string, calculates the length of the string removing the spaces and then sums 1
with first_sel as (
select 'Ciao io sono Ugo' mytext union all
select 'Ciao io non sono Ugo'
)
select mytext, length(mytext) - length(replace(mytext, ' ','')) 1 nr_words from first_sel;
result
mytext | nr_words
---------------------- ----------
Ciao io sono Ugo | 4
Ciao io non sono Ugo | 5
(2 rows)
If it's not only about spaces you can try removing the needed characters with regex_replace
CodePudding user response:
SELECT filename, cardinality(string_to_array(text_col,'test'))-1
FROM documents;
CodePudding user response:
Using tsvector is like (globally)
select * from ts_stat('select to_tsvector(text_col) from document_table')
where word='test';
where ndoc is number of rows (documents) and nentry number of occurrences of the word
Or by row
select file_name, text_col, to_tsvector(text_col),
substring(to_tsvector(text_col)::text, '''test'':(\d (,\d )*)') positions_in_text,
length(regexp_replace(substring(to_tsvector(text_col)::text, '''test'':(\d (,\d )*)'), '[^,]', '', 'g')) 1 occurrences
from document_table
where text_col @@ to_tsquery('test');