PostgreSQL Text Search Functions and Operators

Updated 2021-08-17 15:07:54

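Table 9.41, Table 9.42, and Table 9.43 summarize the functions and operators provided for full-text searching. See Chapter 12 for a detailed explanation of PostgreSQL's text search facility.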

Table 9.41. Text Search Operators

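| Operator | Description | Example(s) |
|---|---|---|
| `tsvector` `@@` `tsquery` → `boolean`; `tsquery` `@@` `tsvector` → `boolean` | Does the `tsvector` match the `tsquery`? (The arguments can be given in either order.) | `to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')` → `t` |
| `text` `@@` `tsquery` → `boolean` | Does the text string, after an implicit call to `to_tsvector()`, match the `tsquery`? | `'fat cats ate rats' @@ to_tsquery('cat & rat')` → `t` |
| `tsvector` `@@@` `tsquery` → `boolean`; `tsquery` `@@@` `tsvector` → `boolean` | A deprecated synonym for `@@`. | `to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat')` → `t` |
| `tsvector` `\|\|` `tsvector` → `tsvector` | Concatenates two `tsvector`s. If both inputs contain lexeme positions, the positions of the second input are adjusted accordingly. | `'a:1 b:2'::tsvector \|\| 'c:1 d:2 b:3'::tsvector` → `'a':1 'b':2,5 'c':3 'd':4` |
| `tsquery` `&&` `tsquery` → `tsquery` | ANDs two `tsquery`s together, producing a query that matches documents that match both input queries. | `'fat \| rat'::tsquery && 'cat'::tsquery` → `( 'fat' \| 'rat' ) & 'cat'` |
| `tsquery` `\|\|` `tsquery` → `tsquery` | ORs two `tsquery`s together, producing a query that matches documents that match either input query. | `'fat \| rat'::tsquery \|\| 'cat'::tsquery` → `'fat' \| 'rat' \| 'cat'` |
| `!!` `tsquery` → `tsquery` | Negates a `tsquery`, producing a query that matches documents that do not match the input query. | `!! 'cat'::tsquery` → `!'cat'` |
| `tsquery` `<->` `tsquery` → `tsquery` | Constructs a phrase query, which matches if the two input queries match at successive lexemes. | `to_tsquery('fat') <-> to_tsquery('rat')` → `'fat' <-> 'rat'` |
| `tsquery` `@>` `tsquery` → `boolean` | Does the first `tsquery` contain the second? (This considers only whether all the lexemes appearing in one query appear in the other, ignoring the combining operators.) | `'cat'::tsquery @> 'cat & rat'::tsquery` → `f` |
| `tsquery` `<@` `tsquery` → `boolean` | Is the first `tsquery` contained in the second? (This considers only whether all the lexemes appearing in one query appear in the other, ignoring the combining operators.) | `'cat'::tsquery <@ 'cat & rat'::tsquery` → `t`; `'cat'::tsquery <@ '!cat & rat'::tsquery` → `t` |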

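In addition to these specialized operators, the usual comparison operators shown in Table 9.1 are available for the types `tsvector` and `tsquery`. They are not very useful for text searching, but they allow, for example, unique indexes to be built on columns of these types.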
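The operators in Table 9.41 are typically used in a `WHERE` clause against a table of documents. The following is a minimal sketch under stated assumptions: the table `docs`, its columns, and the index name `docs_body_fts_idx` are hypothetical and do not appear in the original text.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body text NOT NULL
);

INSERT INTO docs (body) VALUES
    ('fat cats ate rats'),
    ('the dog chased a ball');

-- Expression GIN index so that @@ searches on this expression can use an index.
CREATE INDEX docs_body_fts_idx
    ON docs USING gin (to_tsvector('english', body));

-- Rows whose text contains both lexemes. The configuration is named
-- explicitly so the expression matches the one in the index definition.
SELECT id, body
FROM docs
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'cat & rat');

-- Build the query from parts: AND two queries with &&, negate one with !!.
SELECT id, body
FROM docs
WHERE to_tsvector('english', body)
      @@ (to_tsquery('english', 'cat') && (!! to_tsquery('english', 'dog')));
```

Naming the configuration in both the index and the query keeps the two expressions identical, which is what lets the planner consider the index.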

Table 9.42. Text Search Functions

| Function | Description | Example(s) |
|---|---|---|
| `array_to_tsvector` ( `text[]` ) → `tsvector` | Converts an array of lexemes to a `tsvector`. The given strings are used as-is without further processing. | `array_to_tsvector('{fat,cat,rat}'::text[])` → `'cat' 'fat' 'rat'` |
| `get_current_ts_config` ( ) → `regconfig` | Returns the OID of the current default text search configuration (as set by `default_text_search_config`). | `get_current_ts_config()` → `english` |
| `length` ( `tsvector` ) → `integer` | Returns the number of lexemes in the `tsvector`. | `length('fat:2,4 cat:3 rat:5A'::tsvector)` → `3` |
| `numnode` ( `tsquery` ) → `integer` | Returns the number of lexemes plus operators in the `tsquery`. | `numnode('(fat & rat) \| cat'::tsquery)` → `5` |
| `plainto_tsquery` ( [ `config` `regconfig`, ] `query` `text` ) → `tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. Any punctuation in the string is ignored (it does not determine query operators). The resulting query matches documents containing all non-stopwords in the text. | `plainto_tsquery('english', 'The Fat Rats')` → `'fat' & 'rat'` |
| `phraseto_tsquery` ( [ `config` `regconfig`, ] `query` `text` ) → `tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. Any punctuation in the string is ignored (it does not determine query operators). The resulting query matches phrases containing all non-stopwords in the text. | `phraseto_tsquery('english', 'The Fat Rats')` → `'fat' <-> 'rat'`; `phraseto_tsquery('english', 'The Cat and Rats')` → `'cat' <2> 'rat'` |
| `websearch_to_tsquery` ( [ `config` `regconfig`, ] `query` `text` ) → `tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. Quoted word sequences are converted to phrase tests. The word "or" is understood as producing an OR operator, and a dash produces a NOT operator; other punctuation is ignored. This approximates the behavior of some common web search tools. | `websearch_to_tsquery('english', '"fat rat" or cat dog')` → `'fat' <-> 'rat' \| 'cat' & 'dog'` |
| `querytree` ( `tsquery` ) → `text` | Produces a representation of the indexable portion of a `tsquery`. A result that is empty or just `T` indicates a non-indexable query. | `querytree('foo & ! bar'::tsquery)` → `'foo'` |
| `setweight` ( `vector` `tsvector`, `weight` `"char"` ) → `tsvector` | Assigns the specified `weight` to each element of the `vector`. | `setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')` → `'cat':3A 'fat':2A,4A 'rat':5A` |
| `setweight` ( `vector` `tsvector`, `weight` `"char"`, `lexemes` `text[]` ) → `tsvector` | Assigns the specified `weight` to the elements of the `vector` that are listed in `lexemes`. | `setweight('fat:2,4 cat:3 rat:5,6B'::tsvector, 'A', '{cat,rat}')` → `'cat':3A 'fat':2,4 'rat':5A,6A` |
| `strip` ( `tsvector` ) → `tsvector` | Removes positions and weights from the `tsvector`. | `strip('fat:2,4 cat:3 rat:5A'::tsvector)` → `'cat' 'fat' 'rat'` |
| `to_tsquery` ( [ `config` `regconfig`, ] `query` `text` ) → `tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. The words must be combined by valid `tsquery` operators. | `to_tsquery('english', 'The & Fat & Rats')` → `'fat' & 'rat'` |
| `to_tsvector` ( [ `config` `regconfig`, ] `document` `text` ) → `tsvector` | Converts text to a `tsvector`, normalizing words according to the specified or default configuration. Position information is included in the result. | `to_tsvector('english', 'The Fat Rats')` → `'fat':2 'rat':3` |
| `to_tsvector` ( [ `config` `regconfig`, ] `document` `json` ) → `tsvector`; `to_tsvector` ( [ `config` `regconfig`, ] `document` `jsonb` ) → `tsvector` | Converts each string value in the JSON document to a `tsvector`, normalizing words according to the specified or default configuration. The results are then concatenated in document order to produce the output. Position information is generated as though one stopword exists between each pair of string values. (Beware that the "document order" of the fields of a JSON object is implementation-dependent when the input is `jsonb`; observe the difference in the examples.) | `to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::json)` → `'dog':5 'fat':2 'rat':3`; `to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::jsonb)` → `'dog':1 'fat':4 'rat':5` |
| `json_to_tsvector` ( [ `config` `regconfig`, ] `document` `json`, `filter` `jsonb` ) → `tsvector`; `jsonb_to_tsvector` ( [ `config` `regconfig`, ] `document` `jsonb`, `filter` `jsonb` ) → `tsvector` | Selects each item in the JSON document that is requested by the `filter` and converts each one to a `tsvector`, normalizing words according to the specified or default configuration. The results are then concatenated in document order to produce the output. Position information is generated as though one stopword exists between each pair of selected items. (Beware that the "document order" of the fields of a JSON object is implementation-dependent when the input is `jsonb`.) The `filter` must be a `jsonb` array containing zero or more of these keywords: `"string"` (to include all string values), `"numeric"` (to include all numeric values), `"boolean"` (to include all boolean values), `"key"` (to include all keys), or `"all"` (to include all of the above). As a special case, the `filter` can also be a simple JSON value that is one of these keywords. | `json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]')` → `'123':5 'fat':2 'rat':3`; `json_to_tsvector('english', '{"cat": "The Fat Rats", "dog": 123}'::json, '"all"')` → `'123':9 'cat':1 'dog':7 'fat':4 'rat':5` |
| `ts_delete` ( `vector` `tsvector`, `lexeme` `text` ) → `tsvector` | Removes any occurrence of the given `lexeme` from the `vector`. | `ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')` → `'cat':3 'rat':5A` |
| `ts_delete` ( `vector` `tsvector`, `lexemes` `text[]` ) → `tsvector` | Removes any occurrences of the lexemes in `lexemes` from the `vector`. | `ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])` → `'cat':3` |
| `ts_filter` ( `vector` `tsvector`, `weights` `"char"[]` ) → `tsvector` | Selects only elements with the given `weights` from the `vector`. | `ts_filter('fat:2,4 cat:3b,7c rat:5A'::tsvector, '{a,b}')` → `'cat':3B 'rat':5A` |
| `ts_headline` ( [ `config` `regconfig`, ] `document` `text`, `query` `tsquery` [, `options` `text` ] ) → `text` | Displays, in an abbreviated form, the matches for the `query` in the `document`, which must be raw text, not a `tsvector`. Words in the document are normalized according to the specified or default configuration before matching to the query. Use of this function is discussed in Section 12.3.4, which also describes the available `options`. | `ts_headline('The fat cat ate the rat.', 'cat')` → `The fat <b>cat</b> ate the rat.` |
| `ts_headline` ( [ `config` `regconfig`, ] `document` `json`, `query` `tsquery` [, `options` `text` ] ) → `text`; `ts_headline` ( [ `config` `regconfig`, ] `document` `jsonb`, `query` `tsquery` [, `options` `text` ] ) → `text` | Displays, in an abbreviated form, the matches for the `query` that occur in string values within the JSON `document`. See Section 12.3.4 for more details. | `ts_headline('{"cat":"raining cats and dogs"}'::jsonb, 'cat')` → `{"cat": "raining <b>cats</b> and dogs"}` |
| `ts_rank` ( [ `weights` `real[]`, ] `vector` `tsvector`, `query` `tsquery` [, `normalization` `integer` ] ) → `real` | Computes a score showing how well the `vector` matches the `query`. See Section 12.3.3 for details. | `ts_rank(to_tsvector('raining cats and dogs'), 'cat')` → `0.06079271` |
| `ts_rank_cd` ( [ `weights` `real[]`, ] `vector` `tsvector`, `query` `tsquery` [, `normalization` `integer` ] ) → `real` | Computes a score showing how well the `vector` matches the `query`, using a cover density algorithm. See Section 12.3.3 for details. | `ts_rank_cd(to_tsvector('raining cats and dogs'), 'cat')` → `0.1` |
| `ts_rewrite` ( `query` `tsquery`, `target` `tsquery`, `substitute` `tsquery` ) → `tsquery` | Replaces occurrences of `target` with `substitute` within the `query`. See Section 12.4.2.1 for details. | `ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo\|bar'::tsquery)` → `'b' & ( 'foo' \| 'bar' )` |
| `ts_rewrite` ( `query` `tsquery`, `select` `text` ) → `tsquery` | Replaces portions of the `query` according to targets and substitutes obtained by executing a `SELECT` command. See Section 12.4.2.1 for details. | `SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')` → `'b' & ( 'foo' \| 'bar' )` |
| `tsquery_phrase` ( `query1` `tsquery`, `query2` `tsquery` ) → `tsquery` | Constructs a phrase query that searches for matches of `query1` and `query2` at successive lexemes (same as the `<->` operator). | `tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))` → `'fat' <-> 'cat'` |
| `tsquery_phrase` ( `query1` `tsquery`, `query2` `tsquery`, `distance` `integer` ) → `tsquery` | Constructs a phrase query that searches for matches of `query1` and `query2` that occur exactly `distance` lexemes apart. | `tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)` → `'fat' <10> 'cat'` |
| `tsvector_to_array` ( `tsvector` ) → `text[]` | Converts a `tsvector` to an array of lexemes. | `tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector)` → `{cat,fat,rat}` |
| `unnest` ( `tsvector` ) → `setof record` ( `lexeme` `text`, `positions` `smallint[]`, `weights` `text` ) | Expands a `tsvector` into a set of rows, one per lexeme. | `select * from unnest('cat:3 fat:2,4 rat:5A'::tsvector)` → three rows with columns `lexeme`, `positions`, `weights`: `cat`, `{3}`, `{D}`; `fat`, `{2,4}`, `{D,D}`; `rat`, `{5}`, `{A}` |
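To see how the query-building functions in Table 9.42 differ, it can help to run them side by side and then feed one query into the ranking and headline functions. This is an illustrative sketch only; the column aliases and the inline `doc` row are assumptions made for the example.

```sql
-- Three ways to turn user text into a tsquery; Table 9.42 lists the
-- result each call produces for these inputs.
SELECT plainto_tsquery('english', 'The Fat Rats')                  AS plain_query,
       phraseto_tsquery('english', 'The Fat Rats')                 AS phrase_query,
       websearch_to_tsquery('english', '"fat rat" or cat dog')     AS web_query;

-- Rank a document against a query and show a highlighted excerpt.
-- The single-row "doc" relation is a stand-in for a real table.
WITH doc(body) AS (
    VALUES ('The fat cat ate the rat.')
)
SELECT ts_rank(to_tsvector('english', body), q)    AS rank,
       ts_rank_cd(to_tsvector('english', body), q) AS rank_cd,
       ts_headline('english', body, q)             AS headline
FROM doc,
     to_tsquery('english', 'cat') AS q;
```

Passing the configuration explicitly to every call keeps the document vector, the query, and the headline normalized the same way.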

Note

All the text search functions that accept an optional `regconfig` argument use the configuration specified by `default_text_search_config` when that argument is omitted.
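A short session can make the effect of the default configuration visible. The sketch below assumes the built-in `simple` and `english` configurations are installed, which is the case in a standard installation; the choice of `simple` as the temporary default is only for illustration.

```sql
-- Inspect the current default in two equivalent ways.
SHOW default_text_search_config;
SELECT get_current_ts_config();

-- Temporarily switch the session default to the built-in "simple"
-- configuration, which performs no stemming.
SET default_text_search_config = 'pg_catalog.simple';

-- No regconfig argument: the session default ("simple") is used.
SELECT to_tsvector('The Fat Rats');

-- Explicit regconfig argument: the default is not consulted.
-- Per Table 9.42 this yields 'fat':2 'rat':3.
SELECT to_tsvector('english', 'The Fat Rats');

-- Restore the previous default.
RESET default_text_search_config;
```

Because an omitted argument silently follows the session setting, queries that must match an expression index are usually written with the configuration spelled out.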

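The functions in Table 9.43 are listed separately because they are not usually used in everyday text searching operations. They are primarily helpful for developing and debugging new text search configurations.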

Table 9.43. Text Search Debugging Functions

| Function | Description | Example(s) |
|---|---|---|
| `ts_debug` ( [ `config` `regconfig`, ] `document` `text` ) → `setof record` ( `alias` `text`, `description` `text`, `token` `text`, `dictionaries` `regdictionary[]`, `dictionary` `regdictionary`, `lexemes` `text[]` ) | Extracts and normalizes tokens from the `document` according to the specified or default text search configuration, and returns information about how each token was processed. See Section 12.8.1 for details. | `ts_debug('english', 'The Brightest supernovaes')` → `(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...` |
| `ts_lexize` ( `dict` `regdictionary`, `token` `text` ) → `text[]` | Returns an array of replacement lexemes if the input token is known to the dictionary, an empty array if the token is known to the dictionary but is a stop word, or NULL if it is not a known word. See Section 12.8.3 for details. | `ts_lexize('english_stem', 'stars')` → `{star}` |
| `ts_parse` ( `parser_name` `text`, `document` `text` ) → `setof record` ( `tokid` `integer`, `token` `text` ) | Extracts tokens from the `document` using the named parser. See Section 12.8.2 for details. | `ts_parse('default', 'foo - bar')` → `(1,foo) ...` |
| `ts_parse` ( `parser_oid` `oid`, `document` `text` ) → `setof record` ( `tokid` `integer`, `token` `text` ) | Extracts tokens from the `document` using a parser specified by OID. See Section 12.8.2 for details. | `ts_parse(3722, 'foo - bar')` → `(1,foo) ...` |
| `ts_token_type` ( `parser_name` `text` ) → `setof record` ( `tokid` `integer`, `alias` `text`, `description` `text` ) | Returns a table that describes each type of token the named parser can recognize. See Section 12.8.2 for details. | `ts_token_type('default')` → `(1,asciiword,"Word, all ASCII") ...` |
| `ts_token_type` ( `parser_oid` `oid` ) → `setof record` ( `tokid` `integer`, `alias` `text`, `description` `text` ) | Returns a table that describes each type of token a parser specified by OID can recognize. See Section 12.8.2 for details. | `ts_token_type(3722)` → `(1,asciiword,"Word, all ASCII") ...` |
| `ts_stat` ( `sqlquery` `text` [, `weights` `text` ] ) → `setof record` ( `word` `text`, `ndoc` `integer`, `nentry` `integer` ) | Executes the `sqlquery`, which must return a single `tsvector` column, and returns statistics about each distinct lexeme contained in the data. See Section 12.4.4 for details. | `ts_stat('SELECT vector FROM apod')` → `(foo,10,15) ...` |
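The debugging functions return sets of rows, so they are usually easier to read when called in the `FROM` clause. The following sketch reuses the inputs from Table 9.43; the selected columns and the `LIMIT` are choices made for this example only.

```sql
-- How does the "english" configuration treat each token?
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english', 'The Brightest supernovaes');

-- Ask a single dictionary about individual tokens.
SELECT ts_lexize('english_stem', 'stars');  -- {star}, per Table 9.43
SELECT ts_lexize('english_stem', 'a');      -- a stop word: empty array

-- Tokenize without any dictionary processing.
SELECT tokid, token
FROM ts_parse('default', 'foo - bar');

-- List the first few token types the default parser can recognize.
SELECT tokid, alias, description
FROM ts_token_type('default')
ORDER BY tokid
LIMIT 5;
```

Comparing the `ts_parse` output with the `ts_debug` output for the same text separates problems in tokenization from problems in dictionary lookup.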
