PostgreSQL: Documentation: 18: 9.4. String Functions and Operators


9.4. String Functions and Operators

9.4.1. format

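This section describes functions and operators for examining and manipulating string values. Strings in this context include values of the types character, character varying, and text. Except where noted, these functions and operators are declared to accept and return type text. They will interchangeably accept character varying arguments. Values of type character are converted to text before the function or operator is applied, which strips any trailing spaces in the character value.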
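As a small illustration of the trailing-space rule, the following sketch compares a text value with a character(6) value holding the same literal. The column aliases are illustrative only, and the explicit cast to text is written out here to make the conversion visible.

```sql
-- The literal 'abc   ' has three trailing spaces (six characters in total).
SELECT length('abc   '::text)                 AS text_len,        -- 6
       length(('abc   '::character(6))::text) AS converted_len,   -- 3: trailing spaces stripped
       octet_length('abc   '::character(6))   AS character_bytes; -- 6: accepts character directly
```

The last column uses the `octet_length ( character )` variant, which does not go through the conversion to text and therefore still sees the padding.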

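SQL defines some string functions that use key words, rather than commas, to separate arguments. Details are in Table 9.9. PostgreSQL also provides versions of these functions that use the regular function invocation syntax (see Table 9.10).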
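To make the two syntaxes concrete, here is a short sketch that pairs each key-word form with its regular-call counterpart from the tables in this section. The column aliases are illustrative only.

```sql
SELECT substring('Thomas' from 2 for 3) AS kw_substring,  -- hom
       substr('Thomas', 2, 3)           AS fn_substr,     -- hom
       position('om' in 'Thomas')       AS kw_position,   -- 3
       strpos('Thomas', 'om')           AS fn_strpos,     -- 3 (note the reversed argument order)
       trim(both 'xyz' from 'yxTomxx')  AS kw_trim,       -- Tom
       btrim('yxTomxx', 'xyz')          AS fn_btrim;      -- Tom
```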

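Note

The string concatenation operator (||) will accept non-string input, so long as at least one input is of string type, as shown in Table 9.9. For other cases, inserting an explicit coercion to text can be used to have non-string input accepted.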
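The following sketch shows the three situations the note describes: one string operand, no string operand, and an array operand. The literals and aliases are illustrative only.

```sql
-- 42 || 7 without casts is rejected, because neither input is of a string type.
SELECT 'Value: ' || 42                AS mixed,       -- Value: 42
       42::text || 7::text            AS both_cast,   -- 427
       'Items: ' || ARRAY[1, 2]::text AS array_cast;  -- Items: {1,2}
```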

Table 9.9. SQL String Functions and Operators

| Function/Operator | Description | Example(s) |
| --- | --- | --- |
| `text \|\| text → text` | Concatenates the two strings. | `'Post' \|\| 'greSQL'` → `PostgreSQL` |
| `text \|\| anynonarray → text`; `anynonarray \|\| text → text` | Converts the non-string input to text, then concatenates the two strings. (The non-string input cannot be of an array type, because that would create ambiguity with the array `\|\|` operators. To concatenate an array's text equivalent, cast it to `text` explicitly.) | `'Value: ' \|\| 42` → `Value: 42` |
| `btrim ( string text [, characters text ] ) → text` | Removes the longest string containing only characters in `characters` (a space by default) from the start and end of `string`. | `btrim('xyxtrimyyx', 'xyz')` → `trim` |
| `text IS [NOT] [form] NORMALIZED → boolean` | Checks whether the string is in the specified Unicode normalization form. The optional `form` key word specifies the form: `NFC` (the default), `NFD`, `NFKC`, or `NFKD`. This expression can only be used when the server encoding is `UTF8`. Note that checking for normalization using this expression is often faster than normalizing possibly already normalized strings. | `U&'\0061\0308bc' IS NFD NORMALIZED` → `t` |
| `bit_length ( text ) → integer` | Returns the number of bits in the string (8 times the `octet_length`). | `bit_length('jose')` → `32` |
| `char_length ( text ) → integer`; `character_length ( text ) → integer` | Returns the number of characters in the string. | `char_length('josé')` → `4` |
| `lower ( text ) → text` | Converts the string to all lower case, according to the rules of the database's locale. | `lower('TOM')` → `tom` |
| `lpad ( string text, length integer [, fill text ] ) → text` | Extends `string` to length `length` by prepending the characters `fill` (a space by default). If `string` is already longer than `length`, it is truncated (on the right). | `lpad('hi', 5, 'xy')` → `xyxhi` |
| `ltrim ( string text [, characters text ] ) → text` | Removes the longest string containing only characters in `characters` (a space by default) from the start of `string`. | `ltrim('zzzytest', 'xyz')` → `test` |
| `normalize ( text [, form ] ) → text` | Converts the string to the specified Unicode normalization form. The optional `form` key word specifies the form: `NFC` (the default), `NFD`, `NFKC`, or `NFKD`. This function can only be used when the server encoding is `UTF8`. | `normalize(U&'\0061\0308bc', NFC)` → `U&'\00E4bc'` |
| `octet_length ( text ) → integer` | Returns the number of bytes in the string. | `octet_length('josé')` → `5` (if the server encoding is UTF8) |
| `octet_length ( character ) → integer` | Returns the number of bytes in the string. Since this version of the function accepts type `character` directly, it does not strip trailing spaces. | `octet_length('abc '::character(4))` → `4` |
| `overlay ( string text PLACING newsubstring text FROM start integer [ FOR count integer ] ) → text` | Replaces the substring of `string` that starts at the `start`'th character and extends for `count` characters with `newsubstring`. If `count` is omitted, it defaults to the length of `newsubstring`. | `overlay('Txxxxas' placing 'hom' from 2 for 4)` → `Thomas` |
| `position ( substring text IN string text ) → integer` | Returns the first starting index of the specified `substring` within `string`, or zero if it is not present. | `position('om' in 'Thomas')` → `3` |
| `rpad ( string text, length integer [, fill text ] ) → text` | Extends `string` to length `length` by appending the characters `fill` (a space by default). If `string` is already longer than `length`, it is truncated. | `rpad('hi', 5, 'xy')` → `hixyx` |
| `rtrim ( string text [, characters text ] ) → text` | Removes the longest string containing only characters in `characters` (a space by default) from the end of `string`. | `rtrim('testxxzx', 'xyz')` → `test` |
| `substring ( string text [ FROM start integer ] [ FOR count integer ] ) → text` | Extracts the substring of `string` starting at the `start`'th character if that is specified, and stopping after `count` characters if that is specified. Provide at least one of `start` and `count`. | `substring('Thomas' from 2 for 3)` → `hom`; `substring('Thomas' from 3)` → `omas`; `substring('Thomas' for 2)` → `Th` |
| `substring ( string text FROM pattern text ) → text` | Extracts the first substring matching the POSIX regular expression; see Section 9.7.3. | `substring('Thomas' from '...$')` → `mas` |
| `substring ( string text SIMILAR pattern text ESCAPE escape text ) → text`; `substring ( string text FROM pattern text FOR escape text ) → text` | Extracts the first substring matching the SQL regular expression; see Section 9.7.2. The first form has been specified since SQL:2003; the second form was only in SQL:1999 and should be considered obsolete. | `substring('Thomas' similar '%#"o_a#"_' escape '#')` → `oma` |
| `trim ( [ LEADING \| TRAILING \| BOTH ] [ characters text ] FROM string text ) → text` | Removes the longest string containing only characters in `characters` (a space by default) from the start, end, or both ends (`BOTH` is the default) of `string`. | `trim(both 'xyz' from 'yxTomxx')` → `Tom` |
| `trim ( [ LEADING \| TRAILING \| BOTH ] [ FROM ] string text [, characters text ] ) → text` | This is a non-standard syntax for `trim()`. | `trim(both from 'yxTomxx', 'xyz')` → `Tom` |
| `unicode_assigned ( text ) → boolean` | Returns `true` if all characters in the string are assigned Unicode code points; `false` otherwise. This function can only be used when the server encoding is `UTF8`. | |
| `upper ( text ) → text` | Converts the string to all upper case, according to the rules of the database's locale. | `upper('tom')` → `TOM` |
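A few of the behaviours described in Table 9.9 are easy to misread, so here is a brief sketch of them: the `characters` argument of the trim functions acts as a set of characters rather than a substring, `lpad` and `rpad` truncate over-long input, and `overlay` without `FOR` replaces as many characters as `newsubstring` contains. The input literals and aliases are illustrative only.

```sql
SELECT btrim('abcba', 'ab')                    AS set_not_substring, -- c
       lpad('7', 3, '0')                       AS zero_padded,       -- 007
       lpad('hello', 3)                        AS lpad_truncates,    -- hel
       rpad('hello', 3)                        AS rpad_truncates,    -- hel
       overlay('Txxxxas' placing 'hom' from 2) AS default_count;     -- Thomxas
```

In the last column only three characters are replaced, so one `x` survives; the table's own example passes `for 4` to replace all four.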

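Additional string manipulation functions and operators are available and are listed in Table 9.10. (Some of these are used internally to implement the SQL-standard string functions listed in Table 9.9.) There are also pattern-matching operators, which are described in Section 9.7, and operators for full-text search, which are described in Chapter 12.

Table 9.10. Other String Functions and Operators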

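| Function/Operator | Description | Example(s) |
| --- | --- | --- |
| `text ^@ text → boolean` | Returns true if the first string starts with the second string (equivalent to the `starts_with()` function). | `'alphabet' ^@ 'alph'` → `t` |
| `ascii ( text ) → integer` | Returns the numeric code of the first character of the argument. In UTF8 encoding, returns the Unicode code point of the character. In other multibyte encodings, the argument must be an ASCII character. | `ascii('x')` → `120` |
| `chr ( integer ) → text` | Returns the character with the given code. In UTF8 encoding, the argument is treated as a Unicode code point. In other multibyte encodings, the argument must designate an ASCII character. `chr(0)` is disallowed because text data types cannot store that character. | `chr(65)` → `A` |
| `concat ( val1 "any" [, val2 "any" [, ...] ] ) → text` | Concatenates the text representations of all the arguments. NULL arguments are ignored. | `concat('abcde', 2, NULL, 22)` → `abcde222` |
| `concat_ws ( sep text, val1 "any" [, val2 "any" [, ...] ] ) → text` | Concatenates all but the first argument, with separators. The first argument is used as the separator string, and should not be NULL. Other NULL arguments are ignored. | `concat_ws(',', 'abcde', 2, NULL, 22)` → `abcde,2,22` |
| `format ( formatstr text [, formatarg "any" [, ...] ] ) → text` | Formats arguments according to a format string; see Section 9.4.1. This function is similar to the C function `sprintf`. | `format('Hello %s, %1$s', 'World')` → `Hello World, World` |
| `initcap ( text ) → text` | Converts the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. | `initcap('hi THOMAS')` → `Hi Thomas` |
| `casefold ( text ) → text` | Performs case folding of the input string according to the collation. Case folding is similar to case conversion, but the purpose of case folding is to facilitate case-insensitive matching of strings, whereas the purpose of case conversion is to convert to a particular cased form. This function can only be used when the server encoding is `UTF8`. Ordinarily, case folding simply converts to lower case, but there may be exceptions depending on the collation. For instance, some characters have more than two lower-case variants, or fold to upper case. Case folding may change the length of the string. For instance, in the `PG_UNICODE_FAST` collation, `ß` (U+00DF) folds to `ss`. `casefold` can be used for Unicode Default Caseless Matching. It does not always preserve the normalized form of the input string (see `normalize`). The `libc` provider does not support case folding, so with it `casefold` is identical to `lower`. | |
| `left ( string text, n integer ) → text` | Returns the first `n` characters in the string, or when `n` is negative, returns all but the last \|`n`\| characters. | `left('abcde', 2)` → `ab` |
| `length ( text ) → integer` | Returns the number of characters in the string. | `length('jose')` → `4` |
| `md5 ( text ) → text` | Computes the MD5 hash of the argument, with the result written in hexadecimal. | `md5('abc')` → `900150983cd24fb0d6963f7d28e17f72` |
| `parse_ident ( qualified_identifier text [, strict_mode boolean DEFAULT true ] ) → text[]` | Splits `qualified_identifier` into an array of identifiers, removing any quoting of individual identifiers. By default, extra characters after the last identifier are considered an error; but if the second parameter is `false`, such extra characters are ignored. (This behavior is useful for parsing names for objects like functions.) Note that this function does not truncate over-length identifiers. If you want truncation, you can cast the result to `name[]`. | `parse_ident('"SomeSchema".someTable')` → `{SomeSchema,sometable}` |
| `pg_client_encoding ( ) → name` | Returns the current client encoding name. | `pg_client_encoding()` → `UTF8` |
| `quote_ident ( text ) → text` | Returns the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled. See also Example 41.1. | `quote_ident('Foo bar')` → `"Foo bar"` |
| `quote_literal ( text ) → text` | Returns the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single quotes and backslashes are properly doubled. Note that `quote_literal` returns null on null input; if the argument might be null, `quote_nullable` is often more suitable. See also Example 41.1. | `quote_literal(E'O\'Reilly')` → `'O''Reilly'` |
| `quote_literal ( anyelement ) → text` | Converts the given value to text and then quotes it as a literal. Embedded single quotes and backslashes are properly doubled. | `quote_literal(42.5)` → `'42.5'` |
| `quote_nullable ( text ) → text` | Returns the given string suitably quoted to be used as a string literal in an SQL statement string; or, if the argument is null, returns `NULL`. Embedded single quotes and backslashes are properly doubled. See also Example 41.1. | `quote_nullable(NULL)` → `NULL` |
| `quote_nullable ( anyelement ) → text` | Converts the given value to text and then quotes it as a literal; or, if the argument is null, returns `NULL`. Embedded single quotes and backslashes are properly doubled. | `quote_nullable(42.5)` → `'42.5'` |
| `regexp_count ( string text, pattern text [, start integer [, flags text ] ] ) → integer` | Returns the number of times the POSIX regular expression `pattern` matches in `string`; see Section 9.7.3. | `regexp_count('123456789012', '\d\d\d', 2)` → `3` |
| `regexp_instr ( string text, pattern text [, start integer [, N integer [, endoption integer [, flags text [, subexpr integer ] ] ] ] ] ) → integer` | Returns the position within `string` where the `N`'th match of the POSIX regular expression `pattern` occurs, or zero if there is no such match; see Section 9.7.3. | `regexp_instr('ABCDEF', 'c(.)(..)', 1, 1, 0, 'i')` → `3`; `regexp_instr('ABCDEF', 'c(.)(..)', 1, 1, 0, 'i', 2)` → `5` |
| `regexp_like ( string text, pattern text [, flags text ] ) → boolean` | Checks whether a match of the POSIX regular expression `pattern` occurs within `string`; see Section 9.7.3. | `regexp_like('Hello World', 'world$', 'i')` → `t` |
| `regexp_match ( string text, pattern text [, flags text ] ) → text[]` | Returns substrings within the first match of the POSIX regular expression `pattern` to `string`; see Section 9.7.3. | `regexp_match('foobarbequebaz', '(bar)(beque)')` → `{bar,beque}` |
| `regexp_matches ( string text, pattern text [, flags text ] ) → setof text[]` | Returns substrings within the first match of the POSIX regular expression `pattern` to `string`, or substrings within all such matches if the `g` flag is used; see Section 9.7.3. | `regexp_matches('foobarbequebaz', 'ba.', 'g')` → `{bar}` `{baz}` (one row each) |
| `regexp_replace ( string text, pattern text, replacement text [, flags text ] ) → text` | Replaces the substring that is the first match to the POSIX regular expression `pattern`, or all such matches if the `g` flag is used; see Section 9.7.3. | `regexp_replace('Thomas', '.[mN]a.', 'M')` → `ThM` |
| `regexp_replace ( string text, pattern text, replacement text, start integer [, N integer [, flags text ] ] ) → text` | Replaces the substring that is the `N`'th match to the POSIX regular expression `pattern`, or all such matches if `N` is zero, with the search beginning at the `start`'th character of `string`. If `N` is omitted, it defaults to 1. See Section 9.7.3. | `regexp_replace('Thomas', '.', 'X', 3, 2)` → `ThoXas`; `regexp_replace(string=>'hello world', pattern=>'l', replacement=>'XX', start=>1, "N"=>2)` → `helXXo world` |
| `regexp_split_to_array ( string text, pattern text [, flags text ] ) → text[]` | Splits `string` using a POSIX regular expression as the delimiter, producing an array of results; see Section 9.7.3. | `regexp_split_to_array('hello world', '\s+')` → `{hello,world}` |
| `regexp_split_to_table ( string text, pattern text [, flags text ] ) → setof text` | Splits `string` using a POSIX regular expression as the delimiter, producing a set of results; see Section 9.7.3. | `regexp_split_to_table('hello world', '\s+')` → `hello` `world` (one row each) |
| `regexp_substr ( string text, pattern text [, start integer [, N integer [, flags text [, subexpr integer ] ] ] ] ) → text` | Returns the substring within `string` that matches the `N`'th occurrence of the POSIX regular expression `pattern`, or `NULL` if there is no such match; see Section 9.7.3. | `regexp_substr('ABCDEF', 'c(.)(..)', 1, 1, 'i')` → `CDEF`; `regexp_substr('ABCDEF', 'c(.)(..)', 1, 1, 'i', 2)` → `EF` |
| `repeat ( string text, number integer ) → text` | Repeats `string` the specified `number` of times. | `repeat('Pg', 4)` → `PgPgPgPg` |
| `replace ( string text, from text, to text ) → text` | Replaces all occurrences in `string` of substring `from` with substring `to`. | `replace('abcdefabcdef', 'cd', 'XX')` → `abXXefabXXef` |
| `reverse ( text ) → text` | Reverses the order of the characters in the string. | `reverse('abcde')` → `edcba` |
| `right ( string text, n integer ) → text` | Returns the last `n` characters in the string, or when `n` is negative, returns all but the first \|`n`\| characters. | `right('abcde', 2)` → `de` |
| `split_part ( string text, delimiter text, n integer ) → text` | Splits `string` at occurrences of `delimiter` and returns the `n`'th field (counting from one), or when `n` is negative, returns the \|`n`\|'th-from-last field. | `split_part('abc~@~def~@~ghi', '~@~', 2)` → `def`; `split_part('abc,def,ghi,jkl', ',', -2)` → `ghi` |
| `starts_with ( string text, prefix text ) → boolean` | Returns true if `string` starts with `prefix`. | `starts_with('alphabet', 'alph')` → `t` |
| `string_to_array ( string text, delimiter text [, null_string text ] ) → text[]` | Splits `string` at occurrences of `delimiter` and forms the resulting fields into a `text` array. If `delimiter` is `NULL`, each character in `string` becomes a separate element in the array. If `delimiter` is an empty string, `string` is treated as a single field. If `null_string` is supplied and is not `NULL`, fields matching that string are replaced by `NULL`. See also `array_to_string`. | `string_to_array('xx~~yy~~zz', '~~', 'yy')` → `{xx,NULL,zz}` |
| `string_to_table ( string text, delimiter text [, null_string text ] ) → setof text` | Splits `string` at occurrences of `delimiter` and returns the resulting fields as a set of `text` rows. If `delimiter` is `NULL`, each character in `string` becomes a separate row of the result. If `delimiter` is an empty string, `string` is treated as a single field. If `null_string` is supplied and is not `NULL`, fields matching that string are replaced by `NULL`. | `string_to_table('xx~^~yy~^~zz', '~^~', 'yy')` → `xx` `NULL` `zz` (one row each) |
| `strpos ( string text, substring text ) → integer` | Returns the first starting index of the specified `substring` within `string`, or zero if it is not present. (Same as `position(substring in string)`, but note the reversed argument order.) | `strpos('high', 'ig')` → `2` |
| `substr ( string text, start integer [, count integer ] ) → text` | Extracts the substring of `string` starting at the `start`'th character, and extending for `count` characters if that is specified. (Same as `substring(string from start for count)`.) | `substr('alphabet', 3)` → `phabet`; `substr('alphabet', 3, 2)` → `ph` |
| `to_ascii ( string text ) → text`; `to_ascii ( string text, encoding name ) → text`; `to_ascii ( string text, encoding integer ) → text` | Converts `string` to ASCII from another encoding, which may be identified by name or number. If `encoding` is omitted, the database encoding is assumed (which in practice is the only useful case). The conversion consists primarily of dropping accents. Conversion is only supported from `LATIN1`, `LATIN2`, `LATIN9`, and `WIN1250` encodings. (See the unaccent module for another, more flexible solution.) | `to_ascii('Karél')` → `Karel` |
| `to_bin ( integer ) → text`; `to_bin ( bigint ) → text` | Converts the number to its equivalent two's complement binary representation. | `to_bin(2147483647)` → `1111111111111111111111111111111`; `to_bin(-1234)` → `11111111111111111111101100101110` |
| `to_hex ( integer ) → text`; `to_hex ( bigint ) → text` | Converts the number to its equivalent two's complement hexadecimal representation. | `to_hex(2147483647)` → `7fffffff`; `to_hex(-1234)` → `fffffb2e` |
| `to_oct ( integer ) → text`; `to_oct ( bigint ) → text` | Converts the number to its equivalent two's complement octal representation. | `to_oct(2147483647)` → `17777777777`; `to_oct(-1234)` → `37777775456` |
| `translate ( string text, from text, to text ) → text` | Replaces each character in `string` that matches a character in the `from` set with the corresponding character in the `to` set. If `from` is longer than `to`, occurrences of the extra characters in `from` are deleted. | `translate('12345', '143', 'ax')` → `a2x5` |
| `unistr ( text ) → text` | Evaluates escaped Unicode characters in the argument. Unicode characters can be specified as `\XXXX` (4 hexadecimal digits), `\+XXXXXX` (6 hexadecimal digits), `\uXXXX` (4 hexadecimal digits), or `\UXXXXXXXX` (8 hexadecimal digits). To specify a backslash, write two backslashes. All other characters are taken literally. If the server encoding is not UTF-8, the Unicode code point identified by one of these escape sequences is converted to the actual server encoding; an error is reported if that is not possible. This function provides a (non-standard) alternative to string constants with Unicode escapes (see Section 4.1.2.3). | `unistr('d\0061t\+000061')` → `data`; `unistr('d\u0061t\U00000061')` → `data` |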
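Several entries in Table 9.10 differ mainly in how they treat NULL, which is easy to overlook when reading the rows one at a time. The sketch below puts them side by side; the `||` column is included for contrast and relies on the general SQL rule that an operator with a NULL operand yields NULL. The aliases are illustrative only.

```sql
SELECT 'a' || NULL::text              AS op_concat,     -- NULL
       concat('a', NULL, 'b')         AS fn_concat,     -- ab
       concat_ws(',', 'a', NULL, 'b') AS fn_concat_ws,  -- a,b
       quote_literal(NULL::text)      AS literal,       -- NULL
       quote_nullable(NULL::text)     AS nullable;      -- the unquoted four-character text NULL
```

When building a statement string from values that may be null, the last two columns show why `quote_nullable` is usually the safer choice: a NULL from `quote_literal` would turn the whole concatenated statement into NULL.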

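The concat, concat_ws and format functions are variadic, so it is possible to pass the values to be concatenated or formatted as an array marked with the VARIADIC key word (see Section 36.5.6). The array's elements are treated as if they were separate ordinary arguments to the function. If the variadic array argument is NULL, concat and concat_ws return NULL, but format treats a NULL as a zero-element array.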
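The VARIADIC behaviour described above can be sketched as follows. The array contents and aliases are illustrative only.

```sql
SELECT concat_ws(',', VARIADIC ARRAY['a', 'b', 'c']) AS joined,       -- a,b,c
       format('%s-%s', VARIADIC ARRAY['x', 'y'])     AS formatted,    -- x-y
       concat(VARIADIC NULL::text[])                 AS null_concat,  -- NULL
       format('no args', VARIADIC NULL::text[])      AS null_format;  -- no args
```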

See also the aggregate function string_agg in Section 9.21, and the functions for converting between strings and the bytea type in Table 9.13.

9.4.1. `format`

The function format produces output formatted according to a format string, in a style similar to the C function sprintf.

format(formatstr text [, formatarg "any" [, ...] ])

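formatstr is a format string that specifies how the result should be formatted. Text in the format string is copied directly to the result, except where format specifiers are used. Format specifiers act as placeholders in the string, defining how subsequent function arguments should be formatted and inserted into the result. Each formatarg argument is converted to text according to the usual output rules for its data type, and then formatted and inserted into the result string according to the format specifier(s).

Format specifiers are introduced by a % character and have the form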

%[position][flags][width]type

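where the component fields are:

position (optional)

A string of the form n$ where n is the index of the argument to print. Index 1 means the first argument after formatstr. If the position is omitted, the default is to use the next argument in sequence.

flags (optional)

Additional options controlling how the format specifier's output is formatted. Currently the only supported flag is a minus sign (-), which causes the format specifier's output to be left-justified. This has no effect unless the width field is also specified.

width (optional)

Specifies the minimum number of characters to use to display the format specifier's output. The output is padded on the left or right (depending on the - flag) with spaces as needed to fill the width. A too-small width does not cause truncation of the output, but is simply ignored. The width may be specified using any of the following: a positive integer; an asterisk (*) to use the next function argument as the width; or a string of the form *n$ to use the nth function argument as the width.

If the width comes from a function argument, that argument is consumed before the argument that is used for the format specifier's value. If the width argument is negative, the result is left-aligned (as if the - flag had been specified) within a field of length abs(width).

type (required)

The type of format conversion to use to produce the format specifier's output. The following types are supported:

s formats the argument value as a simple string. A null value is treated as an empty string.
I treats the argument value as an SQL identifier, double-quoting it if necessary. It is an error for the value to be null (equivalent to quote_ident).
L quotes the argument value as an SQL literal. A null value is displayed as the string NULL, without quotes (equivalent to quote_nullable).

In addition to the format specifiers described above, the special sequence %% may be used to output a literal % character.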
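The way each conversion type handles null values and quoting can be sketched as follows. The identifiers `my_table` and `MyTable` are hypothetical names chosen for illustration.

```sql
-- format('%I', NULL::text) is omitted here because it raises an error.
SELECT format('[%s]', NULL::text) AS s_null,       -- []
       format('%L', NULL::text)   AS l_null,       -- NULL
       format('%I', 'my_table')   AS i_plain,      -- my_table
       format('%I', 'MyTable')    AS i_mixed_case, -- "MyTable"
       format('%L', 'it''s')      AS l_quote,      -- 'it''s'
       format('100%%')            AS percent;      -- 100%
```

The mixed-case name is double-quoted because it would otherwise be case-folded, which matches the behaviour described for quote_ident.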

Here are some examples of the basic format conversions:

SELECT format('Hello %s', 'World');
Result: Hello World

SELECT format('Testing %s, %s, %s, %%', 'one', 'two', 'three');
Result: Testing one, two, three, %

SELECT format('INSERT INTO %I VALUES(%L)', 'Foo bar', E'O\'Reilly');
Result: INSERT INTO "Foo bar" VALUES('O''Reilly')

SELECT format('INSERT INTO %I VALUES(%L)', 'locations', 'C:\Program Files');
Result: INSERT INTO locations VALUES('C:\Program Files')

Here are examples using width fields and the - flag:

SELECT format('|%10s|', 'foo');
Result: |       foo|

SELECT format('|%-10s|', 'foo');
Result: |foo       |

SELECT format('|%*s|', 10, 'foo');
Result: |       foo|

SELECT format('|%*s|', -10, 'foo');
Result: |foo       |

SELECT format('|%-*s|', 10, 'foo');
Result: |foo       |

SELECT format('|%-*s|', -10, 'foo');
Result: |foo       |

These examples show use of position fields:

SELECT format('Testing %3$s, %2$s, %1$s', 'one', 'two', 'three');
Result: Testing three, two, one

SELECT format('|%*2$s|', 'foo', 10, 'bar');
Result: |       bar|

SELECT format('|%1$*2$s|', 'foo', 10, 'bar');
Result: |       foo|

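Unlike the standard C function sprintf, PostgreSQL's format function allows format specifiers with and without position fields to be mixed in the same format string. A format specifier without a position field always uses the next argument after the last argument consumed. In addition, the format function does not require all function arguments to be used in the format string. For example: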

SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
Result: Testing three, two, three

The %I and %L format specifiers are particularly useful for safely constructing dynamic SQL statements. See Example 41.1.
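As a minimal sketch of that use, and not a reproduction of Example 41.1, the block below builds and runs two dynamic statements in PL/pgSQL. The temporary table "Foo bar", its column name, and the variable names are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table whose name needs quoting as an identifier.
CREATE TEMP TABLE "Foo bar" (name text);

DO $$
DECLARE
    tbl text := 'Foo bar';     -- identifier supplied as data
    val text := 'O''Reilly';   -- value containing a single quote
    n   bigint;
BEGIN
    -- %I quotes the table name, %L quotes the value as a literal.
    EXECUTE format('INSERT INTO %I (name) VALUES (%L)', tbl, val);

    -- Values can also be passed as parameters; only the identifier needs format.
    EXECUTE format('SELECT count(*) FROM %I WHERE name = $1', tbl)
        INTO n USING val;

    RAISE NOTICE 'rows matching: %', n;  -- rows matching: 1
END
$$;
```

The second statement shows a common design choice: identifiers have to be spliced into the statement text, so they go through %I, whereas data values can stay out of the text entirely by using USING.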
