Commit f4e70ce

docs: add documents for postgres source (#915)
1 parent 1fa1db5 commit f4e70ce

1 file changed: +40 -0 lines changed

docs/docs/ops/sources.md

Lines changed: 40 additions & 0 deletions
@@ -282,3 +282,43 @@ The output is a [*KTable*](/docs/core/data_types#ktable) with the following sub
* `filename` (*Str*): the filename of the file, without the path, e.g. `"file1.md"`
* `mime_type` (*Str*): the MIME type of the file.
* `content` (*Str* if `binary` is `False`, otherwise *Bytes*): the content of the file.

## Postgres

The `Postgres` source imports rows from a PostgreSQL table.

### Setup for PostgreSQL

* Ensure the table exists and has a primary key. Tables without a primary key are not supported.
* Grant the connecting user read permissions on the target table (e.g. `SELECT`).
* Provide a database connection. You can:
  * Use CocoIndex's default database connection, or
  * Provide an explicit connection via a transient auth entry referencing a `DatabaseConnectionSpec` with a `url`, for example:

    ```python
    cocoindex.add_transient_auth_entry(
        cocoindex.sources.DatabaseConnectionSpec(
            url="postgres://user:password@host:5432/dbname?sslmode=require",
        )
    )
    ```
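
To make the parts of the connection `url` explicit, here is a minimal sketch that splits the placeholder URL from the example using only the Python standard library. It does not connect to anything and is not part of the CocoIndex API; the dictionary keys (`user`, `host`, `dbname`, and so on) are names chosen for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder URL copied from the example above; not a real server.
url = "postgres://user:password@host:5432/dbname?sslmode=require"

parts = urlsplit(url)
connection_parts = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    # The path component is "/dbname"; drop the leading slash.
    "dbname": parts.path.lstrip("/"),
    # Query parameters such as sslmode; keep the first value of each.
    "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
}

print(connection_parts["host"], connection_parts["port"])  # host 5432
print(connection_parts["options"])  # {'sslmode': 'require'}
```

A check like this can help catch a missing database name or port before the URL is handed to the source.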

### Spec

The spec takes the following fields:

* `table_name` (`str`): the PostgreSQL table to read from.
* `database` (`cocoindex.TransientAuthEntryReference[DatabaseConnectionSpec]`, optional): database connection reference. If not provided, the default CocoIndex database is used.
* `included_columns` (`list[str]`, optional): non-primary-key columns to include. If not specified, all non-PK columns are included.
* `ordinal_column` (`str`, optional): a non-primary-key column used for change tracking and ordering, e.g. a modified timestamp or a monotonic version number. Supported types are integer-like (`bigint`/`integer`) and timestamps (`timestamp`, `timestamptz`).
  `ordinal_column` must not be a primary key column.
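
The constraints stated above can be summarized as a small pre-flight check. This is a sketch under stated assumptions, not how CocoIndex validates a spec: `PostgresSpecSketch` and `check_spec` are hypothetical names, the `database` field is left out, and the table metadata (`column_types`, `primary_key`) is passed in by hand rather than read from PostgreSQL.

```python
from dataclasses import dataclass
from typing import Optional

# Column types the text above lists as supported for `ordinal_column`.
SUPPORTED_ORDINAL_TYPES = {"bigint", "integer", "timestamp", "timestamptz"}


@dataclass
class PostgresSpecSketch:
    """Hypothetical stand-in for the spec; not the CocoIndex class."""
    table_name: str
    included_columns: Optional[list[str]] = None
    ordinal_column: Optional[str] = None


def check_spec(spec: PostgresSpecSketch,
               column_types: dict[str, str],
               primary_key: list[str]) -> list[str]:
    """Return the documented rules that the spec or table violates."""
    problems = []
    if not primary_key:
        problems.append("table has no primary key")
    ordinal = spec.ordinal_column
    if ordinal is not None:
        if ordinal in primary_key:
            problems.append(f"ordinal_column {ordinal!r} is a primary key column")
        elif column_types.get(ordinal) not in SUPPORTED_ORDINAL_TYPES:
            problems.append(f"ordinal_column {ordinal!r} has an unsupported type")
    return problems


# Hand-written metadata for an imaginary table named "docs".
columns = {"id": "bigint", "body": "text", "updated_at": "timestamptz"}
ok = check_spec(PostgresSpecSketch("docs", ordinal_column="updated_at"), columns, ["id"])
bad = check_spec(PostgresSpecSketch("docs", ordinal_column="id"), columns, ["id"])
print(ok)   # []
print(bad)  # ["ordinal_column 'id' is a primary key column"]
```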

### Schema

The output is a [*KTable*](/docs/core/data_types#ktable) with fields derived from the table schema:

* Key fields:
  * If the table has a single primary key column, that column appears as the key field with its name and type.
  * If the table has a composite primary key, a struct field named `_key` contains each PK component as a sub-field.
* Value fields: All non-primary-key columns included by `included_columns` (or all when not specified) appear as value fields.
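
The key and value rules above can be illustrated with plain dictionaries. The helper below is a hypothetical sketch (`split_row` is not a CocoIndex function) and it models only the field layout, not the type mapping or the actual *KTable* representation.

```python
from typing import Any, Optional


def split_row(row: dict[str, Any],
              primary_key: list[str],
              included_columns: Optional[list[str]] = None):
    """Split one row into (key_fields, value_fields) following the rules above."""
    if len(primary_key) == 1:
        # Single-column primary key: the column itself is the key field.
        name = primary_key[0]
        key = {name: row[name]}
    else:
        # Composite primary key: components are grouped under `_key`.
        key = {"_key": {name: row[name] for name in primary_key}}
    # Value fields are the non-PK columns, optionally narrowed by included_columns.
    value_names = [col for col in row if col not in primary_key]
    if included_columns is not None:
        value_names = [col for col in value_names if col in included_columns]
    return key, {col: row[col] for col in value_names}


single = split_row({"id": 1, "title": "a", "body": "b"}, ["id"])
composite = split_row(
    {"tenant": "t1", "id": 7, "body": "b", "note": "n"},
    ["tenant", "id"],
    included_columns=["body"],
)
print(single)     # ({'id': 1}, {'title': 'a', 'body': 'b'})
print(composite)  # ({'_key': {'tenant': 't1', 'id': 7}}, {'body': 'b'})
```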
