
[FEATURE] Allow Custom PARTITION BY When Using Snowflake ID (bigint) in Initial Migration #161

Open
@zhongjun96


Use Case Description

During initial migration, we noticed that the automatically created tables use the following clause:

PARTITION BY intDiv(id, 4294967)

How is the value 4294967 determined?

We use Snowflake-style bigint IDs (e.g., 1849360358546407424).

When performing batch inserts, this partitioning scheme creates too many partitions and triggers the max_partitions_per_insert_block limit.
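
To illustrate the scale of the problem, here is a rough back-of-the-envelope sketch in Python. It assumes the common Twitter-style Snowflake layout (a millisecond timestamp above 22 low bits for worker and sequence numbers), which may not match every ID generator, and it models ClickHouse's intDiv with integer floor division, which is equivalent for positive IDs. The helper name and the synthetic batch are only for illustration.

```python
# Assumption: Twitter-style Snowflake layout. The low 22 bits hold the
# worker and sequence numbers; the bits above hold a millisecond timestamp.
SNOWFLAKE_LOW_BITS = 22
IDS_PER_MS = 1 << SNOWFLAKE_LOW_BITS  # 4194304 ID values per millisecond

# Divisor taken from the generated clause: PARTITION BY intDiv(id, 4294967)
DIVISOR = 4294967


def distinct_partitions(ids):
    """Count the distinct partition keys a batch of IDs would touch."""
    return len({snowflake_id // DIVISOR for snowflake_id in ids})


# Under the assumed layout, one partition covers roughly one millisecond.
ms_per_partition = DIVISOR / IDS_PER_MS

# Hypothetical batch: one row per millisecond for one second,
# starting from the example ID in this issue.
first_id = 1849360358546407424
batch = [first_id + i * IDS_PER_MS for i in range(1001)]

print(f"~{ms_per_partition:.3f} ms of IDs per partition")
print(f"{distinct_partitions(batch)} partitions touched by a 1-second batch")
```

Under these assumptions, each partition covers only about one millisecond of IDs, so a batch whose rows span a single second already touches several hundred partitions. That is well above the ClickHouse default of 100 for max_partitions_per_insert_block.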

Since we have hundreds of tables, manually modifying each table's PARTITION BY clause is extremely tedious.

Proposed Solution

Allow users to customize the PARTITION BY clause during automatic table creation.

If a created_at or similar datetime field exists, we would prefer to partition by toYYYYMM(created_at) instead of using a division on the ID.


Labels: enhancement
