Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot distribute a Citus local table with an identity column #7888

Open
nick-oehmen opened this issue Feb 5, 2025 · 3 comments
Open

Cannot distribute a Citus local table with an identity column #7888

nick-oehmen opened this issue Feb 5, 2025 · 3 comments

Comments

@nick-oehmen
Copy link

Hello! While implementing Citus distributed and reference tables, we encountered an issue attempting to distribute a table that had an identity column. Repro steps below:

Citus version: Citus 12.1.6 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

--create a test schema
create schema test;

-- add our first local table
create table test.local1(
    id text not null primary key,
    distro_key int not null
);

-- add our reference table
create table test.reference1(
    id int not null primary key,
    reference_col1 text not null
);

-- add our second local table
create table test.local2(
    id int not null generated always as identity,
    local1fk text not null,
    reference1fk int not null,
    constraint loc1fk foreign key (local1fk) references test.local1(id),
    constraint reference1fk foreign key (reference1fk) references test.reference1(id),
    constraint testlocpk primary key (id)
);

-- apply the reference table type
select create_reference_table('test.reference1');

-- add our first local table values
insert into test.local1(id, distro_key) values ('aaaaa', 1), ('bbbbb', 2), ('ccccc', 3);

-- add our reference data
insert into test.reference1(id, reference_col1) values (1, 'test'), (2, 'test2'), (3, 'test3');

-- add our second local table values
insert into test.local2(local1fk, reference1fk) values ('aaaaa', 1), ('aaaaa', 2), ('bbbbb', 3), ('bbbbb', 2), ('ccccc', 1)

-- we confirm that we have two local tables and one reference table
select
    a.table_name,
    b.citus_table_type
from information_schema.tables a
    left join citus_tables b on b.table_name::text = concat(a.table_schema::text, '.', a.table_name::text)
where a.table_schema = 'test'
	and a.table_type = 'BASE TABLE'
order by a.table_name

-- turn our local table into a distributed table on distro key
select create_distributed_table('test.local2', 'local1fk')

-- why does this fail?  Distributed tables should support identity columns in the current Citus version
@naisila
Copy link
Member

naisila commented Feb 7, 2025

Hey Nick,

This operation select create_reference_table('test.reference1');
has converted test.local2 to a Citus local table.

Currently, you cannot convert a Citus local table to a Citus distributed table if they have an identity column.

To overcome this issue, you can distribute the table before create_reference_table('test.reference1');

Given your foreign keys, it will ask you to create the reference table first. So, my advice would be to remove the foreign keys from the table definition, and add them later, after you have distributed the tables.

Also, note that table distribution will still fail, because you have a primary key on id, which doesn't include the distribution column local1fk . So, you should also take care of that.

ERROR:  cannot create constraint on "local2"
DETAIL:  Distributed relations cannot have UNIQUE, EXCLUDE, or PRIMARY KEY constraints
that do not include the partition column (with an equality operator if EXCLUDE).

@nick-oehmen
Copy link
Author

nick-oehmen commented Feb 7, 2025

Thank you @naisila! The example I provided is a pretty simple one just to demonstrate the issue, the actual issue is occurring in a much larger DB as we're testing implementing distribution prior to pushing to the live production DB. In our actual database, all of the foreign key relationships already exist, there is data in the tables, and identities are already present as we're trying to implement distribution to several dozen tables. We'll explore the possibility of disabling all of our foreign keys during deployment and then re-enabling, but I still would consider this somewhat of a workaround. Do you know if this is considered a bug or feature, and is there a timeline to when local tables could be distributed if they have an identity? We could potentially delay our multi-tenancy initiative depending on the timeline.

Re: distributed table not having the distribution key in the primary key of the table, this was just an oversight while I was putting together the example script :)

@naisila
Copy link
Member

naisila commented Feb 10, 2025

We'll explore the possibility of disabling all of our foreign keys during deployment and then re-enabling, but I still would consider this somewhat of a workaround.

Yes, that would be a workaround. I also suggest that you try the workaround first with the simple example that you used to demonstrate the issue, before trying on the larger DB.

is there a timeline to when local tables could be distributed if they have an identity?

Well, rather than distributing a Citus local table, we look at this as "converting a Citus local table to a Citus distributed table". Like, moving from one Citus table type to another. The limitation arises from maintenance of metadata consistency. As you may know, Citus local tables are "managed" by Citus and are actually visible from all nodes. Therefore, they have Citus metadata to be taken care of. Implementation-wise, this is not trivial, and we haven't prioritized this case yet. Given your experience with conversion of citus local tables to distributed ones, we will bump this up in the priority list. I will let you know if/when we have a precise timeline on this. Thanks a lot for sharing your use-case.

Re: distributed table not having the distribution key in the primary key of the table, this was just an oversight while I was putting together the example script :)

Good to know.

@naisila naisila changed the title Cannot distribute table with an identity column Cannot distribute a Citus local table with an identity column Feb 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants