Skip to content

Conversation

@raghavio
Copy link

@raghavio raghavio commented Oct 29, 2022

Use the FROM clause in the UPDATE statement to support bulk data update.

Fixes #7

Use the FROM clause in the UPDATE statement to support bulk data update.
@hettie-d
Copy link
Owner

Hi @raghavio - actually, you can do pretty much the same without modifying the function. It's my fault that I never formalized and documented this feature, but the reason why I added the alias t to the update has exactly that! See here:
https://github.com/hettie-d/pg_bitemporal/blob/master/sql/ll_bitemporal_update_select.sql#L35

That's how I used it:

perform bitemporal_internal.ll_bitemporal_update_select(
         'test',
         'table_l',
         'attr_l',
          $$select attr_l
                from test_2.table_a
          where a_id =t.a_id and t.attr_p = 'my_product'$$,
         'a_id',
         array_to_string(v_list,','),
         temporal_relationships.timeperiod(now(), 'infinity') ,
         temporal_relationships.timeperiod(now(), 'infinity') );

As you can see, "t" always denotes the table we are updating, and then you can reference its attributes in the select statement. I think that this covers at least part of the functionality you are looking for.
I will take another look at the beginning of the week. (I am flying to Chicago back from PG Conf EU).

@raghavio
Copy link
Author

Hi @hettie-d, thanks for the quick look. Hope you had fun at the PG Conf!

I'm actually aware of this but this isn't what I need. My change allows updation of multiple records with data that's outside of the table that's being updated. Right now, if the query in p_values_selected_search returns more the one row it will fail. This is what I've tried to fix.
As of now, the ll_bitemporal_update_select function can do three things:

  1. Update a single record.
  2. Update multiple records as long as the values to update is constant.
  3. Update multiple records as long as the values to update uses data from the row that's being updated. (using the t alias in SELECT)

It doesn't allow updating of multiple records with different data coming from the SELECT query. That's what I need.

Pasting my function call from #7:

SELECT bitemporal_internal.ll_bitemporal_update_select(
      'org',
      'test_table,
      'id,first_name,emails',
      $$SELECT id, first_name, emails FROM (VALUES ('4143558004', 'Raghav', CAST(ARRAY['[email protected]'] AS text[])), ('4028052004', 'Hettie', CAST(ARRAY['[email protected]'] AS text[]))) ASt(id, first_name, emails)$$,
      'id',
      $$SELECT id FROM (VALUES ('4143558004', 'Raghav', CAST(ARRAY['[email protected]'] AS text[])), ('4028052004', 'Hettie', CAST(ARRAY['[email protected]'] AS text[]))) AS t(id, first_name, emails)$$,
      TEMPORAL_RELATIONSHIPS.TIMEPERIOD(NOW(), CAST('infinity' AS temporal_relationships.time_endpoint)),
      TEMPORAL_RELATIONSHIPS.TIMEPERIOD(NOW(), CAST('infinity' AS temporal_relationships.time_endpoint)))

If you see, my SELECT query here returns two records. This currently gives me an error saying the subquery returns more the one record.

@hettie-d
Copy link
Owner

Got it! Thank you for the clarification!

@hettie-d
Copy link
Owner

hettie-d commented Nov 7, 2022

Hi @raghavio - my sincere apologies it took so long to get to you. Everything looks great except for one issue. When you construct the update statement, it is assumed that the names of the columns in the "update select" statement are the same as those we are updating. In your example, you explicitly use "AS" to name them the right way, which is perfectly fine when you know that you need to do this. However, if somebody is already using this function in production (like one of my former companies) and is unaware of this new requirement, their code will break. I think that your contribution is very valuable for pg_bitemporal, but it's critical to ensure backward compatibility when I am advocating for broader adoption of this framework.
Based on the above, here is what I am going to suggest:

  • you can name your version of the function differently, and we can merge it right away (and document explicitly what are the new requirements)
  • after that, it would be great if you would add the "alias x list of fields" as a new parameter
  • and after this is done, we can decide how we merge these two functions (ideally, I would love to have it as one function with different signatures, with the default as "the names are the same." And yes, we still need to support the old style.

Do you think this is doable? Please let me know! Thanks:)

@raghavio
Copy link
Author

Hi @hettie-d. Sorry for the late reply. I'll get to this.
I realized that even after this change, the function will still be limited for bulk update. In cases where the effective and asserted are not constant and need to be specific to the updating row. Do you have any ideas how that can be supported?

@hettie-d
Copy link
Owner

Yes, you are right, and I have been thinking about this problem for a long time. I am not sure how to handle that yet, partially because time is in some sense "invisible" for the application (we do all the hard work :)), and if we allow effective/asserted time to vary within one operation, we need to allow injecting some dependencies of effective time on the values which we pass. So I do not have a solution yet, but I am happy to work on it, I know it is needed. For a start, can you give me an example of how the kind of update you are looking for? And BTW - feel free to send me a message on LinkedIn, this may result in a long conversation :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Functionality for bulk update

2 participants