-
Notifications
You must be signed in to change notification settings - Fork 272
[Hubspot] Add de-duplication logic in Hubspot #2924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2924 +/- ##
==========================================
- Coverage 78.11% 78.05% -0.07%
==========================================
Files 1050 1064 +14
Lines 19438 19629 +191
Branches 3734 3770 +36
==========================================
+ Hits 15184 15321 +137
- Misses 2972 3021 +49
- Partials 1282 1287 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Hi @harsh-joshi99 please let me know when this PR is ready to review. I think I should be involved with this PR as I contributed a lot to the Actions being updated. |
Hi @harsh-joshi99 can we schedule a call please? |
type: 'string', | ||
required: false, | ||
default: { '@path': '$.timestamp' }, | ||
dynamic: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @harsh-joshi99 why is this set to be a dynamic field?
@@ -52,6 +52,10 @@ const send = async ( | |||
subscriptionMetadata?: SubscriptionMetadata, | |||
statsContext?: StatsContext | |||
) => { | |||
if (syncMode === 'upsert') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't we also have the same issue if the customer is using syncMode === 'update'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also @harsh-joshi99 @varadarajan-tw can we put this code behind a feature flag to control rollout?
const id = incoming.object_details?.id_field_value | ||
if (!id) continue | ||
|
||
const incomingTimestamp = incoming.timestamp ? new Date(incoming.timestamp).getTime() : 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the customer misconfigures the timestamp field or sends something that isn't a timestamp, then you're falling back to creating a timestamp. However the timestamp you create might correctly order the event.
It might be better to grab the timestamp directly from the raw payload data instead. i.e not via a field. Note that there will always be a timestamp in the raw payload data.
Have a look at how to access the rawData object here.
At least this way you will always be guaranteed to order the events in the order they arrived at the Segment ingestion point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what I'm recommending is:
const incomingTimestamp = isTimestamp(incoming.timestamp) ? incoming.timestamp : getTimestampFromRawPayload()
} | ||
|
||
const existing = mergedMap.get(id)! | ||
const existingTimestamp = existing.timestamp ? new Date(existing.timestamp).getTime() : 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should ensure that there's always a timestamp in the payload that gets added to the mergeMap (see my comment above). This means that if there is an existing payload, you can assume it has a timestamp.
// Merge properties | ||
const mergedProps: Record<string, unknown> = { ...existing.properties } | ||
for (const [key, value] of Object.entries(incoming.properties || {})) { | ||
if (!(key in mergedProps) || incomingTimestamp >= existingTimestamp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this timestamp comparison work if the values are strings (e.g. an ISO8601 string)?
incomingTimestamp >= existingTimestamp
Does it not need to be converted into a format where it can be compared first?
const existingAssociations = existing.associations || [] | ||
const incomingAssociations = incoming.associations || [] | ||
|
||
const associationKey = (assoc: any) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the linter not complain about the user of the 'any' type here?
hi @harsh-joshi99 and @varadarajan-tw I reviewed the PR and left some comments. As well as the comments I think it would be a good idea to add some unit tests to test the new mergeAndDeduplicateById function. And I recommend we put this code behind a feature flag - but that's up to you. |
This PR adds a function to de-duplicate properties in hubspot's custom object v2 action.
Hubspot does not allow properties with same ID.
JIRA -> https://twilio-engineering.atlassian.net/browse/STRATCONN-5746?atlOrigin=eyJpIjoiMzU3ZWIxYzkzODU2NDIyMjg0MWFkYzM4OTc2MjEwZjkiLCJwIjoiaiJ9
Testing
https://docs.google.com/document/d/17cett1m8KfzEvLxxBfzbhiFBgeRwDhzMZYnTzuRZG30/edit?usp=sharing
Include any additional information about the testing you have completed to
ensure your changes behave as expected. For a speedy review, please check
any of the tasks you completed below during your testing.