[DSIP-78][Data Quality] Remove data quality module #16794
base: dev
Conversation
-1. There is no conclusion yet, so please don't do this for now. |
Based on the discussion of issue #16728, we can draw the following conclusions.
None of the current maintainers in the community is willing to refactor this module, and the data-quality module has seriously blocked the progress of #16098, which is very important for the next release. So I think the conclusion to remove it is obvious. |
Apache emphasizes achieving consensus. If there are significant differences that cannot be resolved in the short term, we may choose to shelve the proposal temporarily and allow more time to gather information before discussing it again. Since this is a major decision, I suggest you send a vote email to the dev mailing list. |
First of all, consensus as currently defined does not require everyone to agree, only more than half. For more info, take a look at apache/comdev-site#189
As I said, none of the current maintainers in the community is willing to refactor this module. How can we keep it without anyone willing to take responsibility?
I think the issue and the dev mailing list serve the same purpose here, and all active PMC members and committers have already been included in this issue. |
A GitHub issue cannot replace the mailing list, especially for a major decision. By the way, I don't disagree with removing this module; I'm just concerned about the significant impact it will have on users. What I see is that many users want to keep this module, while the maintainers are inclined to remove it to make refactoring easier. Therefore, this decision requires great caution. If a vote does not take place on the dev mailing list, then it should happen in private, not in an issue thread. |
I don't think removing this module is a major decision. The feature was introduced in PR #4830 on Feb 21, 2021 without any mailing-list or GitHub issue discussion. For more than three years, no contributor other than the author of that PR has worked on it. There has been an endless stream of bug reports and improvement requests, yet no substantial code change. The author also does not want to maintain it and voted +1 for removal. |
At that time, there was no DSIP mechanism in place, and much of the communication took place during community meetings. The feature took 8-9 months to develop and implement, so I believe it would be inaccurate to say it is not a major feature. |
I don't agree with this opinion. DolphinScheduler focuses on being a modern data orchestration platform, while Data-Quality focuses on the accuracy and consistency of data. The relationship between the two is comparable to that of Flink and Flink-CDC, and there are many mature examples of this kind of split today. |
We should add DDL to drop the DQ tables.
We can remove the create-table DDL from the init SQL and provide the drop-table DDL in the docs, so that users decide whether to execute it instead of it being executed by default. WDYT? |
How about providing an entry-point script that removes the existing tables and packaging it into the binary tarball instead of documenting it? We support only two databases in prod, and we should keep things easy to use. |
I think it is more transparent for users if we give them the DDL to run directly, since dropping tables is a dangerous operation... |
Just as a proposal: there seems to be an inconsistency here. Although some of the PMC members in #16728 already agreed to remove it, David is right that he can challenge that and ask for a vote on the dev mailing list. So maybe we should vote in a dev mailing-list thread, and I think the vote result, not personal emotions, should be the ONLY thing that decides whether this PR continues. As for the feature itself, I think it would work better as a plugin than as a built-in function, especially since it has many bugs and CVEs and no team member wants to maintain it. So I would personally vote +1 for removing it. And BTW, the time spent developing a feature should not be the standard for measuring its importance. |
I mean we should add a new bash script like https://github.com/apache/dolphinscheduler/blob/dev/dolphinscheduler-tools/src/main/bin/migrate-lineage.sh which is separated from |
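For illustration only, the opt-in cleanup discussed in this thread could be sketched as a small generator that prints the drop statements and leaves execution to the user. This is not part of the PR: the table names below are assumptions based on the `t_ds_dq_` naming and must be checked against the actual schema, and `build_drop_statements` is a hypothetical helper.

```python
import re

# ASSUMPTION: illustrative table names only; verify them against your own
# database before running any generated statement.
ASSUMED_DQ_TABLES = ["t_ds_dq_rule", "t_ds_dq_execute_result"]

# Accept only plain identifiers so nothing unexpected ends up in the SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_drop_statements(tables):
    """Return one DROP TABLE IF EXISTS statement per table name."""
    statements = []
    for name in tables:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe table name: {name!r}")
        statements.append(f"DROP TABLE IF EXISTS {name};")
    return statements


if __name__ == "__main__":
    # Print the statements instead of executing them, so the user reviews
    # the DDL and decides whether to run it.
    for statement in build_drop_statements(ASSUMED_DQ_TABLES):
        print(statement)
```

Printing rather than executing keeps the destructive step in the user's hands, which matches the concern raised above about dropping tables by default.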
Ok. |
I will raise a vote on the dev mailing list. |
FYI, the vote email is at https://lists.apache.org/thread/0tldm33skkbrfgbt01bvd610z5zmb725 |
Purpose of the pull request
close #16728
Brief change log
Verify this pull request
This pull request is code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(or)
Pull Request Notice
If your pull request contains an incompatible change, you should also add it to
docs/docs/en/guide/upgrade/incompatible.md