-
Notifications
You must be signed in to change notification settings - Fork 477
Prevents iterator conflicts #6040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 2.1
Are you sure you want to change the base?
Conversation
This commit adds checks when adding an iterator that the given iterator does not conflict with any existing iterators. Conflict meaning same name or same priority. Iterators can be added several ways, and previously only TableOperations.attachIterator and NamespaceOperations.attachIterator would check for conflicts. This commit adds iterator conflict checks to: - Scanner.addScanIterator - TableOperations.setProperty - TableOperations.modifyProperties - NewTableConfiguration.attachIterator Note that this does not add conflict checks to NamespaceOperations.setProperty or NamespaceOperations.modifyProperties, these will be done in another commit. This commit also accounts for the several ways in which conflicts can arise: - Iterators that are attached directly to a table (either through TableOperations.attachIterator, TableOperations.setProperty, or TableOperations.modifyProperties) - Iterators that are attached to a namespace, inherited by a table (either through NamespaceOperations.attachIterator, NamespaceOperations.setProperty, or NamespaceOperations.modifyProperties) - Conflicts with default table iterators (if the table has them) - Adding the exact iterator already present should not fail This commit also adds a new IteratorConflictsIT to test all of the above. Part of apache#6030
Adds conflict checks to: - NamespaceOperations.attachIterator (was previously only checking for conflicts with iterators in the namespace, now also checks for conflicts with iterators in the tables of the namespace) - NamespaceOperations.setProperty (check conflicts with namespace iterators and all tables in the namespace) - NamespaceOperations.modifyProperties (check conflicts with namespace iterators and all tables in the namespace) New tests to IteratorConflictsIT to test the above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From running sunny day tests and all the tests I have changed in this PR, noticed that I unknowingly added new permission requirements to at least TableOperations.create() (new required permission ALTER_NAMESPACE) and Scanner.addScanIterator() (new required permission ALTER_TABLE). I imagine this is a blocker for these changes at this point, but let me know if it's not. I'll look into an alternative to avoid these permissions. See changes to ConditionalWriterIT, ScanIteratorIT, and ShellServerIT for examples of the failures I encountered.
Checks are now done server side as of cb2eccb, avoiding these permission requirements.
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
Outdated
Show resolved
Hide resolved
test/src/main/java/org/apache/accumulo/test/functional/IteratorConflictsIT.java
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/iteratorsImpl/IteratorConfigUtil.java
Outdated
Show resolved
Hide resolved
|
Transferring to WIP until I resolve #6040 (review) |
core/src/main/java/org/apache/accumulo/core/clientImpl/ClientContext.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/client/admin/NewTableConfiguration.java
Outdated
Show resolved
Hide resolved
|
Discussed iterator conflicts today, and here's a summary of some key points:
|
- Moves the iterator conflict check for create table from client side to server side. - Checking if iterators added to scanner conflict with those already set on the table moved from client side to server side. - Adds iterator conflict checks to CloneConfiguration.Builder.setPropertiesToSet. This check is done server side. - Adds testing to IteratorConflictsIT for CloneConfiguration.Builder.setPropertiesToSet
test/src/main/java/org/apache/accumulo/test/functional/IteratorConflictsIT.java
Outdated
Show resolved
Hide resolved
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java
Show resolved
Hide resolved
|
@kevinrr888 why did you choose to do this work in 2.1 instead of in main? Seems there is chance if introducing new bugs in scans or compactions. Also may make config that used to work stop working (that is probably a good thing overall as it can help detect existing problems, but could introduce temporary pain). I am not opposed to making this change in 2.1, but was just curious. |
@keith-turner I had already started this work in 2.1 with #5990 thinking this was a one-off issue with NewTableConfiguration. I did not anticipate follow on work requiring changes in as many areas, so continued with 2.1 learning the scope of the issue as I went. I also thought this validation would be good to have in the earliest version possible since it is essentially a bug. I would be fine refactoring this for main if we think this is too risky or undesired for 2.1. |
There are benefits and risk with this change. Maybe the best way to get the benefits and lower the risk is to make these changes only warn in 2.1 and fail in 4.0? That way things that were working in 2.1.4 and earlier do not blow up in 2.1.5, but still work and get a warning that iterator config is not correct and could lead to non-deterministic behavior. |
This is good with me, I'll change this |
Also fixed a bug where I was calling regex.matches(str) instead of str.matches(regex)
| } catch (NumberFormatException e) { | ||
| throw new AccumuloException("Bad value for existing iterator setting: " | ||
| + property.getKey() + "=" + property.getValue()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the new code is not translating this NumberFormatException, is this change in exceptions going to ripple through to the client API? When we parse the props to IteratorSetting we could maintain these exceptions in the new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this does change the exceptions the client might receive from this. They will always receive an AccumuloException before and after these changes, but they no longer receive AccumuloExceptions for two things: AccumuloExceptions if parts.length != 2 or the priority isn't a number. With my changes, I check properties against a regex (see new static vars in IteratorConfigUtil), ignoring those that don't match. So if parts.length != 2 or the priority isn't a number, it is quietly ignored as not being an iterator-related property.
I can add back AccumuloExceptions for these two cases if you think that is best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is quietly ignored as not being an iterator-related property.
We should not quietly ignore invalid properties under the iterator prefix, this could be a sign of a typo. If its a typo then the user could expect something to be working and it silently does not. Would be good to fail or warn for that. Made a comment elsewhere about the scope, seems this new code and the old code ignores invalid scopes. Not completely sure about this this though.
Co-authored-by: Keith Turner <kturner@apache.org>
|
I pushed 469a83e to have the newly added iterator conflict checks only log instead of throw in 2.1. In merging to main, all checks will throw. |
keith-turner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looking at this. Feel free to ignore some of the comments, can be follow on issues. I keep noticing existing problems when looking at this.
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java
Show resolved
Hide resolved
| EnumSet.of(IteratorScope.scan), Map.of(IteratorScope.scan, picIteratorSettings), | ||
| false); | ||
| } catch (AccumuloException e) { | ||
| throw new IllegalArgumentException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this fail the scan? If so should this warn? If we do warn does that mean this code is untestable in 2.1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this fail the scan? If so should this warn?
This method call will only log and not throw an AccumuloException (note false param). It is impossible for this to throw an AccumuloException, so I could change IllegalArgumentException to an assertion error or something to make this more clear that it can't happen.
This is not the case for all conflict check methods in IteratorConfigUtil, as some perform table ops, namespace ops, etc. so they can still throw an AccumuloException, but this specific method cannot throw an AccumuloException when false is provided. I considered writing different methods with different signatures (e.g., one that "throws AccumuloException" and one that does not), but that made things even more bloated in IteratorConfigUtil, and this approach couldn't be applied to all conflict check methods (since as I mentioned some still needed to throw an AccumuloException for other reasons).
If we do warn does that mean this code is untestable in 2.1?
I'm working on testing the warnings via checking the logs (working on changing IteratorConflictsIT), which is proving to be difficult, but I think it's possible
core/src/main/java/org/apache/accumulo/core/iteratorsImpl/IteratorConfigUtil.java
Outdated
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| public static void checkIteratorConflicts(Map<String,String> props, IteratorSetting iterToCheck, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewing this code and the exsiting code in this area, noticed there are multiple places that parse and serialize iter config. This code is adding to that existing pattern. Was wondering if we could avoid this for the new code and experimented to see what was possible. Came up w/ the following. Its very clunky, but avoids duplicating code to parse iterator config. Feel free to ignore this, was just trying to understand what we could do. Can do things in follow on issues, would like to narrow the set of changes for this rather than widen them.
public static List<IteratorSetting> propertiesToIteratorSettings(IteratorScope scope,AccumuloConfiguration config, Consumer<Map<String, Map<String, String>>> extraOptsConsumer){
Map<String, Map<String, String>> allOptions = new HashMap<>();
List<IterInfo> info = parseIterConf(scope, List.of(), allOptions, config);
List<IteratorSetting> iterSettings = new ArrayList<>(info.size());
info.forEach(iterInfo -> {
var options = allOptions.remove(iterInfo.getIterName());
var iterSetting =new IteratorSetting(iterInfo.getPriority(), iterInfo.getIterName(), iterInfo.getClassName());
if(options != null){
iterSetting.addOptions(options);
}
iterSettings.add(iterSetting);
});
if(!allOptions.isEmpty()) {
// these are itertor options w/ no iterator definition
extraOptsConsumer.accept(allOptions);
}
return iterSettings;
}
public static void checkIteratorConflicts(Map<String,String> props, IteratorSetting iterToCheck,
EnumSet<IteratorScope> iterScopesToCheck) throws AccumuloException {
// parse the props map
Map<IteratorScope,List<IteratorSetting>> existingIters =
new HashMap<>(IteratorScope.values().length);
for(var scope : iterScopesToCheck) {
var iterSettings = propertiesToIteratorSettings(scope, new ConfigurationCopy(props), extraOpts->{
// TODO this used throw an exception w/ the first extra option it saw,not its all extra so could be a much larger message which may cause problems for logging
String msg = String.format("iterator options conflict for %s : %s",
iterToCheck.getName(), extraOpts);
throw new AccumuloException(new IllegalArgumentException(msg));
});
existingIters.put(scope,iterSettings);
}
// check if the given iterator conflicts with any existing iterators
checkIteratorConflicts(iterToCheck, iterScopesToCheck, existingIters);
}This would be a follow on issue, but it would be nice to consolidate parsing/serializing in the existing code. Maybe its better address this comprehensively in a PR focused on this single problem rather than partially here for only new code, not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #6074
| Comparator.comparingInt(IterInfo::getPriority); | ||
|
|
||
| private static final String ITERATOR_PROP_REGEX = | ||
| ("^" + Property.TABLE_ITERATOR_PREFIX.getKey() + "(" + Arrays.stream(IteratorScope.values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way these regexes are used for validation they ignore properties with the iterator prefix that do not have a valid scope. Like maybe table.iterator.sacn.fooFilter would be completely ignored and would not flagged as a problem. Seems like the old validation code had this same problem, so this is not a new behavior AFAICT. This should probably be a follow on issue to look for invalid scopes following the iterator prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I could do is:
- if they don't match the regex, but they do begin with the table iterator prefix
table.iterator.I can throw an exception.
This will add new validation/new cases where an exception will be thrown (for example, previously only a couple things were validated about the iterator property like the string in the expected spot for priority had to be an int, and the property value had to be length 2 when split by ",").
If we go with (1), user code could now throw exceptions where they weren't before, but this is probably fine as their code is broken anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be done in follow on, not necessarily needed for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is covered by #6074. It suggest the centralized parsing code validate this.
| } catch (NumberFormatException e) { | ||
| throw new AccumuloException("Bad value for existing iterator setting: " | ||
| + property.getKey() + "=" + property.getValue()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is quietly ignored as not being an iterator-related property.
We should not quietly ignore invalid properties under the iterator prefix, this could be a sign of a typo. If its a typo then the user could expect something to be working and it silently does not. Would be good to fail or warn for that. Made a comment elsewhere about the scope, seems this new code and the old code ignores invalid scopes. Not completely sure about this this though.
server/manager/src/main/java/org/apache/accumulo/manager/FateServiceHandler.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/TableOperationsHelper.java
Show resolved
Hide resolved
…o logging warning
- Avoid two look up for some comparisons - Fix some bugs with the iterator conflict checks done on clone table Co-authored-by: Keith Turner <kturner@apache.org>
|
@keith-turner - I believe I addressed and/or responded to all non-follow on issues for this |
This adds checks when adding an iterator that the given iterator does not conflict with any existing iterators. Conflict meaning same name or same priority. Iterators can be added several ways, and previously only TableOperations.attachIterator and NamespaceOperations.attachIterator would check for conflicts. This adds iterator conflict checks to:
This also accounts for the several ways in which conflicts can arise:
This commit also adds a new IteratorConflictsIT to test all of the above.
Part of #6030