
Kubernetes Operator for Nessie #7967

Open

adutra wants to merge 1 commit into projectnessie:main from adutra:kubernetes-operator

Conversation


@adutra (Contributor) commented Jan 19, 2024

No description provided.

adutra force-pushed the kubernetes-operator branch from c528161 to d6d439d on January 19, 2024 14:24
adutra force-pushed the kubernetes-operator branch 10 times, most recently from bd00481 to 02e994a, on January 25, 2024 20:21
adutra marked this pull request as ready for review on January 25, 2024 20:24
adutra changed the title from "WIP: Kubernetes Operator for Nessie" to "Kubernetes Operator for Nessie" on Jan 25, 2024

@snazy (Member) left a comment:


Not really reviewing yet - just adding comments "randomly".

adutra force-pushed the kubernetes-operator branch 6 times, most recently from de7990f to 68051e7, on January 30, 2024 17:45
adutra force-pushed the kubernetes-operator branch 5 times, most recently from 4a2d6f7 to 544c574, on February 12, 2024 16:56
adutra force-pushed the kubernetes-operator branch from c39a06f to 3073462 on March 22, 2024 09:05
adutra force-pushed the kubernetes-operator branch 2 times, most recently from 7fea756 to 30bfae6, on April 3, 2024 12:40
adutra force-pushed the kubernetes-operator branch 8 times, most recently from 422f674 to 2a7f761, on April 22, 2024 12:14
adutra force-pushed the kubernetes-operator branch from 2a7f761 to 27df175 on April 30, 2024 14:40
adutra force-pushed the kubernetes-operator branch 3 times, most recently from 348e44c to 91283d0, on May 6, 2024 17:25

@snazy (Member) left a comment:


Just another batch of review comments.

filter(ReplaceTokens::class, mapOf("tokens" to mapOf("projectVersion" to project.version)))
}

tasks.named("quarkusAppPartsBuild").configure {
Comment (Member):

Do you know where these messages during the build come from?

Invalid AnsiLogger Stream -> Swapping to default sdt out logger.
[WARN] Could not detect project version. Using 'latest'.


import org.projectnessie.operator.reconciler.nessie.resource.options.AutoscalingOptions;

@KubernetesDependent(labelSelector = NessieReconciler.DEPENDENT_RESOURCES_SELECTOR)
public class HorizontalPodAutoscalerV2Beta1Dependent
Comment (Member):

All three HorizontalPodAutoscaler* classes look quite similar, almost identical. Any chance to unify the common/similar code?

Reply (Contributor Author):

Difficult, because although the classes look similar, they share no common code in fabric8's model. However, we can ask ourselves whether it's still useful to support the beta variants; they have all been deprecated for a while now.
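As an illustration of one possible unification (a sketch only: the class, record, and method names below are hypothetical and deliberately simplified, not fabric8's actual model), the fields shared by all three API versions could live in a single value object, so each version-specific dependent shrinks to a thin mapper:

```java
import java.util.Map;

/**
 * Hypothetical sketch: keep the fields common to all three
 * HorizontalPodAutoscaler API versions in one value object, and let each
 * version-specific dependent class become a thin mapper over it.
 */
public class HpaSketch {

  /** Fields common to autoscaling/v2, v2beta1 and v2beta2. */
  record CommonHpaSpec(String targetName, int minReplicas, int maxReplicas, int targetCpuPercent) {}

  /** Mapper for the stable autoscaling/v2 variant. */
  static Map<String, Object> toV2(CommonHpaSpec s) {
    return Map.of(
        "apiVersion", "autoscaling/v2",
        "scaleTargetRef", s.targetName(),
        "minReplicas", s.minReplicas(),
        "maxReplicas", s.maxReplicas());
  }

  /** Mapper for the deprecated v2beta1 variant; in this sketch only the apiVersion differs. */
  static Map<String, Object> toV2Beta1(CommonHpaSpec s) {
    return Map.of(
        "apiVersion", "autoscaling/v2beta1",
        "scaleTargetRef", s.targetName(),
        "minReplicas", s.minReplicas(),
        "maxReplicas", s.maxReplicas());
  }

  public static void main(String[] args) {
    CommonHpaSpec spec = new CommonHpaSpec("nessie", 1, 3, 80);
    System.out.println(toV2(spec).get("apiVersion"));      // autoscaling/v2
    System.out.println(toV2Beta1(spec).get("apiVersion")); // autoscaling/v2beta1
  }
}
```

In the real code the mappers would produce fabric8 builder objects rather than maps, but the shape of the refactoring is the same.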

Comment (Member):

🤷

@@ -0,0 +1,149 @@
/*
Comment (Member):

./gradlew :nessie-operator:test prints a lot to the console.

Some are warnings:

Mismatched Quarkus version found: "3.10.0", expected: "3.8.3" by at least a minor version and things might not work as expected.
Mismatched Quarkus-provided Fabric8 Kubernetes Client version found: "6.11.0", expected: "6.10.0" by at least a minor version and things might not work as expected.
Build time property cannot be changed at runtime:
 - quarkus.tls.trust-all is set to 'true' but it is build time fixed to 'false'. Did you change the property quarkus.tls.trust-all after building the application?

Any chance to not let the exception be printed to the console?

Comment (Member):

Got this during the bigtable IT:

Unable to mount a file from test host into a running container. This may be a misconfiguration or limitation of your Docker environment. Some features might not work.

and then this output

Comment (Member):

Anything that needs to be configured for podman?

Reply (Contributor Author):

> Any chance to not let the exception be printed to the console?

The first line comes from here:

https://github.com/quarkiverse/quarkus-operator-sdk/blob/b9f9ccb9a88c31947e169a2ca5d163e5921cdbf8/core/deployment/src/main/java/io/quarkiverse/operatorsdk/deployment/VersionAlignmentCheckingStep.java#L49

The "build time property cannot be changed" message probably comes from some dev service, but I couldn't find which one. See also quarkusio/quarkus#23680.

I don't know how to suppress these messages since they are logged during build, not during tests.
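One avenue that might be worth trying (an untested assumption; build-time log output may not honor it, and the category name is inferred from the class that emits the warning) is lowering the log level for the offending category in application.properties:

```properties
# Sketch: silence WARN output from the operator-sdk extension's
# version-alignment check; category name may need adjusting.
quarkus.log.category."io.quarkiverse.operatorsdk".level=ERROR
```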

> Got this during the bigtable IT:

Is this happening all the time, or randomly? I've never seen this. I will have to install podman to investigate. It's not related to BigTableIT, but rather to the start of the K3s container before the tests.

@rsvihladremio left a comment:

I went through the CRDs, reviewed some of the code, and this is a good start for an operator. The operator framework is a decent fit and cuts down on the amount of logic to write and lifecycle to manage.

Note, I have not used it in anger, so I don't have any comments on the usability aspect beyond being able to say the CRDs are well formed and logical; in other words, I know what I'm configuring just by reading them.

Also, I think you may find that you will ultimately get a request for an operator UI, such as what you see here: https://github.com/zalando/postgres-operator/blob/master/docs/operator-ui.md

Final comment: generally speaking, it does seem Helm charts still remain very popular, so you will have to maintain both this and a Helm chart. If resources are short and you have to pick one, I would pick the Helm chart, but you will have users that prefer the operator, or even demand it.

@dorsegal commented:

> I went through the CRDs, reviewed some of the code, and this is a good start for an operator. The operator framework is a decent fit and cuts down on the amount of logic to write and lifecycle to manage.
>
> Note, I have not used it in anger, so I don't have any comments on the usability aspect beyond being able to say the CRDs are well formed and logical; in other words, I know what I'm configuring just by reading them.
>
> Also, I think you may find that you will ultimately get a request for an operator UI, such as what you see here: https://github.com/zalando/postgres-operator/blob/master/docs/operator-ui.md
>
> Final comment: generally speaking, it does seem Helm charts still remain very popular, so you will have to maintain both this and a Helm chart. If resources are short and you have to pick one, I would pick the Helm chart, but you will have users that prefer the operator, or even demand it.

Note about the Helm chart: I feel we (DevOps) want to use operators when we can, but want to install them using regular Helm charts. A lot of operators support installation + configuration via a Helm chart.


adutra commented Jul 19, 2024

@dorsegal our plan is to distribute the operator using Helm charts initially. Operator Hub would be the ultimate goal, but that requires a lot more work.
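For illustration only, a chart consumed that way might expose values along these lines; the chart layout, image name, and keys below are all hypothetical, since nothing is published by this PR:

```yaml
# Hypothetical values.yaml sketch for a future nessie-operator Helm chart.
image:
  repository: ghcr.io/projectnessie/nessie-operator  # illustrative image name
  tag: latest
# Namespaces the operator watches; an empty list would mean all namespaces.
watchNamespaces: []
```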

intTestImplementation(project(":nessie-keycloak-testcontainer"))
intTestImplementation(project(":nessie-container-spec-helper"))

intTestCompileOnly(libs.microprofile.openapi)
Comment (Member):

Suggested change:
-intTestCompileOnly(libs.microprofile.openapi)
+intTestCompileOnly(libs.microprofile.openapi)
+intTestCompileOnly(libs.immutables.value.annotations)

# Version is managed by Renovate - do not edit.
# See https://cloud.google.com/sdk/docs/downloads-docker#docker_image_options
# Use debian_component_based because it supports linux/arm
FROM gcr.io/google.com/cloudsdktool/google-cloud-cli:483.0.0-debian_component_based
Comment (Member):

Suggested change:
-FROM gcr.io/google.com/cloudsdktool/google-cloud-cli:483.0.0-debian_component_based
+FROM gcr.io/google.com/cloudsdktool/google-cloud-cli:484.0.0-debian_component_based

@@ -0,0 +1,3 @@
# Dockerfile to provide the image name and tag to a test.
# Version is managed by Renovate - do not edit.
FROM rancher/k3s:v1.30.2-k3s2
Comment (Member):

Suggested change:
-FROM rancher/k3s:v1.30.2-k3s2
+FROM docker.io/rancher/k3s:v1.30.2-k3s2

@@ -0,0 +1,3 @@
# Dockerfile to provide the image name and tag to a test.
# Version is managed by Renovate - do not edit.
FROM mongo:7.0.12
Comment (Member):

Suggested change:
-FROM mongo:7.0.12
+FROM docker.io/mongo:7.0.12

@@ -0,0 +1,3 @@
# Dockerfile to provide the image name and tag to a test.
# Version is managed by Renovate - do not edit.
FROM postgres:16.3
Comment (Member):

Suggested change:
-FROM postgres:16.3
+FROM docker.io/postgres:16.3

job-name: 'int-test-quarkus'
java-version: ${{ matrix.java-version }}

int-test-operator:
Comment (Member):

Just a heads-up - when it's merged.


container = createContainer();
container
.withNetwork(Network.SHARED)
.withLogConsumer(new Slf4jLogConsumer(logger))
Comment (Member):

Suggested change:
-.withLogConsumer(new Slf4jLogConsumer(logger))
+.withLogConsumer(new Slf4jLogConsumer(logger).withPrefix(container.getDockerImageName()))

implements QuarkusTestResourceLifecycleManager {

protected C container;
protected String inDockerIpAddress;
Comment (Member):

Suggested change:
-protected String inDockerIpAddress;
+private String inDockerIpAddress;
+
+protected String inDockerIpAddress() {
+  if (inDockerIpAddress == null) {
+    inDockerIpAddress = getInDockerIpAddress();
+  }
+  return inDockerIpAddress;
+}

Comment on lines +48 to +50
inDockerIpAddress =
Objects.requireNonNull(
getInDockerIpAddress(), "could not determine container's in-docker IP address");
Comment (Member):

Suggested change (delete these lines):
-inDockerIpAddress =
-    Objects.requireNonNull(
-        getInDockerIpAddress(), "could not determine container's in-docker IP address");


@Override
public Map<String, String> start() {
Logger logger = LoggerFactory.getLogger(getClass());
Comment (Member):

Suggested change:
-Logger logger = LoggerFactory.getLogger(getClass());
+inDockerIpAddress = null;
+Logger logger = LoggerFactory.getLogger(getClass());


snazy commented Jul 22, 2024

Still can't get the integration tests to work. K3s doesn't work properly in rootless. There seem to be some ways to get that working - but currently hitting this error from k3s.

expected sysctl value \"net.ipv4.ip_forward\" to be \"1\", got \"0\"; try adding \"net.ipv4.ip_forward=1\" to /etc/sysctl.conf and running `sudo sysctl --system`
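The error message itself names the remedy; on the container host (not inside the container), that amounts to a sysctl fragment along these lines (the file path is illustrative):

```
# /etc/sysctl.d/99-k3s.conf - enable IPv4 forwarding, as the k3s error suggests
net.ipv4.ip_forward=1
# then apply with: sudo sysctl --system
```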


adutra commented Jul 22, 2024

> Still can't get the integration tests to work. K3s doesn't work properly in rootless.

Yeah I'm not sure running a Kubernetes cluster inside a rootless container is even possible.

// Mitigate eviction issues in CI by setting eviction thresholds for nodefs very low
commandParts.add("--kubelet-arg=eviction-hard=nodefs.available<1%,nodefs.inodesFree<1%");
// Enable rootless containers
commandParts.add("--kubelet-arg=feature-gates=KubeletInUserNamespace=true");
Comment (Member):

Suggested change:
-commandParts.add("--kubelet-arg=feature-gates=KubeletInUserNamespace=true");
+commandParts.add("--kubelet-arg=feature-gates=KubeletInUserNamespace=true");
+commandParts.add("--rootless");
+commandParts.add("--snapshotter=fuse-overlayfs");

Reply (Contributor Author):

K3s refuses to start for me:

2024-07-22 10:21:14 time="2024-07-22T08:21:14Z" level=warning msg="Running RootlessKit as the root user is unsupported."
2024-07-22 10:21:14 time="2024-07-22T08:21:14Z" level=warning msg="The host root filesystem is mounted as \"master:34\". Setting child propagation to \"\" is not supported."
2024-07-22 10:21:14 time="2024-07-22T08:21:14Z" level=fatal msg="failed to setup UID/GID map: failed to compute uid/gid map: open /etc/subuid: no such file or directory"


dorsegal commented Sep 4, 2024

Do you plan to have GC implemented as part of the operator? I think it could be useful to have a CRD for it.


adutra commented Sep 4, 2024

> Do you plan to have GC implemented as part of the operator? I think it could be useful to have a CRD for it.

See here: #9415

So it's planned yes, but no ETA at the moment due to lack of interest from the community.
