This repository was archived by the owner on Aug 26, 2022. It is now read-only.

Commit c52a774

sebrandon1, edcdavid, greyerof, hamadise, and shimritproj authored
Upgrade 3.3.x with latest from main (#672)
* Fix timeout (#663)

* Bug fix for the NodeSelector handler. (#664)

+ Some traces fixed in the networking ts.

* Diagnostic mode fix: close debug oc sessions before removing node lab… (#666)

* Diagnostic mode fix: close debug oc sessions before removing node labels.

This commit tries to fix the crash that happens when running the container in diagnostic mode: race condition when the program finishes, as the oc debug session were not closed before the labels are being removed from the nodes.

* Revert "Fix timeout (#663)"

This reverts commit 8710512.

Co-authored-by: Salaheddine Hamadi <[email protected]>

* Timeout for debug pods increased to 5mins. (#669)

+ Retry interval also increased to 5 secs.

* Container testing log level set to trace. (#665)

Co-authored-by: Brandon Palm <[email protected]>

* lifecycle's terminationGracePeriod TC removed. (#667)

Due to the issues on both the original and current implementation, the terminationGracePeriod TC will be removed. A new and improved version of this TC should be done, checking that after the configured (or defaulted) value, the pods were removed without any error code.

Co-authored-by: Brandon Palm <[email protected]>

* Solution for issue on Nokia (#671)

* Solution for issue on Nokia

* Fixed the error

Co-authored-by: Your Name <[email protected]>

* Update Go to 1.17.9 (#668)

* Fix missed version changes (#670)

Co-authored-by: edcdavid <[email protected]>
Co-authored-by: Gonzalo Reyero Ferreras <[email protected]>
Co-authored-by: Salaheddine Hamadi <[email protected]>
Co-authored-by: Shimrit Peretz <[email protected]>
Co-authored-by: Your Name <[email protected]>
1 parent d2f531a commit c52a774

14 files changed: +51 -191 lines changed

.github/workflows/merge.yaml

+1 -1
@@ -15,7 +15,7 @@ jobs:
       - name: Set up Go 1.17
         uses: actions/setup-go@v2
         with:
-          go-version: 1.17.8
+          go-version: 1.17.9

       - name: Check out code into the Go module directory
         uses: actions/checkout@v2

.github/workflows/pre-main.yaml

+6 -5
@@ -18,6 +18,7 @@ env:
   TNF_OUTPUT_DIR: /tmp/tnf/output
   TNF_SRC_URL: 'https://github.com/${{ github.repository }}'
   TESTING_CMD_PARAMS: '-n host -i ${REGISTRY_LOCAL}/${IMAGE_NAME}:${IMAGE_TAG} -t ${TNF_CONFIG_DIR} -o ${TNF_OUTPUT_DIR}'
+  CONTAINER_DIAGNOSTIC_LOG_LEVEL: trace
   TNF_PARTNER_DIR: '/usr/tnf-partner'
   TNF_PARTNER_SRC_DIR: '${TNF_PARTNER_DIR}/src'
   TERM: xterm-color
@@ -31,7 +32,7 @@ jobs:
       - name: Set up Go 1.17
         uses: actions/setup-go@v2
         with:
-          go-version: 1.17.8
+          go-version: 1.17.9

       - name: Disable default go problem matcher
         run: echo "::remove-matcher owner=go::"
@@ -82,7 +83,7 @@ jobs:
       - name: Set up Go 1.17
         uses: actions/setup-go@v2
         with:
-          go-version: 1.17.8
+          go-version: 1.17.9

       - name: Disable default go problem matcher
         run: echo "::remove-matcher owner=go::"
@@ -127,7 +128,7 @@ jobs:
       - name: Set up Go 1.17
         uses: actions/setup-go@v2
         with:
-          go-version: 1.17.8
+          go-version: 1.17.9

       - name: Disable default go problem matcher
         run: echo "::remove-matcher owner=go::"
@@ -141,7 +142,7 @@ jobs:
         run: go install github.com/golang/mock/[email protected] && make mocks

       - name: Install ginkgo
-        run: go install github.com/onsi/ginkgo/ginkgo@v1.16.5
+        run: go install github.com/onsi/ginkgo/v2/ginkgo@v2.1.3

       - name: Execute `make build`
         run: make build
@@ -199,7 +200,7 @@ jobs:
         shell: bash

       - name: 'Test: Run without any TS, just get diagnostic information'
-        run: ./run-tnf-container.sh ${{ env.TESTING_CMD_PARAMS }}
+        run: LOG_LEVEL=${CONTAINER_DIAGNOSTIC_LOG_LEVEL} ./run-tnf-container.sh ${{ env.TESTING_CMD_PARAMS }}

       - name: 'Test: Run generic test suite in a TNF container'
         run: ./run-tnf-container.sh ${{ env.TESTING_CMD_PARAMS }} -f access-control lifecycle platform observability networking affiliated-certification operator

CATALOG.md

-12
@@ -221,18 +221,6 @@ Description|http://test-network-function.com/testcases/lifecycle/pod-scheduling
 Result Type|informative
 Suggested Remediation|In most cases, Pod's should not specify their host Nodes through nodeSelector or nodeAffinity. However, there are cases in which CNFs require specialized hardware specific to a particular class of Node. As such, this test is purely informative, and will not prevent a CNF from being certified. However, one should have an appropriate justification as to why nodeSelector and/or nodeAffinity is utilized by a CNF.
 Best Practice Reference|[CNF Best Practice V1.2](https://connect.redhat.com/sites/default/files/2021-03/Cloud%20Native%20Network%20Function%20Requirements.pdf) Section 6.2
-#### pod-termination-grace-period
-
-Property|Description
----|---
-Test Case Name|pod-termination-grace-period
-Test Case Label|lifecycle-pod-termination-grace-period
-Unique ID|http://test-network-function.com/testcases/lifecycle/pod-termination-grace-period
-Version|v1.0.0
-Description|http://test-network-function.com/testcases/lifecycle/pod-termination-grace-period tests whether the terminationGracePeriod is CNF-specific, or if the default (30s) is utilized. This test is informative, and will not affect CNF Certification. In many cases, the default terminationGracePeriod is perfectly acceptable for a CNF.
-Result Type|informative
-Suggested Remediation|Choose a terminationGracePeriod that is appropriate for your given CNF. If the default (30s) is appropriate, then feel free to ignore this informative message. This test is meant to raise awareness around how Pods are terminated, and to suggest that a CNF is configured based on its requirements. In addition to a terminationGracePeriod, consider utilizing a termination hook in the case that your application requires special shutdown instructions.
-Best Practice Reference|[CNF Best Practice V1.2](https://connect.redhat.com/sites/default/files/2021-03/Cloud%20Native%20Network%20Function%20Requirements.pdf) Section 6.2
 #### readiness

 Property|Description

Dockerfile

+3 -3
@@ -13,12 +13,12 @@ ENV TEMP_DIR=/tmp

 # Install dependencies
 RUN yum install -y gcc git jq make wget
-RUN wget https://get.helm.sh/helm-v3.8.1-linux-amd64.tar.gz && \
-    tar -xvf helm-v3.8.1-linux-amd64.tar.gz && \
+RUN wget https://get.helm.sh/helm-v3.8.2-linux-amd64.tar.gz && \
+    tar -xvf helm-v3.8.2-linux-amd64.tar.gz && \
     cp linux-amd64/helm /usr/bin/helm
 # Install Go binary
 ENV GO_DL_URL="https://golang.org/dl"
-ENV GO_BIN_TAR="go1.17.8.linux-amd64.tar.gz"
+ENV GO_BIN_TAR="go1.17.9.linux-amd64.tar.gz"
 ENV GO_BIN_URL_x86_64=${GO_DL_URL}/${GO_BIN_TAR}
 ENV GOPATH="/root/go"
 RUN if [[ "$(uname -m)" -eq "x86_64" ]] ; then \

Makefile

+2 -2
@@ -128,8 +128,8 @@ install-tools:
 	go install github.com/onsi/ginkgo/v2/[email protected]
 	go install github.com/onsi/gomega
 	go install github.com/golang/mock/[email protected]
-	wget https://get.helm.sh/helm-v3.8.1-linux-amd64.tar.gz && \
-	tar -xvf helm-v3.8.1-linux-amd64.tar.gz && \
+	wget https://get.helm.sh/helm-v3.8.2-linux-amd64.tar.gz && \
+	tar -xvf helm-v3.8.2-linux-amd64.tar.gz && \
 	cp linux-amd64/helm /usr/local/bin/helm

 # Install golangci-lint

pkg/config/autodiscover/autodiscover_debug.go

+11 -9
@@ -31,14 +31,16 @@ import (
 )

 const (
-	defaultNamespace   = "default"
-	debugDaemonSet     = "debug"
-	debugLabelName     = "test-network-function.com/app"
-	debugLabelValue    = "debug"
-	nodeLabelName      = "test-network-function.com/node"
-	nodeLabelValue     = "target"
-	addlabelCommand    = "oc label node %s %s=%s --overwrite=true"
-	deletelabelCommand = "oc label node %s %s- --overwrite=true"
+	defaultNamespace    = "default"
+	debugDaemonSet      = "debug"
+	debugLabelName      = "test-network-function.com/app"
+	debugLabelValue     = "debug"
+	nodeLabelName       = "test-network-function.com/node"
+	nodeLabelValue      = "target"
+	addlabelCommand     = "oc label node %s %s=%s --overwrite=true"
+	deletelabelCommand  = "oc label node %s %s- --overwrite=true"
+	dsTimeoutMins       = 5
+	dsRetryIntervalSecs = 5
 )

 // FindDebugPods completes a `configsections.TestPartner.ContainersDebugList` from the current state of the cluster,
@@ -81,7 +83,7 @@ func CheckDebugDaemonset(expectedDebugPods int) {
 	gomega.Eventually(func() bool {
 		log.Debug("check debug daemonset status")
 		return checkDebugPodsReadiness(expectedDebugPods)
-	}, 60*time.Second, 2*time.Second).Should(gomega.Equal(true)) //nolint: gomnd
+	}, dsTimeoutMins*time.Minute, dsRetryIntervalSecs*time.Second).Should(gomega.Equal(true))
 }

 // checkDebugPodsReadiness helper function that returns true if the daemonset debug is deployed properly

pkg/tnf/handlers/nodeselector/nodeselector.go

+1 -1
@@ -40,7 +40,7 @@ func NewNodeSelector(timeout time.Duration, podName, podNamespace string) *NodeS
 	return &NodeSelector{
 		timeout: timeout,
 		result:  tnf.ERROR,
-		args:    []string{"oc", "-n", podNamespace, "get", "pods", podName, "-o", "custom-columns=nodeselector:.spec.nodeSelector,nodeaffinity:.spec.nodeAffinity"},
+		args:    []string{"oc", "-n", podNamespace, "get", "pods", podName, "-o", "custom-columns=nodeselector:.spec.nodeSelector,nodeaffinity:.spec.affinity.nodeAffinity"},
 	}
 }
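
The one-line fix above changes the column path from `.spec.nodeAffinity` to `.spec.affinity.nodeAffinity`. In the Kubernetes Pod API, `nodeAffinity` is nested under `spec.affinity`, so the old path did not point at any field of the Pod spec. The sketch below illustrates the difference by decoding two small JSON manifests with the standard library. The `podManifest` type and the `nodeAssignment` helper are hypothetical and model only the fields relevant here; the real handler shells out to `oc` rather than parsing JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podManifest (hypothetical) models only the parts of a Pod this example needs.
type podManifest struct {
	Spec struct {
		NodeSelector map[string]string `json:"nodeSelector"`
		Affinity     struct {
			NodeAffinity map[string]interface{} `json:"nodeAffinity"`
		} `json:"affinity"`
	} `json:"spec"`
}

// nodeAssignment reports whether the manifest sets a non-empty nodeSelector
// and whether it sets a non-empty spec.affinity.nodeAffinity.
func nodeAssignment(manifest []byte) (selector, affinity bool, err error) {
	var p podManifest
	if err = json.Unmarshal(manifest, &p); err != nil {
		return false, false, err
	}
	return len(p.Spec.NodeSelector) > 0, len(p.Spec.Affinity.NodeAffinity) > 0, nil
}

func main() {
	// nodeAffinity at the path used after the fix.
	correct := []byte(`{"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{}}}}}`)
	// nodeAffinity directly under spec, the path the old command looked at.
	misplaced := []byte(`{"spec":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{}}}}`)

	for _, m := range [][]byte{correct, misplaced} {
		sel, aff, err := nodeAssignment(m)
		fmt.Println(sel, aff, err)
	}
}
```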

test-network-function/common/suite.go

+10 -6
@@ -31,12 +31,7 @@ func RemoveLabelsFromAllNodes() {
 	}
 }

-var _ = ginkgo.BeforeSuite(func() {
-})
-
-var _ = ginkgo.AfterSuite(func() {
-	// clean up added label to nodes
-	log.Info("Clean up added labels to nodes")
+func RemoveDebugPods() {
 	env = configpkg.GetTestEnvironment()
 	env.LoadAndRefresh()
 	for name, node := range env.NodesUnderTest {
@@ -46,4 +41,13 @@ var _ = ginkgo.AfterSuite(func() {
 		node.DebugContainer.CloseOc()
 		autodiscover.DeleteDebugLabel(name)
 	}
+}
+
+var _ = ginkgo.BeforeSuite(func() {
+})
+
+var _ = ginkgo.AfterSuite(func() {
+	// clean up added label to nodes
+	log.Info("Clean up added labels to nodes")
+	RemoveDebugPods()
 })
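
The hunk above moves the cleanup loop out of the `AfterSuite` hook into an exported `RemoveDebugPods` function, which makes it callable from code paths other than the Ginkgo hook. The commit message attributes the diagnostic-mode crash to debug sessions still being open when the node labels were removed. The following sketch shows the ordering the loop relies on: close the session first, then remove the label. All names in it (`debugNode`, `closeSession`, `removeDebugPods`, `removeLabel`) are hypothetical stand-ins, not the repository's types.

```go
package main

import "fmt"

// debugNode is a hypothetical stand-in for a node under test.
type debugNode struct {
	name         string
	closeSession func() // hypothetical: closes the node's oc debug session, if any
}

// removeDebugPods closes each node's debug session before asking for its
// label to be removed, so no session outlives the label it depends on.
func removeDebugPods(nodes []debugNode, removeLabel func(node string)) {
	for _, n := range nodes {
		if n.closeSession != nil {
			n.closeSession()
		}
		removeLabel(n.name)
	}
}

func main() {
	var events []string
	nodes := []debugNode{
		{name: "worker-0", closeSession: func() { events = append(events, "close worker-0") }},
		{name: "worker-1", closeSession: func() { events = append(events, "close worker-1") }},
	}
	removeDebugPods(nodes, func(node string) { events = append(events, "unlabel "+node) })
	fmt.Println(events)
}
```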

test-network-function/identifiers/identifiers.go

-19
@@ -101,11 +101,6 @@ var (
 		Url:     formTestURL(common.AccessControlTestKey, "namespace"),
 		Version: versionOne,
 	}
-	// TestNonDefaultGracePeriodIdentifier tests best grace period practices.
-	TestNonDefaultGracePeriodIdentifier = claim.Identifier{
-		Url:     formTestURL(common.LifecycleTestKey, "pod-termination-grace-period"),
-		Version: versionOne,
-	}
 	// TestNonTaintedNodeKernelsIdentifier is the identifier for the test checking tainted nodes.
 	TestNonTaintedNodeKernelsIdentifier = claim.Identifier{
 		Url:     formTestURL(common.PlatformAlterationTestKey, "tainted-node-kernel"),
@@ -402,20 +397,6 @@ tag. (2) It doesn't have any of the following prefixes: default, openshift-, ist
 		BestPracticeReference: bestPracticeDocV1dot2URL + " Section 6.2, 16.3.8 & 16.3.9",
 	},

-	TestNonDefaultGracePeriodIdentifier: {
-		Identifier: TestNonDefaultGracePeriodIdentifier,
-		Type:       informativeResult,
-		Remediation: `Choose a terminationGracePeriod that is appropriate for your given CNF. If the default (30s) is appropriate, then feel
-free to ignore this informative message. This test is meant to raise awareness around how Pods are terminated, and to
-suggest that a CNF is configured based on its requirements. In addition to a terminationGracePeriod, consider utilizing
-a termination hook in the case that your application requires special shutdown instructions.`,
-		Description: formDescription(TestNonDefaultGracePeriodIdentifier,
-			`tests whether the terminationGracePeriod is CNF-specific, or if the default (30s) is utilized. This test is
-informative, and will not affect CNF Certification. In many cases, the default terminationGracePeriod is perfectly
-acceptable for a CNF.`),
-		BestPracticeReference: bestPracticeDocV1dot2URL + " Section 6.2",
-	},
-
 	TestNonTaintedNodeKernelsIdentifier: {
 		Identifier: TestNonTaintedNodeKernelsIdentifier,
 		Type:       normativeResult,

test-network-function/lifecycle/suite.go

+5 -127
@@ -17,7 +17,6 @@
 package lifecycle

 import (
-	"encoding/json"
 	"fmt"
 	"path"
 	"strings"
@@ -44,12 +43,11 @@ import (
 )

 const (
-	defaultTerminationGracePeriod = 30
-	baseNodeDrainTimeout          = 5 * time.Minute
-	maxNodeDrainTimeout           = 30 * time.Minute
-	scalingTimeout                = 1 * time.Minute
-	scalingPollingPeriod          = 1 * time.Second
-	postNodeDrainRecoveryTimeOut  = 2 * time.Minute
+	baseNodeDrainTimeout         = 5 * time.Minute
+	maxNodeDrainTimeout          = 30 * time.Minute
+	scalingTimeout               = 1 * time.Minute
+	scalingPollingPeriod         = 1 * time.Second
+	postNodeDrainRecoveryTimeOut = 2 * time.Minute
 )

 var (
@@ -127,8 +125,6 @@ var _ = ginkgo.Describe(common.LifecycleTestKey, func() {

 		testNodeSelector(env)

-		testGracePeriod(env)
-
 		testShutdown(env)

 		testLiveness(env)
@@ -332,124 +328,6 @@ func testNodeSelector(env *config.TestEnvironment) {
 	})
 }

-func testTerminationGracePeriodOnPodSet(podsetsUnderTests []configsections.PodSet, context *interactive.Context) []configsections.PodSet {
-	const ocCommandTemplate = "oc get %s %s -n %s -o jsonpath={.metadata.annotations\\.\"kubectl\\.kubernetes\\.io/last-applied-configuration\"}"
-
-	type lastAppliedConfigType struct {
-		Spec struct {
-			Template struct {
-				Spec struct {
-					TerminationGracePeriodSeconds int
-				}
-			}
-		}
-	}
-
-	badPodsets := []configsections.PodSet{}
-	for _, podset := range podsetsUnderTests {
-		ocCommand := fmt.Sprintf(ocCommandTemplate, podset.Type, podset.Name, podset.Namespace)
-		lastAppliedConfigString, err := utils.ExecuteCommand(ocCommand, common.DefaultTimeout, context)
-		if err != nil {
-			ginkgo.Fail(fmt.Sprintf("%s %s (ns %s): failed to get last-applied-configuration field", podset.Type, podset.Name, podset.Namespace))
-		}
-		lastAppliedConfig := lastAppliedConfigType{}
-
-		// Use -1 as default value, in case the param was not set.
-		lastAppliedConfig.Spec.Template.Spec.TerminationGracePeriodSeconds = -1
-
-		err = json.Unmarshal([]byte(lastAppliedConfigString), &lastAppliedConfig)
-		if err != nil {
-			ginkgo.Fail(fmt.Sprintf("%s %s (ns %s): failed to unmarshall last-applied-configuration string (%s)", podset.Type, podset.Name, podset.Namespace, lastAppliedConfigString))
-		}
-
-		if lastAppliedConfig.Spec.Template.Spec.TerminationGracePeriodSeconds == -1 {
-			tnf.ClaimFilePrintf("%s %s (ns %s) template's spec does not have a terminationGracePeriodSeconds value set. Default value (%d) will be used.",
-				podset.Type, podset.Name, podset.Namespace, defaultTerminationGracePeriod)
-			badPodsets = append(badPodsets, podset)
-		} else {
-			log.Infof("%s %s (ns %s) last-applied-configuration's terminationGracePeriodSeconds: %d", podset.Type, podset.Name, podset.Namespace, lastAppliedConfig.Spec.Template.Spec.TerminationGracePeriodSeconds)
-		}
-	}
-
-	return badPodsets
-}
-
-func testTerminationGracePeriodOnPods(pods []*configsections.Pod, context *interactive.Context) []configsections.Pod {
-	const ocCommandTemplate = "oc get pod %s -n %s -o jsonpath={.metadata.annotations\\.\"kubectl\\.kubernetes\\.io/last-applied-configuration\"}"
-
-	type lastAppliedConfigType struct {
-		Spec struct {
-			TerminationGracePeriodSeconds int
-		}
-	}
-
-	badPods := []configsections.Pod{}
-	numUnmanagedPods := 0
-	for _, pod := range pods {
-		// We'll process only "unmanaged" pods (not belonging to any deployment/statefulset) here.
-		if pod.IsManaged {
-			continue
-		}
-
-		numUnmanagedPods++
-
-		ocCommand := fmt.Sprintf(ocCommandTemplate, pod.Name, pod.Namespace)
-		lastAppliedConfigString, err := utils.ExecuteCommand(ocCommand, common.DefaultTimeout, context)
-		if err != nil {
-			ginkgo.Fail(fmt.Sprintf("Pod %s (ns %s): failed to get last-applied-configuration field", pod.Name, pod.Namespace))
-		}
-		lastAppliedConfig := lastAppliedConfigType{}
-
-		// Use -1 as default value, in case the param was not set.
-		lastAppliedConfig.Spec.TerminationGracePeriodSeconds = -1
-
-		err = json.Unmarshal([]byte(lastAppliedConfigString), &lastAppliedConfig)
-		if err != nil {
-			ginkgo.Fail(fmt.Sprintf("Pod %s (ns %s): failed to unmarshall last-applied-configuration string (%s)", pod.Name, pod.Namespace, lastAppliedConfigString))
-		}
-
-		if lastAppliedConfig.Spec.TerminationGracePeriodSeconds == -1 {
-			tnf.ClaimFilePrintf("Pod %s (ns %s) spec does not have a terminationGracePeriodSeconds value set. Default value (%d) will be used.",
-				pod.Name, pod.Namespace, defaultTerminationGracePeriod)
-			badPods = append(badPods, *pod)
-		} else {
-			log.Infof("Pod %s (ns %s) last-applied-configuration's terminationGracePeriodSeconds: %d", pod.Name, pod.Namespace, lastAppliedConfig.Spec.TerminationGracePeriodSeconds)
-		}
-
-		log.Debugf("Number of unamanaged pods processed: %d", numUnmanagedPods)
-	}
-	return badPods
-}
-
-func testGracePeriod(env *config.TestEnvironment) {
-	testID := identifiers.XformToGinkgoItIdentifier(identifiers.TestNonDefaultGracePeriodIdentifier)
-	ginkgo.It(testID, ginkgo.Label(testID), func() {
-		ginkgo.By("Test terminationGracePeriod")
-		context := env.GetLocalShellContext()
-
-		badDeployments := testTerminationGracePeriodOnPodSet(env.DeploymentsUnderTest, context)
-		badStatefulsets := testTerminationGracePeriodOnPodSet(env.StateFulSetUnderTest, context)
-		badPods := testTerminationGracePeriodOnPods(env.PodsUnderTest, context)
-
-		numDeps := len(badDeployments)
-		if numDeps > 0 {
-			log.Debugf("Deployments found without terminationGracePeriodSeconds param set: %+v", badDeployments)
-		}
-		numSts := len(badStatefulsets)
-		if numSts > 0 {
-			log.Debugf("Statefulsets found without terminationGracePeriodSeconds param set: %+v", badStatefulsets)
-		}
-		numPods := len(badPods)
-		if numPods > 0 {
-			log.Debugf("Pods found without terminationGracePeriodSeconds param set: %+v", badPods)
-		}
-
-		if numDeps > 0 || numSts > 0 || numPods > 0 {
-			ginkgo.Fail(fmt.Sprintf("Found %d deployments, %d statefulsets and %d pods without terminationGracePeriodSeconds param set.", numDeps, numSts, numPods))
-		}
-	})
-}
-
 //nolint:dupl
 func testShutdown(env *config.TestEnvironment) {
 	testID := identifiers.XformToGinkgoItIdentifier(identifiers.TestShudtownIdentifier)
