feat: run aks node controller at boot time faster by 30s #8082

Draft: awesomenix wants to merge 1 commit into main from nishp/nocse

awesomenix (Contributor) commented Mar 12, 2026

Summary

  • move scriptless AKS node controller startup earlier by switching generated custom data to cloud-boothook
  • saves roughly 30s of node provisioning time
  • update the baked aks-node-controller.service ordering to match the earlier-start model while keeping the VHD enable path intact
  • update the e2e hack path to mirror the same boothook-driven startup pattern

Details

  • aks-node-controller/pkg/nodeconfigutils/utils.go now writes the config from boothook and starts aks-node-controller.service immediately
  • parts/linux/cloud-init/artifacts/aks-node-controller.service now waits on network-online.target and stays active (exited) after the one-shot run
  • e2e/vmss.go switches the hack flow from runcmd to a boothook-dropped service and wrapper
  • generate-testdata was run to refresh generated snapshot data impacted by the pkg change

Timings

Latest rerun timings (16 CPU, 32 GB RAM):

   - boot → CSE start: 13.000s
   - CSE start: +0.000s
   - configureKubeletAndKubectl done: +1.627s
    - installKubeletKubectlFromURL: 5ms
   - ensureContainerd done: +2.137s
   - ensureKubelet done: +4.548s
   - ensureNoDupOnPromiscuBridge done: +7.934s
   - configureNodeExporter done: +8.768s
   - CSE finish: +10.874s
   - containerd started: +7.000s
   - kubelet started: +7.000s
   - first kubelet log: +7.000s
   - runtime initialized: +12.000s
   - main sync loop: +12.000s
   - node registered: +12.000s
   - NodeReady: +13.000s


Latest rerun timings (2 CPU, 8 GB RAM):

   - boot → CSE start: 16.000s
   - CSE start: +0.000s
   - configureKubeletAndKubectl done: +2.910s
    - installKubeletKubectlFromURL: 7ms
   - ensureContainerd done: +3.677s
   - ensureKubelet done: +6.281s
   - ensureNoDupOnPromiscuBridge done: +12.761s
   - configureNodeExporter done: +14.115s
   - CSE finish: +17.789s
   - containerd started: +12.000s
   - kubelet started: +12.000s
   - runtime initialized: +20.000s
   - node registered: +20.000s
   - NodeReady: +21.000s

Copilot AI review requested due to automatic review settings March 12, 2026 02:18

Copilot AI left a comment

Pull request overview

Adjusts the e2e VMSS provisioning flow to run the aks-node-controller hack in the foreground (synchronously) and alters how provisioning status is validated during scenario setup.

Changes:

  • Run /opt/azure/bin/aks-node-controller-hack provision ... synchronously in cloud-init instead of backgrounding it.
  • Stop setting the VMSS CustomScript commandToExecute when provisioning via AKSNodeConfig (commented out).
  • Disable the post-create Custom Script Extension status check (commented out).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
e2e/vmss.go | Runs aks-node-controller-hack provision synchronously; comments out CSE command wiring when using AKSNodeConfig.
e2e/test_helpers.go | Comments out the VMSS Custom Script Extension status validation after VMSS creation.
Comments suppressed due to low confidence (1)

e2e/vmss.go:158

  • cse is no longer set when s.Runtime.AKSNodeConfig != nil. In the DisableScriptLessCompilation path this results in generating CustomData that only writes the aks-node-controller config file, but does not execute aks-node-controller (the VMSS CustomScript extension is skipped because cseCmd is empty). This will prevent the node from being provisioned. Restore wiring so scriptless mode still executes /opt/azure/containers/aks-node-controller provision-wait (or alternatively ensure CustomDataWithHack runs the equivalent of provision-wait). Note: leaving cse empty can also lead to a nil dereference in getBaseVMSSModel for Windows, which assumes an ExtensionProfile exists.
	if s.Runtime.AKSNodeConfig != nil {
		//cse = nodeconfigutils.CSE
		customData = func() string {
			if config.Config.DisableScriptLessCompilation {
				data, err := nodeconfigutils.CustomData(s.Runtime.AKSNodeConfig)
				require.NoError(s.T, err, "failed to generate custom data from AKSNodeConfig")

Comment on lines +304 to +305
// err = getCustomScriptExtensionStatus(s, scenarioVM.VM)
// require.NoError(s.T, err)

Copilot AI Mar 12, 2026

These commented-out lines disable getCustomScriptExtensionStatus validation entirely. That removes early detection of CSE failures (non-zero exit code / provisioning state) and disables saving Windows CSE output to the scenario log directory, making failures harder to debug and potentially letting scenarios proceed past provisioning errors. Consider re-enabling this check with a retry/polling loop (or gating it on whether a CustomScript extension was configured for the VMSS).

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.


Comment on lines +47 to +48
runcmd:
- ` + CSE + `

Copilot AI Mar 12, 2026

CustomData() now injects a #cloud-config part that runs the CSE command via runcmd (/opt/azure/containers/aks-node-controller provision-wait). That means cloud-init will execute provision-wait during boot, which is a behavior change from “CustomData writes config; CSE runs provision-wait” and may block cloud-init completion until provisioning finishes (and potentially duplicate the command if a caller still runs CSE via the CustomScript extension). If the intent is only to write config + start the service earlier, consider removing this runcmd and keeping CSE as a separate command for callers/VM extension.

Suggested change
runcmd:
- ` + CSE + `

Comment on lines 162 to 166
  return &aksnodeconfigv1.Configuration{
      Version: "v1",
      BootstrappingConfig: bootstrappingConfig,
-     DisableCustomData: nbc.AgentPoolProfile.IsFlatcar() || nbc.AgentPoolProfile.IsACL(),
+     DisableCustomData: true,
      LinuxAdminUsername: "azureuser",

Copilot AI Mar 12, 2026

Hardcoding DisableCustomData: true changes behavior from the previous Flatcar/ACL-only disable. If this field is meant to reflect whether custom data should be used for bootstrapping, setting it true for all converted configs may make AKSNodeConfig e2e scenarios diverge from real configurations and reduce coverage of the custom-data-enabled path. Consider restoring the previous conditional (or add a brief rationale if the field is intentionally always true for AKSNodeConfig e2e).

Key components:

- 1. `aks-node-controller.service`: systemd unit that is triggered once cloud-init is complete (guaranteeing that config is present on disk) and then kickstarts bootstrapping.
+ 1. `aks-node-controller.service`: systemd unit that can be started directly by cloud-boothook as soon as the config file is written, while remaining enabled on the VHD as a fallback boot hook.

Copilot AI Mar 12, 2026

The README states aks-node-controller.service remains enabled on the VHD as a fallback boot hook, but this PR’s unit file change removes the [Install] section (so systemctl enable aks-node-controller.service fails during VHD build). Either update this doc to match the new enable/start model, or restore an enable-able unit definition so the fallback claim is accurate.

Suggested change
- 1. `aks-node-controller.service`: systemd unit that can be started directly by cloud-boothook as soon as the config file is written, while remaining enabled on the VHD as a fallback boot hook.
+ 1. `aks-node-controller.service`: systemd unit that is started directly by cloud-boothook as soon as the config file is written; it is started explicitly by the provisioning flow rather than being persistently enabled on the VHD as a fallback boot hook.

awesomenix changed the title from "feat: run aks node controller in sync" to "feat: run aks node controller at boot time" on Mar 12, 2026
awesomenix changed the title from "feat: run aks node controller at boot time" to "feat: run aks node controller at boot time faster by 30s" on Mar 12, 2026
awesomenix force-pushed the nishp/nocse branch 2 times, most recently from c874fa5 to 0ac573b on March 13, 2026 08:01
Copilot AI review requested due to automatic review settings March 13, 2026 08:01

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 84 changed files in this pull request and generated 7 comments.


Comment on lines 162 to 230
@@ -226,7 +226,7 @@ func nbcToAKSNodeConfigV1(nbc *datamodel.NodeBootstrappingConfiguration) *aksnod
      NoProxyEntries: *nbc.HTTPProxyConfig.NoProxy,
  },
  LocalDnsProfile: &aksnodeconfigv1.LocalDnsProfile{
-     EnableLocalDns: true,
+     EnableLocalDns: false,
      CpuLimitInMilliCores: to.Ptr(int32(2008)),
Comment on lines 73 to +77
  // localdns is not supported on scriptless, privatekube and VHDUbuntu2204Gen2ContainerdNetworkIsolatedK8sNotCached.
- if !s.VHD.UnsupportedLocalDns {
-     ValidateLocalDNSService(ctx, s, "enabled")
-     ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
- }
+ // if !s.VHD.UnsupportedLocalDns {
+ //     ValidateLocalDNSService(ctx, s, "enabled")
+ //     ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
+ // }
Comment on lines 58 to 60
  function basePrep {
-     logs_to_events "AKS.CSE.aptmarkWALinuxAgent" aptmarkWALinuxAgent hold &
+     # logs_to_events "AKS.CSE.aptmarkWALinuxAgent" aptmarkWALinuxAgent hold &

Comment on lines +38 to +42
RemainAfterExit=yes
EOF

systemctl daemon-reload
systemctl enable aks-node-controller.service
require.NoError(t, err)
require.True(t, strings.HasPrefix(string(cloudConfig), "#cloud-config\n"))
require.Contains(t, string(cloudConfig), "runcmd:")
require.Contains(t, string(cloudConfig), CSE)
Comment on lines +152 to +153
systemctl disable --now containerd

Comment on lines +850 to +856
if ! cachePrivateKubeBinariesFromTarball "${cached_pkg}" "${k8s_version_from_url}"; then
rm -f "${cached_pkg}"
echo "failed to extract private kube binaries from ${cached_pkg}"
exit $ERR_PRIVATE_K8S_PKG_ERR
fi

rm -f "${cached_pkg}"

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 85 changed files in this pull request and generated 13 comments.


Comment on lines 73 to +77
  // localdns is not supported on scriptless, privatekube and VHDUbuntu2204Gen2ContainerdNetworkIsolatedK8sNotCached.
- if !s.VHD.UnsupportedLocalDns {
-     ValidateLocalDNSService(ctx, s, "enabled")
-     ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
- }
+ // if !s.VHD.UnsupportedLocalDns {
+ //     ValidateLocalDNSService(ctx, s, "enabled")
+ //     ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
+ // }
Comment on lines 854 to 855
rm -f "${cached_pkg}"
}
Comment on lines +41 to +42
systemctl daemon-reload
systemctl enable aks-node-controller.service
Version: "v1",
BootstrappingConfig: bootstrappingConfig,
DisableCustomData: nbc.AgentPoolProfile.IsFlatcar() || nbc.AgentPoolProfile.IsACL(),
DisableCustomData: true,
Comment on lines +501 to +505
@@ -502,7 +502,7 @@ function nodePrep {
systemctl restart --no-block apt-daily.service

fi
aptmarkWALinuxAgent unhold &
# aptmarkWALinuxAgent unhold &
Comment on lines 824 to 825
cached_pkg="${K8S_PRIVATE_PACKAGES_CACHE_DIR}/${k8s_version_from_url}.tar.gz"
echo "download private package ${kube_private_binary_url} and store as ${cached_pkg}"
[Service]
Type=oneshot
ExecStart=/opt/azure/containers/aks-node-controller-wrapper.sh
RemainAfterExit=yes
Comment on lines 848 to 852
if ! cachePrivateKubeBinariesFromTarball "${cached_pkg}" "${k8s_version_from_url}"; then
rm -f "${cached_pkg}"
echo "failed to extract private kube binaries from ${cached_pkg}"
exit $ERR_PRIVATE_K8S_PKG_ERR
fi
},
LocalDnsProfile: &aksnodeconfigv1.LocalDnsProfile{
EnableLocalDns: true,
EnableLocalDns: false,
local k8s_tgz_name
k8s_tgz_name=$(echo "$kube_private_binary_url" | grep -o -P '(?<=\/kubernetes\/).*(?=\/binaries\/)').tar.gz
# Save the package with a version-derived name temporarily, then cache extracted binaries.
k8s_version_from_url=$(echo "$kube_private_binary_url" | grep -o -P '(?<=\/kubernetes\/).*(?=\/binaries\/)' | head -n1)
Collaborator

Are we sure that using head -n1 is safe if there are multiple matches?
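
One way to reason about the multiple-match question is to reject ambiguous input rather than silently take the first match. The Go sketch below is illustrative only: the script under review is shell, `extractVersion` is a hypothetical helper, the `example.invalid` URLs are placeholders, and the pattern uses `[^/]+` instead of the greedy `.*` in the original, so it is deliberately stricter.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// versionRe captures the single path segment between /kubernetes/ and /binaries/.
// Go's regexp package has no lookbehind, so a capture group is used instead.
var versionRe = regexp.MustCompile(`/kubernetes/([^/]+)/binaries/`)

// extractVersion is a hypothetical helper that returns the version only when
// exactly one candidate is present in the input.
func extractVersion(input string) (string, error) {
	matches := versionRe.FindAllStringSubmatch(input, -1)
	switch len(matches) {
	case 0:
		return "", errors.New("no version found in input")
	case 1:
		return matches[0][1], nil
	default:
		return "", fmt.Errorf("ambiguous input: %d candidate versions", len(matches))
	}
}

func main() {
	// Placeholder URLs, not real package locations.
	one := "https://example.invalid/kubernetes/v1.30.0/binaries/kubernetes-node-linux-amd64.tar.gz"
	two := one + "\nhttps://example.invalid/kubernetes/v1.31.0/binaries/kubernetes-node-linux-amd64.tar.gz"

	v, err := extractVersion(one)
	fmt.Println("single:", v, err)

	v, err = extractVersion(two)
	fmt.Println("multiple:", v, err)
}
```

Whether failing on ambiguity is preferable to `head -n1` depends on whether the URL variable can ever hold more than one line, which this thread does not establish.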

@@ -797,14 +797,19 @@ capture_benchmark "${SCRIPT_NAME}_configure_telemetry"
# if it is a kube-proxy package, extract image from the downloaded package
cacheKubePackageFromPrivateUrl() {
Collaborator

Is this used for Embargo only? Do we build a VHD with a private package only on demand?

# Users may add custom configurations or pull additional container images after this stage.
  function basePrep {
-     logs_to_events "AKS.CSE.aptmarkWALinuxAgent" aptmarkWALinuxAgent hold &
+     # logs_to_events "AKS.CSE.aptmarkWALinuxAgent" aptmarkWALinuxAgent hold &
Collaborator

Probably not safe? Or are you saying it's safe because we run outside of an extension, so we don't care if waagent gets upgraded during apt-get upgrade?

Copilot AI review requested due to automatic review settings March 13, 2026 19:41

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 90 changed files in this pull request and generated 6 comments.


Comment on lines +465 to +469
logs_to_events "AKS.CSE.ensureKubelet" ensureKubelet

if [ "${ENSURE_NO_DUPE_PROMISCUOUS_BRIDGE}" = "true" ]; then
logs_to_events "AKS.CSE.ensureNoDupOnPromiscuBridge" ensureNoDupOnPromiscuBridge
fi
Comment on lines +465 to +475
logs_to_events "AKS.CSE.ensureKubelet" ensureKubelet

if [ "${ENSURE_NO_DUPE_PROMISCUOUS_BRIDGE}" = "true" ]; then
logs_to_events "AKS.CSE.ensureNoDupOnPromiscuBridge" ensureNoDupOnPromiscuBridge
fi

# This is to enable localdns using scriptless.
if [ "${SHOULD_ENABLE_LOCALDNS}" = "true" ]; then
logs_to_events "AKS.CSE.enableLocalDNS" enableLocalDNS || exit $ERR_LOCALDNS_FAIL
fi

Comment on lines +28 to +44
cat <<'EOF' >/etc/systemd/system/aks-node-controller.service
[Unit]
Description=Parse contract and run csecmd
ConditionPathExists=/opt/azure/containers/aks-node-controller-config.json
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/opt/azure/containers/aks-node-controller-wrapper.sh
RemainAfterExit=yes
EOF

systemctl daemon-reload
systemctl enable aks-node-controller.service
`

Comment on lines 848 to 852
if ! cachePrivateKubeBinariesFromTarball "${cached_pkg}" "${k8s_version_from_url}"; then
rm -f "${cached_pkg}"
echo "failed to extract private kube binaries from ${cached_pkg}"
exit $ERR_PRIVATE_K8S_PKG_ERR
fi
Comment on lines +74 to +77
// if !s.VHD.UnsupportedLocalDns {
// ValidateLocalDNSService(ctx, s, "enabled")
// ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
// }
},
LocalDnsProfile: &aksnodeconfigv1.LocalDnsProfile{
EnableLocalDns: true,
EnableLocalDns: false,

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 87 changed files in this pull request and generated 5 comments.


Comment on lines +38 to +42
RemainAfterExit=yes
EOF

systemctl daemon-reload
systemctl enable aks-node-controller.service
with open(cloud_config_path, 'r') as f:
cloud_config_data = f.read()
cloud_config = yaml.safe_load(cloud_config_data)
cloud_config = yaml.safe_load(cloud_config_data) or {}
Comment on lines +153 to +156
if ! isAzureLinuxOSGuard "$OS" "$OS_VARIANT" && ! command -v containerd >/dev/null 2>&1; then
logs_to_events "AKS.CSE.installContainerRuntime" installContainerRuntime
else
echo "Skipping installContainerRuntime because containerd is already available"
retrycmd_if_failure 120 5 25 sysctl --system || exit $ERR_SYSCTL_RELOAD
systemctlEnableAndStart containerd 30 || exit $ERR_SYSTEMCTL_START_FAIL
retrycmd_if_failure 120 5 25 sysctl -p /etc/sysctl.d/99-force-bridge-forward.conf || exit $ERR_SYSCTL_RELOAD
systemctlEnableAndStartNoBlock containerd 30 || exit $ERR_SYSTEMCTL_START_FAIL
Comment on lines +74 to +77
// if !s.VHD.UnsupportedLocalDns {
// ValidateLocalDNSService(ctx, s, "enabled")
// ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
// }

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 89 changed files in this pull request and generated 9 comments.


Comment on lines +157 to +160
if ! isAzureLinuxOSGuard "$OS" "$OS_VARIANT" && ! command -v containerd >/dev/null 2>&1; then
logs_to_events "AKS.CSE.installContainerRuntime" installContainerRuntime
else
echo "Skipping installContainerRuntime because containerd is already available"
fi
aptmarkWALinuxAgent unhold &
if [ "${SKIP_WALA_HOLD}" = "true" ]; then
echo "Skipping holding walinuxagent"
Comment on lines +3 to +4
systemctl daemon-reload
systemctl disable --now containerd || exit 1
Comment on lines +3 to +4
After=network-online.target
Wants=network-online.target
retrycmd_if_failure 120 5 25 sysctl --system || exit $ERR_SYSCTL_RELOAD
systemctlEnableAndStart containerd 30 || exit $ERR_SYSTEMCTL_START_FAIL
retrycmd_if_failure 120 5 25 sysctl -p /etc/sysctl.d/99-force-bridge-forward.conf || exit $ERR_SYSCTL_RELOAD
systemctlEnableAndStartNoBlock containerd 30 || exit $ERR_SYSTEMCTL_START_FAIL
"SERVICE_ACCOUNT_IMAGE_PULL_DEFAULT_TENANT_ID": config.GetServiceAccountImagePullProfile().GetDefaultTenantId(),
"IDENTITY_BINDINGS_LOCAL_AUTHORITY_SNI": config.GetServiceAccountImagePullProfile().GetLocalAuthoritySni(),
"CSE_TIMEOUT": getCSETimeout(config),
"SKIP_WALA_HOLD": "true",
Comment on lines +74 to +77
// if !s.VHD.UnsupportedLocalDns {
// ValidateLocalDNSService(ctx, s, "enabled")
// ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
// }
cat <<'EOF' | base64 -d >/opt/azure/containers/aks-node-controller-config.json
%s
EOF
chmod 0644 /opt/azure/containers/aks-node-controller-config.json
Comment on lines 1 to 4
echo $(date),$(hostname) > ${PROVISION_OUTPUT};
{{if not .GetDisableCustomData}}
CLOUD_INIT_STATUS_SCRIPT="/opt/azure/containers/cloud-init-status-check.sh";
cloudInitExitCode=0;
if [ -f "${CLOUD_INIT_STATUS_SCRIPT}" ]; then
/bin/bash -c "source ${CLOUD_INIT_STATUS_SCRIPT}; handleCloudInitStatus \"${PROVISION_OUTPUT}\"; returnStatus=\$?; echo \"Cloud init status check exit code: \$returnStatus\" >> ${PROVISION_OUTPUT}; exit \$returnStatus" >> ${PROVISION_OUTPUT} 2>&1;
else
cloud-init status --wait > /dev/null 2>&1;
fi;
cloudInitExitCode=$?;
if [ "$cloudInitExitCode" -eq 0 ]; then
echo "cloud-init succeeded" >> ${PROVISION_OUTPUT};
else
echo "cloud-init failed with exit code ${cloudInitExitCode}" >> ${PROVISION_OUTPUT};
cat ${PROVISION_OUTPUT}
exit ${cloudInitExitCode};
fi;
{{end}}
{{if getIsAksCustomCloud .CustomCloudConfig}}
REPO_DEPOT_ENDPOINT="{{.CustomCloudConfig.RepoDepotEndpoint}}"
{{getInitAKSCustomCloudFilepath}} >> /var/log/azure/cluster-provision.log 2>&1;

Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 90 changed files in this pull request and generated 11 comments.


ExecStartPre=-/sbin/iptables -t nat --numeric --list

ExecStartPre=/bin/bash /opt/azure/containers/validate-kubelet-credentials.sh
ExecStartPre=/bin/sh -c 'until [ -S /run/containerd/containerd.sock ]; do sleep 0.1; done'
Comment on lines +1 to +10
[Unit]
Description=Parse contract and run csecmd
ConditionPathExists=/opt/azure/containers/aks-node-controller-config.json
After=cloud-init.target
After=oem-cloudinit.service enable-oem-cloudinit.service
Wants=cloud-init.target

[Service]
Type=oneshot
ExecStart=/opt/azure/containers/aks-node-controller-wrapper.sh
RemainAfterExit=No
RemainAfterExit=yes

[Install]
WantedBy=cloud-init.target
WantedBy=oem-cloudinit.service
WantedBy=basic.target
fi
aptmarkWALinuxAgent unhold &
if [ "${SKIP_WALA_HOLD}" = "true" ]; then
echo "Skipping holding walinuxagent"
mv "/opt/bin/kubelet-${KUBERNETES_VERSION}" /opt/bin/kubelet
mv "/opt/bin/kubectl-${KUBERNETES_VERSION}" /opt/bin/kubectl

chmod a+x /opt/bin/kubelet /opt/bin/kubectl
EOF
retrycmd_if_failure 120 5 25 sysctl --system || exit $ERR_SYSCTL_RELOAD
systemctlEnableAndStart containerd 30 || exit $ERR_SYSTEMCTL_START_FAIL
retrycmd_if_failure 120 5 25 sysctl -p /etc/sysctl.d/99-force-bridge-forward.conf || exit $ERR_SYSCTL_RELOAD
Comment on lines +74 to +77
// if !s.VHD.UnsupportedLocalDns {
// ValidateLocalDNSService(ctx, s, "enabled")
// ValidateLocalDNSResolution(ctx, s, "169.254.10.10")
// }
"SERVICE_ACCOUNT_IMAGE_PULL_DEFAULT_TENANT_ID": config.GetServiceAccountImagePullProfile().GetDefaultTenantId(),
"IDENTITY_BINDINGS_LOCAL_AUTHORITY_SNI": config.GetServiceAccountImagePullProfile().GetLocalAuthoritySni(),
"CSE_TIMEOUT": getCSETimeout(config),
"SKIP_WALA_HOLD": "true",
Comment on lines +212 to +239
func (a *App) materializeProvisionConfigFromOVF(ctx context.Context, provisionConfigPath string) error {
boothook, err := extractBoothookFromOVFEnvFile(ovfEnvFilePath)
if err != nil {
return err
}

scriptFile, err := os.CreateTemp("", "aks-node-controller-boothook-*.sh")
if err != nil {
return fmt.Errorf("create boothook temp file: %w", err)
}
scriptPath := scriptFile.Name()
defer os.Remove(scriptPath)

if err := scriptFile.Close(); err != nil {
return fmt.Errorf("close boothook temp file %s: %w", scriptPath, err)
}
if err := os.WriteFile(scriptPath, []byte(boothook), 0600); err != nil {
return fmt.Errorf("write boothook temp file %s: %w", scriptPath, err)
}

cmd := exec.CommandContext(ctx, "/bin/bash", scriptPath)
var stdoutBuf, stderrBuf bytes.Buffer
cmd.Stdout = &stdoutBuf
cmd.Stderr = &stderrBuf
if err := a.cmdRun(cmd); err != nil {
return fmt.Errorf("execute boothook from %s: %w (stdout=%q stderr=%q)", ovfEnvFilePath, err, stdoutBuf.String(), stderrBuf.String())
}

Comment on lines +236 to +238
if err := a.cmdRun(cmd); err != nil {
return fmt.Errorf("execute boothook from %s: %w (stdout=%q stderr=%q)", ovfEnvFilePath, err, stdoutBuf.String(), stderrBuf.String())
}