1,461 changes: 1,041 additions & 420 deletions aci-preupgrade-validation-script.py

Large diffs are not rendered by default.

5,880 changes: 5,880 additions & 0 deletions admin@10.31.125.151

Large diffs are not rendered by default.

67 changes: 64 additions & 3 deletions docs/docs/validations.md
@@ -36,6 +36,7 @@ Items | This Script
[6.0(2)+ requires 32 and 64 bit switch images][g16] | :white_check_mark: | :no_entry_sign:
[Fabric Link Redundancy][g17] | :white_check_mark: | :no_entry_sign:
[APIC Database Size][g18] | :white_check_mark: | :no_entry_sign:
[APIC downgrade compatibility when crossing 6.2 release][g19]| :white_check_mark: | :no_entry_sign:

[g1]: #compatibility-target-aci-version
[g2]: #compatibility-cimc-version
@@ -55,6 +56,7 @@ Items | This Script
[g16]: #602-requires-32-and-64-bit-switch-images
[g17]: #fabric-link-redundancy
[g18]: #apic-database-size
[g19]: #apic-downgrade-compatibility-when-crossing-62-release

### Fault Checks
Items | Faults | This Script | APIC built-in
@@ -68,7 +70,7 @@ Items | Faults | This Script
[L3 Port Config][f7] | F0467: port-configured-as-l2 | :white_check_mark: | :white_check_mark: 5.2(4d)
[L2 Port Config][f8] | F0467: port-configured-as-l3 | :white_check_mark: | :white_check_mark: 5.2(4d)
[Access (Untagged) Port Config][f9] | F0467: native-or-untagged-encap-failure | :white_check_mark: | :no_entry_sign:
[Encap Already in Use][f10] | F0467: encap-already-in-use | :white_check_mark: | :no_entry_sign: | :no_entry_sign:
[Encap Already in Use][f10] | F0467: encap-already-in-use | :white_check_mark: | :no_entry_sign:
[L3Out Subnets][f11] | F0467: prefix-entry-already-in-use | :white_check_mark: | :white_check_mark: 6.0(1g)
[BD Subnets][f12] | F0469: duplicate-subnets-within-ctx | :white_check_mark: | :white_check_mark: 5.2(4d)
[BD Subnets][f13] | F1425: subnet-overlap | :white_check_mark: | :white_check_mark: 5.2(4d)
@@ -79,7 +81,7 @@ Items | Faults | This Script
[Scalability (faults related to Capacity Dashboard)][f18] | TCA faults for eqptcapacityEntity | :white_check_mark: | :no_entry_sign:
[Fabric Port Status][f19] | F1394: ethpm-if-port-down-fabric | :white_check_mark: | :no_entry_sign:
[Equipment Disk Limits][f20] | F1820: 80% -minor<br>F1821: -major<br>F1822: -critical | :white_check_mark: | :no_entry_sign:

[VMM Inventory Partially Synced][f21] | F0132: comp-ctrlr-operational-issues | :white_check_mark: | :no_entry_sign:


[f1]: #apic-disk-space-usage
@@ -102,7 +104,7 @@ Items | Faults | This Script
[f18]: #scalability-faults-related-to-capacity-dashboard
[f19]: #fabric-port-status
[f20]: #equipment-disk-limits

[f21]: #vmm-inventory-partially-synced

### Configuration Checks

@@ -191,6 +193,8 @@ Items | Defect | This Script
[Stale pconsRA Object][d26] | CSCwp22212 | :warning:{title="Deprecated"} | :no_entry_sign:
[ISIS DTEPs Byte Size][d27] | CSCwp15375 | :white_check_mark: | :no_entry_sign:
[Policydist configpushShardCont Crash][d28] | CSCwp95515 | :white_check_mark: |
[Service-EP Flag in BD without PBR][d29] | CSCwi17652 | :white_check_mark: | :no_entry_sign:


[d1]: #ep-announce-compatibility
[d2]: #eventmgr-db-size-defect-susceptibility
@@ -220,6 +224,7 @@ Items | Defect | This Script
[d26]: #stale-pconsra-object
[d27]: #isis-dteps-byte-size
[d28]: #policydist-configpushshardcont-crash
[d29]: #service-ep-flag-in-bd-without-pbr


## General Check Details
@@ -495,6 +500,38 @@ For current version is 6.1(3f):
In either scenario, contact TAC to collect a database dump of the flagged DME(s) and shard(s) for further analysis.


### APIC downgrade compatibility when crossing 6.2 release

The APIC 6.2(1) release introduces significant optimizations to the APIC upgrade process, including a shorter upgrade time and an orchestrated workflow across the cluster with fewer failure points. This release also changes the APIC architecture, so an APIC running 6.2(1) or newer (e.g., 6.2(1a)) cannot be downgraded to any pre-6.2(1) version (e.g., 6.1(4h)).

Upgrading from pre-6.2(1) to 6.2(1)+ is supported; however, rollback (downgrade) after such an upgrade is not possible.

This check alerts you if you are crossing the 6.2 boundary, beyond which downgrade compatibility is lost. No additional user action is required.
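
The boundary test described above can be sketched as follows. This is a minimal illustration, not the script's actual implementation: the names `parse_apic_version`, `crosses_62_boundary`, and `downgrade_blocked` are hypothetical, and the parser assumes version strings of the form `6.1(4h)` or `6.2(1)`.

```python
import re

# First release with the new APIC architecture, as (major, minor, maintenance).
BOUNDARY = (6, 2, 1)


def parse_apic_version(version):
    """Parse an APIC version such as '6.1(4h)' into a tuple like (6, 1, 4).

    The patch letter is ignored because only the 6.2(1) boundary matters here.
    """
    match = re.match(r"^(\d+)\.(\d+)\((\d+)[a-z]*\)$", version)
    if match is None:
        raise ValueError("unrecognized APIC version: %r" % version)
    return tuple(int(part) for part in match.groups())


def crosses_62_boundary(current, target):
    """Return True when an upgrade goes from pre-6.2(1) to 6.2(1) or newer."""
    return parse_apic_version(current) < BOUNDARY <= parse_apic_version(target)


def downgrade_blocked(current, target):
    """Return True when a downgrade goes from 6.2(1) or newer to pre-6.2(1)."""
    return parse_apic_version(target) < BOUNDARY <= parse_apic_version(current)
```

For example, `crosses_62_boundary("6.1(4h)", "6.2(1a)")` is true, so the check would raise the alert, while `crosses_62_boundary("6.2(1a)", "6.2(2a)")` is false because both versions are already on the new architecture.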

!!! note
The switch upgrade architecture is unchanged in 6.2(1)/16.2(1). The downgrade limitation between pre-6.2(1) and post-6.2(1) versions applies only to APICs.

!!! example
The following upgrade and downgrade paths show where downgrade compatibility is lost.

Upgrade:

* 6.1(4) -> **6.2(1)**: Supported
* **6.2(1)** -> 6.2(2): Supported

Downgrade:

* **6.2(1)** -> 6.1(4): **Not Supported**. The APIC rejects the API request.
* 6.2(2) -> **6.2(1)**: Supported

Note that this is just one example. See the [APIC Upgrade/Downgrade Matrix][1] for the full list of supported version combinations.

!!! tip
Make sure to collect the latest configuration backup before you upgrade your APICs from pre-6.2(1) to 6.2(1)+. If an emergency requires you to downgrade your APICs to the previous version (i.e., 6.2(1)+ -> pre-6.2(1)), Cisco TAC can use that backup to perform the fabric recovery process.

In a lab environment, you can instead initialize the fabric and perform a fresh ISO installation of a pre-6.2(1) version on the APICs.


## Fault Check Details

### APIC Disk Space Usage
@@ -1506,6 +1543,16 @@ To recover from this fault, try the following action
userdom : all
```

### VMM Inventory Partially Synced

This script checks for fault code F0132 with rule `comp-ctrlr-operational-issues` and change set `partial-inv`. This fault is raised when APICs report a partially synchronized inventory with vCenter servers.

EPGs using the `immediate` or `on-demand` resolution immediacy (the typical configuration) rely on the VMM inventory to determine VLAN programming. If the known inventory changes during an upgrade while the APIC reports its last sync as partial, a VMM inventory resync response that contains inventory changes could result in VLANs being unexpectedly removed.

EPGs using the `pre-provision` resolution immediacy do not rely on the VMM inventory for VLAN deployment, so unexpected inventory changes do not affect their VLAN programming.

This check returns a `MANUAL` result because a partial inventory sync can be reported for many reasons. The goal is to ensure that the VMM inventory sync has fully completed before triggering the APIC upgrade, which reduces the chance of unexpected inventory changes.
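
The matching logic described above can be sketched as a filter over fault records. This is a hedged illustration rather than the script's code: it assumes the faults arrive in the usual APIC REST `imdata` shape (`faultInst` objects with an `attributes` dictionary), the function names and the `PASS` label are hypothetical, and the sample `dn` and `changeSet` values are made up.

```python
# Result labels. MANUAL comes from the text above; PASS is an assumed label.
PASS = "PASS"
MANUAL = "MANUAL"


def find_partial_inventory_faults(imdata):
    """Return the DNs of F0132 faults that report a partial inventory sync."""
    matches = []
    for item in imdata:
        attrs = item.get("faultInst", {}).get("attributes", {})
        if (
            attrs.get("code") == "F0132"
            and attrs.get("rule") == "comp-ctrlr-operational-issues"
            and "partial-inv" in attrs.get("changeSet", "")
        ):
            matches.append(attrs.get("dn", ""))
    return matches


def vmm_inventory_check(imdata):
    """Return MANUAL when any matching fault exists, otherwise PASS."""
    return MANUAL if find_partial_inventory_faults(imdata) else PASS


# Illustrative records only; the dn and changeSet strings are invented.
SAMPLE_IMDATA = [
    {"faultInst": {"attributes": {
        "code": "F0132",
        "rule": "comp-ctrlr-operational-issues",
        "changeSet": "example-field (New: partial-inv)",
        "dn": "example/vmm-controller-1/fault-F0132",
    }}},
    {"faultInst": {"attributes": {
        "code": "F0132",
        "rule": "comp-ctrlr-operational-issues",
        "changeSet": "example-field (New: something-else)",
        "dn": "example/vmm-controller-2/fault-F0132",
    }}},
]
```

With the sample data, only the first record matches, so `vmm_inventory_check(SAMPLE_IMDATA)` yields the manual-review result and the flagged DN can be shown to the operator.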

## Configuration Check Details

### VPC-paired Leaf switches
@@ -2604,6 +2651,19 @@ Due to [CSCwp95515][59], upgrading to an affected version while having any `conf
If any instances of `configpushShardCont` are flagged by this script, Cisco TAC must be contacted to identify and resolve the underlying issue before performing the upgrade.


### Service-EP Flag in BD without PBR

On ACI releases 5.2.5c/6.0.1g and 16.0.8e/6.1.1f, the Service-ep flag is set on the service EPG (`vlanCktEp`) even when PBR (`vnsRsLIfCtxToSvcRedirectPol`) is not configured.
The service-ep ctrl setting sets the Don't Learn (DL) bit to 1 when forwarding traffic to the destination.
When the DL bit is set on traffic coming from a service device, it causes more BUM traffic on the customer network.

When customers upgrade to a version >= 16.0.8e/6.1.1f, the fix for [CSCwi17652][62] removes the Service-ep flag from service EPGs (`vlanCktEp`) that have no PBR configured.

This may affect working service graphs. If any instances of missing `vnsRsLIfCtxToSvcRedirectPol` are flagged by this script, Cisco TAC must be contacted to identify and resolve any underlying issue before performing the upgrade.




[0]: https://github.com/datacenter/ACI-Pre-Upgrade-Validation-Script
[1]: https://www.cisco.com/c/dam/en/us/td/docs/Website/datacenter/apicmatrix/index.html
[2]: https://www.cisco.com/c/en/us/support/switches/nexus-9000-series-switches/products-release-notes-list.html
@@ -2666,3 +2726,4 @@ If any instances of `configpushShardCont` are flagged by this script, Cisco TAC
[59]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwp95515
[60]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#Inter
[61]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#EnablePolicyCompression
[62]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwi17652
2 changes: 1 addition & 1 deletion pytest.ini
@@ -1,4 +1,4 @@
[pytest]
log_cli = true
log_cli_level = DEBUG
log_cli_format = [%(asctime)s.%(msecs)03d %(levelname)-8s %(funcName)20s:%(lineno)-4d] %(message)s
log_cli_format = [%(asctime)s.%(msecs)03d %(levelname)-8s %(funcName)s:%(lineno)-4d(%(threadName)s)] %(message)s
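
The effect of adding `%(threadName)s` to the format can be seen with the standard `logging` module outside pytest. This is a small sketch under assumptions: the logger name, thread name, and `run_demo` helper are hypothetical, and the `%H:%M:%S` date format is chosen here only for the demonstration.

```python
import io
import logging
import threading

# The new log_cli_format value from pytest.ini.
FMT = (
    "[%(asctime)s.%(msecs)03d %(levelname)-8s "
    "%(funcName)s:%(lineno)-4d(%(threadName)s)] %(message)s"
)


def run_demo():
    """Log one message from a named thread and return the formatted text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(FMT, datefmt="%H:%M:%S"))

    logger = logging.getLogger("format_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(handler)

    def worker():
        logger.debug("checking node")

    thread = threading.Thread(target=worker, name="check-1")
    thread.start()
    thread.join()

    logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == "__main__":
    print(run_demo(), end="")
```

Each record now carries the emitting thread's name in parentheses after the line number, which makes interleaved output from concurrently running checks easier to attribute.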