High CPU on Cisco 4500 with MSFT NLB multicast cluster

Recently, we were alerted to higher-than-normal CPU utilization on some of our core Cisco Catalyst 4507 switches running IOS 12.2. Using Cisco’s CPU troubleshooting doc, I narrowed the source down to the Cat4K Mgmt LoPri process. From there, the “sh platform health” command showed that the K2CpuMan Review process was responsible, meaning packets were being forwarded by the CPU. To find out which queue they were arriving on, we issued the “sh platform cpu packet statistics” command, which showed that the L3 Fwd Queue was much busier than normal.
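
For reference, the drill-down above corresponds roughly to the following sequence, with the commands written out in full. This is only a sketch of the order we worked in; the Switch# prompt is illustrative, and the exact output and process names vary by supervisor and IOS release.

Switch# show processes cpu
Switch# show platform health
Switch# show platform cpu packet statistics

The first command identifies the busy IOS process, the second breaks that process down into platform-specific jobs, and the third shows per-queue counters for packets sent to the CPU.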

By creating a CPU SPAN session and monitoring the traffic with Network Monitor 3.3, we could see that all of the traffic destined for the VIPs of our 2008 NLB clusters was hitting the CPU. I checked the configuration against the Catalyst and MSFT NLB example on Cisco’s site, and it matched. We were using the multicast NLB configuration as explained in that document. I set up a test NLB cluster to experiment with the settings and figure out why cluster-bound packets were hitting the CPU. What I found relates to this section of Cisco’s document:
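
As an aside, a CPU SPAN session on a Catalyst 4500 can be set up along the following lines. Treat this as a minimal sketch: the session number and the destination port gi1/1 (where the capture machine would be attached) are placeholders rather than values from our setup, and the options available after “source cpu” depend on the supervisor and IOS release.

Switch# configure terminal
Switch(config)# monitor session 1 source cpu rx
Switch(config)# monitor session 1 destination interface gi1/1
Switch(config)# end
Switch# show monitor session 1

When the capture is finished, “no monitor session 1” in configuration mode removes the session.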

However, since the incoming packets have a unicast destination IP address and multicast destination MAC the Cisco device ignores this entry and process-switches each cluster-bound packets. In order to avoid this process switching, insert a static mac-address-table entry as given below in order to switch cluster-bound packets in hardware.

mac-address-table static 0300.5e11.1111 vlan 200 interface fa2/3 fa2/4

Note: For Cisco Catalyst 6000/6500 series switches, you must add the disable-snooping parameter. For example:

mac-address-table static 0300.5e11.1111 vlan 200 interface fa2/3 fa2/4 disable-snooping

The disable-snooping parameter is essential and applicable only for Cisco Catalyst 6000/6500 series switches. Without this statement, the behavior is not affected.

I double- and triple-checked that our switches had the static MAC entry in their CAM tables, and they did. So I rebuilt my test cluster from the ground up and found that cluster-bound packets hit the CPU only AFTER this command was entered. When I removed the command from our production switches, CPU utilization dropped 30-40% instantly. This seems to contradict what Cisco has posted in their example.

There was no adverse effect or downtime from removing this command. Both cluster nodes are connected locally to the switch, however, and the command may still be necessary if an NLB node is connected to a down-level switch. Furthermore, “sh int stats” now shows that no packets are being switched by the “processor.”
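
The change itself amounts to removing the static entry and then re-checking the CPU counters. The sketch below reuses the MAC address and VLAN from Cisco’s example rather than our production values, so substitute your own cluster MAC and VLAN. On some IOS releases the command is spelled “mac address-table” instead of “mac-address-table”.

Switch# configure terminal
Switch(config)# no mac-address-table static 0300.5e11.1111 vlan 200
Switch(config)# end
Switch# show processes cpu
Switch# show platform cpu packet statistics
Switch# show interfaces stats

If cluster-bound traffic is no longer being punted, the L3 Fwd queue counters should settle back to their normal rate and the “Processor” rows in the interface statistics should stop incrementing for that traffic.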

Port Trunking in Cisco IOS

Port trunking designates ports as common uplink ports that carry traffic from multiple VLANs across the same physical cable. In the following example, we enable trunking on an EtherChannel group and allow it to carry traffic from VLANs 1 through 99. We also assign the channel to a VLAN we’ve designated specifically for EtherChannels.

Router# configure terminal
Router(config)# interface port-channel 1
Router(config-if)# switchport access vlan 100
Router(config-if)# switchport trunk encapsulation dot1q
Router(config-if)# switchport trunk allowed vlan 1-99
Router(config-if)# switchport mode trunk
Router(config-if)# ^Z

This configuration carries traffic for VLANs 1-99 across channel group 1. The trunk encapsulation command is only available on switches that support more than one encapsulation type. Note that the access VLAN assignment only applies when the port is not operating as a trunk. Ensure that spanning tree is enabled in order to prevent loops.
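
To confirm that the trunk came up as intended, something like the following can be run afterwards. This is a sketch of commonly available show commands, not output from our switches, and what is displayed differs between platforms and releases.

Router# show interfaces trunk
Router# show interfaces port-channel 1 switchport
Router# show spanning-tree summary

The first command lists the trunking ports along with their encapsulation and allowed VLANs, the second shows the administrative and operational switchport mode of the channel, and the third confirms that spanning tree is running.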

Port Aggregation in Cisco IOS

Port aggregation (known in the Cisco world as channeling) combines two or more ports for increased bandwidth and link redundancy when connecting switches together.

Router# configure terminal
Router(config)# interface range Gi0/45 - 46
Router(config-if-range)# channel-group 1 mode desirable
Router(config-if-range)# ^Z

This links ports 45 and 46 (the example is from a 2960 switch) together and assigns them to channel group 1. With the mode set to desirable, the ports actively negotiate the channel using PAgP. The example configures a group of ports at the same time; you can also configure each port individually with the standard “interface Gi0/45” command. Once ports are assigned to a channel group, you can apply configuration options to all of them at once by configuring the channel itself with “interface port-channel 1”. The channel group number must be unique on the switch. Both switches need to be configured the same way, though the channel numbers do not need to match. When choosing which ports to use for channeling, pick ports that do not share switch bandwidth to ensure maximum throughput. Ensure that spanning tree is enabled in order to prevent loops.
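
Once both ends are configured, the state of the bundle can be checked with commands along these lines. This is a sketch only; the flags and columns shown vary by platform and IOS release.

Router# show etherchannel summary
Router# show pagp neighbor
Router# show interfaces port-channel 1

The summary lists each channel group with its member ports and protocol, the PAgP neighbor output shows what the far-end switch is advertising, and the port-channel interface output reports the aggregate link status and bandwidth.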