Posts

Cisco Nexus syslog logging server custom port

When Cisco built the Nexus operating system, they decided, for whatever reason, not to allow a custom port for the syslog server initially. On IOS, setting one is quite simple.

This is quite annoying when your infrastructure is set up to use non-default ports for syslog. On the Cisco Nexus 7000 series, versions <= 6.1 do not allow a custom logging server port. There is a ray of hope: if you are running version 7.x or later on a Nexus 7k, this is fixed.

Cisco feature request to change syslog destination port:
https://tools.cisco.com/bugsearch/bug/CSCug55348

From the looks of it, this is only supported on the Nexus 7k. In our situation it was easier to accommodate the default syslog port on the logging infrastructure, as it seems Cisco does NOT support a custom syslog destination port on the Cisco Nexus 3000 series. If this changes, or someone knows any different, please leave a comment.
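
One way to accommodate the default port without touching the collector itself is a small UDP relay sitting in front of it. The following is only a minimal sketch of that idea, not what we deployed: it receives datagrams on one UDP port and forwards them unchanged to a collector on a custom port. The function names, the 192.0.2.10 address and port 5514 are all assumptions for illustration, and binding to port 514 normally requires root privileges.

```python
import socket


def make_listener(bind_addr):
    """Create a UDP socket bound to bind_addr, e.g. ("0.0.0.0", 514)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    return sock


def relay_once(listener, collector_addr, bufsize=65535):
    """Receive one datagram and forward it unchanged to collector_addr."""
    data, _source = listener.recvfrom(bufsize)
    out = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        out.sendto(data, collector_addr)
    finally:
        out.close()
    return data


def relay_forever(bind_addr, collector_addr):
    """Forward every datagram arriving on bind_addr to collector_addr."""
    listener = make_listener(bind_addr)
    try:
        while True:
            relay_once(listener, collector_addr)
    finally:
        listener.close()


# Hypothetical usage (addresses and ports are placeholders):
# relay_forever(("0.0.0.0", 514), ("192.0.2.10", 5514))
```

Note that a plain relay like this makes every message appear to come from the relay's own address, so the collector has to rely on the hostname inside the syslog message rather than on the source IP.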

cisco nexus bcm_usd_isr_switch_event_cb error cause bit 0x10020a8

Recently we found the following issue on our Nexus switches running code 5.0(3)U5(1a): the switch would be unable to reach a destination address even though it was in the ARP table. Cisco commented on this with the below:

From case notes I understand there are the following messages in log buffer on N3k switch and you need assistance investigating this

--------------------
2013 Apr 10 05:37:06 TOR-2053a.LHR7 %USER-3-SYSTEM_MSG: bcm_usd_isr_switch_event_cb:432: slot_num 0, event 2, memory error type 0x1, mem addr 0x8ab4, cause bit 0x10020a8 - bcm_usd
2013 Apr 10 05:48:28 TOR-2053a.LHR7 %USER-3-SYSTEM_MSG: bcm_usd_isr_switch_event_cb:432: slot_num 0, event 2, memory error type 0x1, mem addr 0x8ab4, cause bit 0x10020a8 - bcm_usd
2013 Apr 10 05:52:01 TOR-2053a.LHR7 %USER-3-SYSTEM_MSG: bcm_usd_isr_switch_event_cb:432: slot_num 0, event 2, memory error type 0x1, mem addr 0x8ab4, cause bit 0x10020a8 - bcm_usd
2013 Apr 10 05:52:41 TOR-2053a.LHR7 %USER-3-SYSTEM_MSG: bcm_usd_isr_switch_event_cb:432: slot_num 0, event 2, memory error type 0x1, mem addr 0x8ab4, cause bit 0x10020a8 - bcm_usd
--------------------

From the above log we can see that the cause bit is 0x10020a8, which translates to the ‘L2_ENTRY_PAR_ERR’ reason.

Impact:
There is a parity error in the MAC address table, which is not ECC protected. This parity error will result in packets being dropped.

Recommended Action:
If the switch is running 5.0(3)U5(1a) or an earlier version, it is recommended to upgrade to 6.0(2)U2(1) or a later version. Your switch runs 5.0(3)U4(1).
From 6.0(2)U2(1) onward this error is automatically corrected by running the consistency checker.
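
To check how often the message fires and whether it always points at the same memory address, the log lines quoted above can be parsed and counted. This is a minimal sketch: the regular expression is written only against the four sample lines in this post and may well not match other NX-OS releases, and the function and field names are hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the sample lines only; other formats are untested.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) %USER-3-SYSTEM_MSG: bcm_usd_isr_switch_event_cb:\d+: "
    r"slot_num (?P<slot>\d+), event (?P<event>\d+), "
    r"memory error type (?P<err_type>0x[0-9a-fA-F]+), "
    r"mem addr (?P<mem_addr>0x[0-9a-fA-F]+), "
    r"cause bit (?P<cause_bit>0x[0-9a-fA-F]+)"
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_by_address(lines):
    """Count matching events per (host, mem addr, cause bit)."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields:
            key = (fields["host"], fields["mem_addr"], fields["cause_bit"])
            counts[key] += 1
    return counts


# Two of the lines from the case notes, used as sample input.
SAMPLE_LINES = [
    "2013 Apr 10 05:37:06 TOR-2053a.LHR7 %USER-3-SYSTEM_MSG: "
    "bcm_usd_isr_switch_event_cb:432: slot_num 0, event 2, "
    "memory error type 0x1, mem addr 0x8ab4, cause bit 0x10020a8 - bcm_usd",
    "2013 Apr 10 05:48:28 TOR-2053a.LHR7 %USER-3-SYSTEM_MSG: "
    "bcm_usd_isr_switch_event_cb:432: slot_num 0, event 2, "
    "memory error type 0x1, mem addr 0x8ab4, cause bit 0x10020a8 - bcm_usd",
]

for key, count in count_by_address(SAMPLE_LINES).items():
    print(key, count)
```

The sketch only extracts and counts fields. It does not attempt to decode the cause bit, since the mapping to ‘L2_ENTRY_PAR_ERR’ came from Cisco and is not something we can derive from the log itself.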

Parity errors are not a bug

Background

What is a processor or memory parity error?

Parity checking is the storage of an extra binary digit (bit) in order to represent the parity (odd or even) of a small amount of computer data (typically one byte) while that data is stored in memory. When the data is read back, the parity value calculated from it is compared to the stored parity value. If these two values differ, this indicates a data error: at least one bit must have been changed due to data corruption.

Within a computer system, electrical or magnetic interference from internal or external causes can cause a single bit of memory to spontaneously flip to the opposite state. This event makes the original data bits invalid and is known as a parity error.
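
The mechanism described above can be illustrated with a few lines of code. This is a toy model, assuming one even-parity bit per byte and a "memory cell" represented as a (data, parity) tuple; the function names are made up for the example, and real hardware does this in circuitry rather than software.

```python
def parity_bit(byte):
    """Even-parity bit: 1 if the byte has an odd number of 1 bits, else 0."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte):
    """Model a memory cell as (data, parity), written together."""
    return (byte & 0xFF, parity_bit(byte))


def flip_bit(cell, position):
    """Simulate a spontaneous bit flip in the data part of a stored cell."""
    data, parity = cell
    return (data ^ (1 << position), parity)


def has_parity_error(cell):
    """Recompute parity on read and compare it with the stored parity bit."""
    data, parity = cell
    return parity_bit(data) != parity


cell = store(0b10110010)
print(has_parity_error(cell))                             # False
print(has_parity_error(flip_bit(cell, 3)))                # True
print(has_parity_error(flip_bit(flip_bit(cell, 3), 5)))   # False
```

The last line shows the limit of a single parity bit: flipping two bits leaves the parity unchanged, so that corruption goes undetected. Parity also only detects the error; it cannot say which bit changed, which is why the table in our case is described as not ECC protected.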

Such memory errors, if undetected, may go unnoticed with no consequence, or may cause permanent corruption of stored data or a machine crash.

There are many causes of memory parity errors, which are classified as either soft parity errors or hard parity errors.

Soft Errors

Most parity errors are caused by electrostatic or magnetic-related environmental conditions.

The majority of single-event errors in memory chips are caused by background radiation (such as neutrons from cosmic rays), electromagnetic interference (EMI), or electrostatic discharge (ESD). These events may randomly change the electrical state of one or more memory cells or may interfere with the circuitry used to read and write memory cells.

Known as soft parity errors, these events are typically transient or random and usually occur once. Soft errors can be minor or severe:

Minor soft errors that can be corrected without component reset are single event upsets (SEUs).
Severe soft errors that require a component or system reset are single event latchups (SELs).
Soft errors are not caused by hardware malfunction; they are transient and infrequent, are most likely an SEU, and are caused by an environmental disruption of the memory data.

If you encounter soft parity errors, analyze recent environmental changes that have occurred at the location of the affected system. Common sources of ESD and EMI that may cause soft parity errors include:

Power cables and supplies
Power distribution units
Uninterruptible power supplies
Lighting systems
Power generators
Nuclear facilities (radiation)
Solar flares (radiation)

Hard Errors

Other parity errors are caused by a physical malfunction of the memory hardware or by the circuitry used to read and write memory cells.

Hardware manufacturers take extensive measures to prevent and test for hardware defects. However, defects are still possible; for example, if any of the memory cells used to store data bits are malformed, they may be unable to hold a charge or may be more vulnerable to environmental conditions.

Similarly, while the memory itself may be operating normally, any physical or electrical damage to the circuitry used to read and write memory cells may also cause data bits to be changed during transfer, which results in a parity error.

Known as hard parity errors, these events are typically very frequent and repeated and occur whenever the affected memory or circuitry is used. The exact frequency depends on the extent of the malfunction and how frequently the damaged equipment is used.

Remember that hard parity errors are the result of a hardware malfunction and recur whenever the affected component is used.

If you encounter hard parity errors, analyze physical changes that have occurred at the location of the affected system. Common sources of hardware malfunction that may lead to hard parity errors include:

Power surges (no ground)
ESD
Overheating or cooling
Incorrect or partial installation
Component incompatibility
Manufacturing defect