High CPU usage and "TCP in Livelock" in Event Logs on Edge SWG

Article ID: 166918


Products

ProxySG Software - SGOS

Issue/Introduction

The Edge SWG (formerly ProxySG) performs poorly, with high CPU utilization and "TCP in Livelock" messages in the Event Log.

Cause

Livelocks happen when an interface becomes so saturated with packets that the Edge SWG is unable to keep up. A high volume of legitimate connections is rarely enough to saturate an interface; a network loop is usually involved. For example, if a policy forwards traffic from proxy 'A' to proxy 'B', and proxy 'B' is configured to forward to proxy 'A', the result is a loop that will most likely put one of the interfaces into livelock until traffic quiets down.

A routing loop can also be the cause of a livelock. This is far more likely to happen when the proxy is deployed transparently inline on the network. If the proxy is installed between two redundant switches and spanning tree is disabled, it could create a network loop. Other possible causes include denial-of-service attacks, such as ping floods.

The best way to troubleshoot a livelock issue is to take a packet capture and look for symptoms. Here are a few common symptoms:

  • Lots of duplicate packets (the same SYN packet seen more than once) are a good indication that there is a loop in the network.
  • Lots of SYN packets from the same source IP address to many different destination ports usually indicate a denial-of-service attack.
  • Lots of HTTP connections from the same source IP that seem to keep authenticating repeatedly can mean that a workstation is configured to ignore cookies. This causes the ProxySG to keep authenticating the same connection non-stop and go into a loop that can result in a livelock (as well as high CPU usage).
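
The first two symptoms can be checked mechanically once the SYN packets have been exported from the capture. The following is a minimal Python sketch, not a vendor tool: the `Syn` record layout, the function names, and the `min_ports` threshold are all assumptions made for illustration, and it expects the capture to have been parsed into records already. Note that an ordinary TCP retransmission also repeats a SYN, so a handful of duplicates is not proof of a loop; large numbers of them are the signal.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Syn(NamedTuple):
    """One TCP SYN taken from a capture (hypothetical record layout)."""
    src: str
    sport: int
    dst: str
    dport: int
    seq: int


def duplicate_syns(syns):
    """Return SYNs that appear more than once, with their counts."""
    counts = Counter(syns)
    return {syn: n for syn, n in counts.items() if n > 1}


def port_fanout(syns, min_ports=100):
    """Return sources that sent SYNs to at least min_ports distinct ports."""
    ports = defaultdict(set)
    for syn in syns:
        ports[syn.src].add(syn.dport)
    return {src: len(p) for src, p in ports.items() if len(p) >= min_ports}
```

Many entries from `duplicate_syns` point toward a loop, while a source reported by `port_fanout` is a candidate for the denial-of-service case. The threshold of 100 ports is arbitrary and should be tuned to the network.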

Resolution

If you have a deployment where a child Edge SWG forwards all requests to a parent Edge SWG, you may be encountering a forwarding loop.

The child ProxySG may have policy similar to the following:

 forward(parent_proxy) forward.fail_open(no)

This means all requests will be forwarded to the parent ProxySG.

You may encounter a forwarding loop if a parent ProxySG (or upstream client) sends a request to the child ProxySG, because the child immediately forwards it to the parent, which sends it back to the child, which forwards it to the parent again, and the process repeats. The symptoms are extreme slowness, high CPU usage, and possibly "TCP in Livelock" messages in the Event Log.
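
The bouncing described above can be illustrated with a toy model. This Python sketch is only an illustration under stated assumptions: the `rules` mapping, the proxy names, and the `max_hops` cap are hypothetical, and the cap exists only so the model terminates. It does not describe how the appliance itself limits or detects loops.

```python
def forwarding_path(rules, start, max_hops=50):
    """Follow unconditional forwarding rules and return the path taken.

    rules maps a proxy name to the proxy it forwards to, or to None when
    that proxy sends the request to the origin server instead.
    """
    path = [start]
    current = start
    # Stop when a proxy no longer forwards, or when the artificial cap is hit.
    while rules.get(current) is not None and len(path) <= max_hops:
        current = rules[current]
        path.append(current)
    return path
```

With `{"child": "parent", "parent": None}` the request takes one hop and stops. With `{"child": "parent", "parent": "child"}` it alternates between the two proxies until the cap is reached, which is the traffic pattern that saturates the interface.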

Policy suggestions to prevent this:

  • Only forward requests if they come from IP addresses on the child ProxySG network.
  • Consider creating a policy to Deny connections from any parent proxies.
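
The two suggestions above amount to a decision on the client address. The Python sketch below models that decision only; it is not policy syntax, and the actual rules must be written in the appliance's policy language. The networks, the parent address, and the function name are hypothetical values chosen for illustration.

```python
import ipaddress

# Hypothetical addresses, for illustration only.
CHILD_NETWORKS = [ipaddress.ip_network("10.1.0.0/16")]
PARENT_PROXIES = {ipaddress.ip_address("192.0.2.10")}


def decide(client_ip):
    """Return 'deny', 'forward', or 'no-forward' for a client address."""
    addr = ipaddress.ip_address(client_ip)
    # Requests arriving from a parent proxy would loop, so refuse them.
    if addr in PARENT_PROXIES:
        return "deny"
    # Only clients on the child's own network are forwarded to the parent.
    if any(addr in net for net in CHILD_NETWORKS):
        return "forward"
    return "no-forward"
```

Checking the parent list first means a request from a parent is refused even if its address happens to fall inside a child network range.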