Sitecore X-Forwarded-For handling

A Sitecore solution is typically behind one or several reverse proxies, such as load balancers and content delivery networks. From a Content Delivery server's perspective, the remote address, i.e. “the visible client IP”, is that of the closest proxy rather than the IP of the connecting client. To solve this, each proxy in the chain adds an HTTP header with the IP address it's communicating with. This header is typically called X-Forwarded-For or X-Real-IP.

Below is an example of such a setup. Each proxy adds the IP it's receiving the connection from:



Note: A corporate proxy typically doesn't disclose its internal IPs, so the a.a.a.a record in this example is not very common, but having it here makes the example easier to follow.

In the example above, the Nginx reverse proxy closest to the Content Delivery server is what the server sees as the remote address (the RemoteAddress variable). It adds the load balancer IP (d.d.d.d) at the end of the X-Forwarded-For header. The load balancer adds the CDN endpoint (c.c.c.c), and the CDN edge adds the client IP or the proxy it's communicating with (b.b.b.b). The a.a.a.a IP is typically not seen, but could be there if a visitor is accessing the site via a proxy.
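To make the ordering concrete, here is a minimal sketch of how the header grows hop by hop. The AppendHop helper and the class name are hypothetical and only mimic what each proxy does; they are not part of Sitecore or of any proxy product.

```csharp
using System;

public static class ProxyChainExample
{
    // Hypothetical helper: what a proxy does to X-Forwarded-For before forwarding,
    // i.e. append the address it received the connection from.
    public static string AppendHop(string existingHeader, string remoteAddress)
    {
        return string.IsNullOrWhiteSpace(existingHeader)
            ? remoteAddress
            : existingHeader + ", " + remoteAddress;
    }

    public static void Main()
    {
        string header = null;
        header = AppendHop(header, "a.a.a.a"); // corporate proxy adds the internal client IP
        header = AppendHop(header, "b.b.b.b"); // CDN edge adds the corporate proxy's public IP
        header = AppendHop(header, "c.c.c.c"); // load balancer adds the CDN endpoint
        header = AppendHop(header, "d.d.d.d"); // Nginx adds the load balancer
        Console.WriteLine(header); // a.a.a.a, b.b.b.b, c.c.c.c, d.d.d.d
    }
}
```

The entries closest to the end of the list are the ones added by your own infrastructure, which is why they are the only ones you can trust.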

In Sitecore we use the XForwardedFor processor in the createVisit pipeline to resolve the public IP address of the client. This processor has an optional HeaderIpIndex property that specifies how many proxies you have chained before facing the Internet. In this setup, HeaderIpIndex should be set to 2, i.e. the third IP from the end of the X-Forwarded-For list (b.b.b.b).

Now, there's currently a bug in the XForwardedFor class provided by Sitecore in at least version 9.0 update 1 and update 2. It probably applies to more versions. The processor counts the IP array from the start instead of from the end. This means it only “works” when there's no proxy between the client and your internet-facing proxy. It also means a client can spoof any IP: the client can send its own X-Forwarded-For header with the request, and since each proxy appends to the header, the client-supplied entries end up at the start of the list.
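The difference between the two counting directions can be sketched as follows. Both helper methods are hypothetical illustrations and not Sitecore's code; returning null for an out-of-range index is a simplification assumed here. The sample header assumes a client that sent its own X-Forwarded-For value (198.51.100.99, a documentation address) before the three proxies appended theirs.

```csharp
using System;
using System.Linq;

public static class HeaderIndexExample
{
    private static string[] Parse(string header)
    {
        return header.Split(',').Select(p => p.Trim()).ToArray();
    }

    // Counts from the start of the list (the buggy direction described above).
    public static string PickFromStart(string header, int index)
    {
        var parts = Parse(header);
        return index < parts.Length ? parts[index] : null;
    }

    // Counts from the end of the list (the intended direction).
    public static string PickFromEnd(string header, int index)
    {
        var parts = Parse(header);
        return index < parts.Length ? parts[parts.Length - index - 1] : null;
    }

    public static void Main()
    {
        // The first entry was supplied by the client itself.
        const string header = "198.51.100.99, b.b.b.b, c.c.c.c, d.d.d.d";
        Console.WriteLine(PickFromStart(header, 0)); // 198.51.100.99 (spoofed value)
        Console.WriteLine(PickFromStart(header, 2)); // c.c.c.c (the CDN endpoint, not the client)
        Console.WriteLine(PickFromEnd(header, 2));   // b.b.b.b (the address the CDN edge saw)
    }
}
```

Counting from the end ignores anything the client puts in front of the list, as long as the index matches the number of proxies you control.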

Fortunately it’s easy to fix this by just replacing the processor. The GetIpFromHeader method just needs a few small changes:

protected virtual string GetIpFromHeader(string header)
{
    Assert.ArgumentNotNull(header, nameof(header));
    var ipAddresses = header.Split(',');
    int headerIpIndex = HeaderIpIndex;
    // Count from the end: "ipAddresses.Length - headerIpIndex - 1" as index instead of just "headerIpIndex"
    string ipString = headerIpIndex < ipAddresses.Length ? ipAddresses[ipAddresses.Length - headerIpIndex - 1] : ipAddresses.LastOrDefault();
    if (string.IsNullOrWhiteSpace(ipString))
        return null;
    ...

As Grant Killian points out in his post on this, some reverse proxies add a port number to the IP address. If so, the port number needs to be filtered out. It's also worth noting that the list may contain IPv6 addresses. Those use colons (:) instead of dots (.) as group separators, and they can be shortened by omitting groups of zeros. So we can't reliably strip a port number from an IPv6 address unless the address is enclosed in brackets.
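A small sketch of why an unbracketed IPv6 value is ambiguous, using .NET's System.Net.IPAddress. The IsPlainIpv6 helper and the sample address are made up for illustration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class Ipv6PortAmbiguity
{
    // Hypothetical helper: true when the whole string is itself a valid IPv6 address.
    public static bool IsPlainIpv6(string value)
    {
        IPAddress address;
        return IPAddress.TryParse(value, out address)
            && address.AddressFamily == AddressFamily.InterNetworkV6;
    }

    public static void Main()
    {
        // This could be 2001:db8::1 with port 8080 appended, or the address
        // 2001:db8::1:8080 without any port. The string alone cannot tell us,
        // because 8080 is also a valid hexadecimal group.
        Console.WriteLine(IsPlainIpv6("2001:db8::1:8080"));
    }
}
```

Since the string with a trailing ":8080" is still a well-formed address, cutting at the last colon would corrupt real addresses.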

Continuing the above method, it could look something like this:

private static readonly Regex IPv4WithPort = new Regex(@"^([\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}):([\d]+)$", RegexOptions.Compiled);
private static readonly Regex IPv6WithPort = new Regex(@"^\[([0-9a-fA-F:]+)\]:([\d]+)$", RegexOptions.Compiled);
    ...
    ipString = ipString.Trim();
    if (ipString.IndexOf(':') >= 0) // Quick test for performance
    {
        // Is this in the format 0.0.0.0:0 ?
        var match = IPv4WithPort.Match(ipString);
        if (match.Success)
        {
            return match.Groups[1].Value;
        }
        // Is this in the format [2001:db8:85a3::8a2e:370:7334]:0 ?
        // Because IPv6 addresses may be shortened, they must be enclosed in brackets
        // when a port is appended (RFC 3986, section 3.2.2)
        match = IPv6WithPort.Match(ipString);
        if (match.Success)
        {
            return match.Groups[1].Value;
        }
    }
    return ipString;
}
