Discussion:
Slow download of files with EZproxy
Wiktor Rzeczkowski
2014-09-24 03:36:49 UTC
From my experience investigating complaints of "slowness", "download stops", "download does not complete", "web page does not load", etc. in EZproxy and other online services, the problems can appear when, paradoxically, good stateful firewall rules involving conntrack and RELATED/ESTABLISHED are enabled, and they disappear when the rules are disabled or somewhat mitigated.

I saw defective TCP ACK packets (e.g. packets with wrong SLE/SRE values, as displayed in Wireshark) arriving at servers and being rejected (correctly) by the good conntrack-based rules, causing TCP transmissions to fail, which users perceived as the "slowness", "download stops", etc.

The problem mainly affected users who had slower connections to the Internet, such as wireless and DSL connections, and who were far from the content server. In such circumstances some network packets were likely to be lost on the way and, when they were, retransmissions were attempted that would never complete if bad TCP ACK packets were generated on the way and the firewall on the server rejected them.

For http (port 80) and https (port 443) services, such as those of EZproxy, the stateful conntrack rules could be mitigated with additional stateless rules, e.g.:

-A INPUT -p tcp -m tcp --dport 80 -m conntrack --ctstate NEW -j ACCEPT
-A INPUT -p tcp -m tcp --dport 443 -m conntrack --ctstate NEW -j ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 443 -j ACCEPT

Here, bad packets that the stateful conntrack rules (the first two) do not accept can still be accepted by the stateless rules that follow, so that the required retransmissions can complete. When such stateless rules were added to the firewalls on our servers, users stopped complaining.
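
The fall-through described above can be sketched as a toy first-match evaluator. This is only an illustration, not how netfilter is implemented. It assumes a chain policy of DROP and that conntrack classifies the defective ACK as INVALID; neither detail is stated in the post. The names Packet, Rule and evaluate are hypothetical, and the usual RELATED/ESTABLISHED rule is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    dport: int
    ctstate: str  # conntrack's verdict, e.g. "NEW", "ESTABLISHED", "INVALID"


@dataclass
class Rule:
    dport: int
    ctstate: Optional[str]  # None models a stateless rule (any state matches)
    target: str = "ACCEPT"

    def matches(self, pkt: Packet) -> bool:
        if pkt.dport != self.dport:
            return False
        return self.ctstate is None or pkt.ctstate == self.ctstate


def evaluate(rules, pkt, policy="DROP"):
    """Return the target of the first matching rule, else the chain policy.

    The DROP policy is an assumption; the post does not show it.
    """
    for rule in rules:
        if rule.matches(pkt):
            return rule.target
    return policy


# The two stateful rules from the post, then the same chain with the
# two stateless rules appended.
STATEFUL_ONLY = [Rule(80, "NEW"), Rule(443, "NEW")]
WITH_STATELESS = STATEFUL_ONLY + [Rule(80, None), Rule(443, None)]

# Assumption: the defective ACK is classified INVALID by conntrack.
bad_ack = Packet(dport=443, ctstate="INVALID")

print(evaluate(STATEFUL_ONLY, bad_ack))   # falls through to the policy
print(evaluate(WITH_STATELESS, bad_ack))  # caught by the stateless rule
```

In this model the stateless rules also match every packet the NEW rules would match, which is why adding them amounts to switching off state checking for ports 80 and 443.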

I wonder whether the Ubuntu server had stateful firewall rules and the Solaris server did not.

Wiktor

---
Wiktor Rzeczkowski
McMaster University
---
You are currently subscribed to ezproxy as: gee-***@m.gmane.org.
To unsubscribe, send request to ***@itec.suny.edu
John Benedetto
2014-09-26 19:47:35 UTC
That -sounds- like a neat explanation, Wiktor, but can you break it down further for those of us that don't have your technical expertise?


Andrew Anderson
2014-09-27 12:57:23 UTC
What is being proposed is that sites stop using stateful firewall rules on their proxy servers (http://en.wikipedia.org/wiki/Stateful_firewall).

What is interesting to me is that sites reporting this as a solution all seem to be running Ubuntu for their proxy servers. I’d like to know more about what version of Ubuntu and which kernel version they are running where this solves their issue. I have not witnessed this problem on Red Hat/CentOS distributions, and I don’t know if it’s because Ubuntu is running a different version of the kernel that I need to watch out for, or if there is something else in play (which I feel is more likely).

While I cannot refute their direct evidence that removing the stateful rules addressed their immediate problems, it seems odd that it would be the root cause of the problem. In past jobs, I have seen mangled packets coming from poor quality wireless/mobile networks, and it is possible that this could be the origin of the defective packets that were captured by those who have experienced this issue as well. In this instance wireless and DSL users tended to be affected, so that supports the possibility of the wireless network being of poor signal quality, and DSL users can be impacted by low quality DSL hardware that the phone companies like to deploy as routers — especially their wireless routers. A friend’s business was having similar issues recently until I went in — /after/ the phone company technician — and performed the built-in “look for a new firmware version” function on their router. After the new firmware was applied, all the problems they were experiencing went away. Why the phone company technician did not do that simple step, I do not know.

The act of removing the stateful rules also introduces the ability for attackers to send invalid network packets to the proxy server that are not associated with any existing connection, and opens the proxy to certain DDoS attack vectors that would not otherwise exist, so I would not recommend that everyone do this.

As to the question regarding the stateful firewall handling in Ubuntu vs Solaris, the last time I worked with Solaris, yes, the firewall implementation in Solaris was not stateful, but a simple filter function.
--
Andrew Anderson, Director of Development, Library and Information Resources Network, Inc.
http://www.lirn.net/ | http://www.twitter.com/LIRNnotes | http://www.facebook.com/LIRNnotes
Gorman, Jon
2014-09-27 21:05:45 UTC
One thing we've seen is servers that were woefully under-provisioned having issues w/ stateful rules filling up memory with the ip conntrack proc "file". I suspect it wasn't an intentional attack, but some sort of automated spider/script that was just badly designed. Boosting memory took care of the problem.

We also added an alert to monitor /proc/net/ip_conntrack and warn/block ip addresses that suddenly spawned a lot of sessions w/ a wonky connection.
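
A rough sketch of that kind of check, for anyone who wants to try it: count conntrack entries per originating address and report the addresses above a threshold. This is not our actual alert script. The function names, the threshold and the sample lines are made up for illustration, and the parsing only assumes that each entry contains "src=" fields with the original direction listed first.

```python
import re
from collections import Counter

# First "src=" on a line is the original (client) direction of the entry.
SRC_RE = re.compile(r"\bsrc=(\S+)")


def count_sources(lines):
    """Count conntrack entries per originating source address."""
    counts = Counter()
    for line in lines:
        match = SRC_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def noisy_sources(lines, threshold):
    """Return the sorted addresses that hold at least `threshold` entries."""
    return sorted(ip for ip, n in count_sources(lines).items() if n >= threshold)


def read_table(path="/proc/net/ip_conntrack"):
    """Read the live table; usually needs root and the conntrack module loaded."""
    with open(path) as handle:
        return handle.readlines()


# Made-up sample entries (documentation address ranges), not real traffic.
SAMPLE = [
    "tcp 6 431999 ESTABLISHED src=198.51.100.7 dst=192.0.2.10 sport=40001 dport=443 "
    "src=192.0.2.10 dst=198.51.100.7 sport=443 dport=40001 [ASSURED] use=1",
    "tcp 6 431998 ESTABLISHED src=198.51.100.7 dst=192.0.2.10 sport=40002 dport=443 "
    "src=192.0.2.10 dst=198.51.100.7 sport=443 dport=40002 [ASSURED] use=1",
    "tcp 6 119 SYN_RECV src=198.51.100.7 dst=192.0.2.10 sport=40003 dport=443 "
    "src=192.0.2.10 dst=198.51.100.7 sport=443 dport=40003 use=1",
    "tcp 6 431990 ESTABLISHED src=203.0.113.5 dst=192.0.2.10 sport=51000 dport=80 "
    "src=192.0.2.10 dst=203.0.113.5 sport=80 dport=51000 [ASSURED] use=1",
]

print(noisy_sources(SAMPLE, threshold=3))
```

On a live server the same functions could be fed read_table() from cron, with the threshold tuned to what normal traffic looks like on that proxy.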


Jon Gorman
University of Illinois


Wiktor Rzeczkowski
2014-09-29 13:55:58 UTC
The explanation could be broken down further as follows (unfortunately, with more technicalities).

I've seen TCP transmission of a file from an http server, like EZproxy, fail when
- the file required multiple TCP packets to be sent, each with part of the content (maximum size of a TCP packet is typically about 1500 bytes)
- a TCP packet was lost, i.e. did not arrive at the user's end
- a DEFECTIVE TCP ACK packet (i.e. a defective request for retransmission) arrived at the server and was rejected

When a user received a data packet that was not the next packet expected, the user's machine notified the server of what had actually been received so far by sending a TCP ACK packet that specified the ranges of bytes already received, i.e. the SLE/SREs, the left and right byte numbers of each range (Wireshark's names for the edges of TCP selective acknowledgment, or SACK, blocks). In the failed transmissions that I saw, the user was sending ACK packets with correct SLE/SREs while the server was receiving them with wrong SLE/SREs. A stateful firewall on the server did not accept ACK packets whose byte ranges covered bytes that the server had never sent and yet were claimed (in the ACKs) to have been received. When an ACK packet was rejected by the firewall, the server was not aware that a data packet had been lost and did not resend it. On the user end the file was then never fully assembled.
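
The check that trips on those packets can be pictured with a simplified model. This is a sketch of the idea only, not conntrack's real window-tracking code. It ignores sequence-number wraparound, and the function name, the starting sequence number, the segment size and the offset applied to the "mangled" block are all made-up values for illustration.

```python
def sack_blocks_plausible(snd_nxt, blocks):
    """Return True if every (left, right) SACK block is well formed and
    ends at or before snd_nxt, the next sequence number the server
    would send, i.e. it only claims bytes that were actually sent.

    Simplification: sequence-number wraparound is ignored.
    """
    return all(left < right <= snd_nxt for left, right in blocks)


# Hypothetical transfer: 10 segments of 1460 bytes starting at sequence 1000.
ISN = 1000
MSS = 1460
SEGMENTS = 10
SND_NXT = ISN + SEGMENTS * MSS  # highest sequence number sent, plus one

# The third segment (index 2) is lost. The client acknowledges everything
# before it cumulatively and reports the rest in one SLE/SRE pair.
cumulative_ack = ISN + 2 * MSS
good_block = (ISN + 3 * MSS, SND_NXT)

# The same block as it might look after something on the path altered it.
# The offset is arbitrary and only stands in for "wrong SLE/SRE".
OFFSET = 50_000
bad_block = (good_block[0] + OFFSET, good_block[1] + OFFSET)

print(sack_blocks_plausible(SND_NXT, [good_block]))  # block within sent data
print(sack_blocks_plausible(SND_NXT, [bad_block]))   # block beyond sent data
```

With the good block the server learns that only the bytes between cumulative_ack and the block's left edge are missing and can resend them. With the bad block the ACK is discarded before the server sees it, so the gap is never filled.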


Wiktor
---
Wiktor Rzeczkowski
McMaster University
---
Wiktor Rzeczkowski
2014-09-29 14:15:43 UTC
Jon, under-provisioning was suggested as a cause, but I did not find any signs of it; instead, I determined that wrong ACK packets were the cause.

Wiktor
---
Wiktor Rzeczkowski
McMaster University
---
Wiktor Rzeczkowski
2014-09-29 14:37:28 UTC
On this list a reply does not seem to get associated with the original comment. The following is a reply to Andrew's message of Sat, 27 Sep 2014 08:57:23 -0400.

The intention was to share a possibly relevant experience but, yes, it may indicate that stateful firewall rules are sometimes not good to have. Sites may prefer to stop using them when users are not able to get content and when the cost of stopping is less than the cost of users not getting the content.

The cost of stopping the use of stateful firewall rules is diminished firewall protection, i.e. a transmission's session history is no longer considered when determining whether the transmission should continue. It seems that many would argue that in many situations the cost of the diminished protection is negligible and far smaller than the cost of the otherwise bad user experience.

Unfortunately, given the nature of the cause of the "slowness" problem (cf. other posts in this thread), all who employ good stateful firewall rules, including those on RedHat, seem to be affected.

In fact, the known complaints of our users were about services (e.g. EZproxy) that were running on RedHat/CentOS, not about those on Ubuntu, and to many the RedHat/CentOS platforms, not Ubuntu, were the suspects. To verify that Ubuntu was not affected I also conducted my investigation on Ubuntu, but I saw the same problematic ACK rejections by stateful firewall rules there as on RedHat/CentOS. Firewalls with stateful rules implemented by either the "conntrack" or the "state" firewall module were rejecting the wrong ACKs and breaking TCP transmissions.


It seems that older network equipment (e.g. older layer 3 routers) or software, or newer (more complicated) network equipment (e.g. newer layer 3/7 content switches) or software, may be replacing good SLE/SRE numbers in the ACK retransmission requests with bad numbers, eventually causing the firewall rejections and breaking TCP transmissions. Those in the former category may simply not support the SLE/SRE part of the TCP protocol because SLE/SREs are too new for them. Those in the latter category may have difficulty supporting SLE/SREs because they themselves are too new.

If network equipment and software on campuses, in ISPs, and in ISPs of ISPs could be checked, then the question of whether or not to use stateless rules might go away.


Wiktor
---
Wiktor Rzeczkowski
McMaster University
---