Proxy issues with querystrings in path names
You have seen this before: /path/to/something.js?v=2, or maybe it used a date or a version control id or some such. The idea is to put the version into the URL so that you can cache aggressively and yet still push new versions quickly.
There have long been issues with using the querystring as the version. At some point I seem to remember Safari not doing a good job caching that scenario and thinking that it was different.
Steve “Neo” Souders has posted about this issue, especially as it relates to proxy servers and their default configurations:
There’s a section in my book called Revving Filenames. It contains an example of adding a version number to the filename. That’s prompted several emails where people have asked me about tradeoffs around using a querystring versus embedding something in the filename. I wasn’t aware of any performance difference, but in a meeting this week a co-worker, Jacob Hoffman-Andrews, mentioned that Squid, a popular proxy, doesn’t cache resources with a querystring. This hurts performance when multiple users behind a proxy cache request the same file - rather than using the cached version everybody would have to send a request to the origin server.
I tested this by creating two resources, mylogo.1.2.gif and mylogo.gif?v=1.2. Both have a far future Expires date. I configured my browser to go through a Squid proxy. I made one request to mylogo.1.2.gif, cleared my cache (to simulate another user making the request), and fetched mylogo.1.2.gif again. This produces the following HTTP headers:

    >> GET http://stevesouders.com/mylogo.1.2.gif HTTP/1.1
    << HTTP/1.0 200 OK
    << Date: Sat, 23 Aug 2008 00:17:22 GMT
    << Expires: Tue, 21 Aug 2018 00:17:22 GMT
    << X-Cache: MISS from someserver.com
    << X-Cache-Lookup: MISS from someserver.com

    >> GET http://stevesouders.com/mylogo.1.2.gif HTTP/1.1
    << HTTP/1.0 200 OK
    << Date: Sat, 23 Aug 2008 00:17:22 GMT
    << Expires: Tue, 21 Aug 2018 00:17:22 GMT
    << X-Cache: HIT from someserver.com
    << X-Cache-Lookup: HIT from someserver.com

Notice that the second response shows a HIT in the X-Cache and X-Cache-Lookup headers. This shows it was served by the Squid proxy. More evidence of this is the fact that the Date and Expires response headers have the same values, even though I made these requests 10 seconds apart. For conclusive evidence, only one hit shows up in the stevesouders.com access log.
Loading mylogo.gif?v=1.2 twice (clearing the cache in between) results in these headers:

    >> GET http://stevesouders.com/mylogo.gif?v=1.2 HTTP/1.1
    << HTTP/1.0 200 OK
    << Date: Sat, 23 Aug 2008 00:19:34 GMT
    << Expires: Tue, 21 Aug 2018 00:19:34 GMT
    << X-Cache: MISS from someserver.com
    << X-Cache-Lookup: MISS from someserver.com

    >> GET http://stevesouders.com/mylogo.gif?v=1.2 HTTP/1.1
    << HTTP/1.0 200 OK
    << Date: Sat, 23 Aug 2008 00:19:47 GMT
    << Expires: Tue, 21 Aug 2018 00:19:47 GMT
    << X-Cache: MISS from someserver.com
    << X-Cache-Lookup: MISS from someserver.com

Here it’s clear the second response was not served by the proxy: the caching response headers say MISS, the Date and Expires values change, and tailing the stevesouders.com access log shows two hits.
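The two signals used in these traces (the X-Cache value, and whether Date stays the same between requests) can be checked mechanically. What follows is only a rough sketch in Python: the header values are copied from the traces above, while the function names parse_headers and proxy_evidence are made up for illustration and are not part of Squid or any library.

```python
def parse_headers(lines):
    """Turn 'Name: value' strings into a dict; the status line is skipped."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


def proxy_evidence(first, second):
    """Collect the two signals from the traces for the second response.

    Hypothetical helper: 'hit' is True when X-Cache starts with HIT,
    'same_date' is True when Date did not change between the requests.
    """
    return {
        "hit": second.get("X-Cache", "").upper().startswith("HIT"),
        "same_date": first.get("Date") == second.get("Date"),
    }


# Header values taken from the mylogo.1.2.gif trace.
filename_first = parse_headers([
    "HTTP/1.0 200 OK",
    "Date: Sat, 23 Aug 2008 00:17:22 GMT",
    "X-Cache: MISS from someserver.com",
])
filename_second = parse_headers([
    "HTTP/1.0 200 OK",
    "Date: Sat, 23 Aug 2008 00:17:22 GMT",
    "X-Cache: HIT from someserver.com",
])

# Header values taken from the mylogo.gif?v=1.2 trace.
query_first = parse_headers([
    "HTTP/1.0 200 OK",
    "Date: Sat, 23 Aug 2008 00:19:34 GMT",
    "X-Cache: MISS from someserver.com",
])
query_second = parse_headers([
    "HTTP/1.0 200 OK",
    "Date: Sat, 23 Aug 2008 00:19:47 GMT",
    "X-Cache: MISS from someserver.com",
])

filename_result = proxy_evidence(filename_first, filename_second)
query_result = proxy_evidence(query_first, query_second)
```

Neither signal is as conclusive as the origin access log mentioned in the quoted text; X-Cache in particular is only present when the proxy chooses to add it.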
Proxy administrators can change the configuration to support caching resources with a querystring, when the caching headers indicate that is appropriate. But the default configuration is what web developers should expect to encounter most frequently. Another interesting note about these tests: notice how the proxy downgrades the responses to HTTP/1.0. This is going to alter browser behavior in terms of the number of connections that are opened. When I’m doing performance analysis I make sure to avoid being connected through a proxy.
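If you want to move existing querystring-versioned URLs over to the filename style that did get cached in the test, the rewrite itself is simple. Here is a minimal sketch in Python, assuming the version lives in a parameter named v as in the examples; rev_filename is a hypothetical helper name, and the sketch does not cover the server side, which still has to serve the file under the new name.

```python
import posixpath
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rev_filename(url, param="v"):
    """Move a version from the querystring into the filename.

    The parameter name 'v' is an assumption taken from the examples;
    any other querystring parameters are left in place.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    version = next((value for key, value in query if key == param), None)
    if not version:
        return url  # no version parameter, nothing to move

    remaining = [(key, value) for key, value in query if key != param]
    directory, filename = posixpath.split(parts.path)
    stem, ext = posixpath.splitext(filename)
    # Insert the version between the stem and the extension.
    new_path = posixpath.join(directory, f"{stem}.{version}{ext}")
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, urlencode(remaining), parts.fragment)
    )


logo_url = rev_filename("http://stevesouders.com/mylogo.gif?v=1.2")
script_url = rev_filename("/path/to/something.js?v=2")
```

With the two URLs used in this post, logo_url comes out as http://stevesouders.com/mylogo.1.2.gif and script_url as /path/to/something.2.js.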





