I know, I said I was going to bed… and I almost did… but it’s hard for a geek in a coding frenzy to just ‘let go’. I was only going to browse one or two other possible config issues on the tablet as the lights were out.
Then I ran into a reference to using “squid” with “tor” (and “privoxy” with “tor”… for another posting later…) as “chained proxies”. Why would you ever want to do that? I pondered… And at that moment I knew “Curiosity Compelled The Geek” was in the wind.
Well, it’s about an hour later. I’ve installed “squid” and I’m liking it a whole lot.
What’s squid? It is a “caching proxy”. You load a web page or ad or whatever once, and it puts it in a nice fat cache. Next time you access it, it comes from the cache. Seems this sucker is used all over the internet to lighten the load on web servers: the squid takes a surge of load locally instead of funneling it all back to the main server. Sort of like an Akamai lite.
Making the most of your Internet Connection
Squid is used by hundreds of Internet Providers world-wide to provide their users with the best possible web access. Squid optimises the data flow between client and server to improve performance and caches frequently-used content to save bandwidth. Squid can also route content requests to servers in a wide variety of ways to build cache server hierarchies which optimise network throughput.
Website Content Acceleration and Distribution
Thousands of web-sites around the Internet use Squid to drastically increase their content delivery. Squid can reduce your server load and improve delivery speeds to clients. Squid can also be used to deliver content from around the world – copying only the content being used, rather than inefficiently copying everything. Finally, Squid’s advanced content routing configuration allows you to build content clusters to route and load balance requests via a variety of web servers.
“[The Squid systems] are currently running at a hit-rate of approximately 75%, effectively quadrupling the capacity of the Apache servers behind them. This is particularly noticeable when a large surge of traffic arrives directed to a particular page via a web link from another site, as the caching efficiency for that page will be nearly 100%.” – Wikimedia Deployment Information.
75% thinks I… no way…
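For what it's worth, the "quadrupling" in that quote is just arithmetic on the hit rate: if 75% of requests are answered from the cache, only 25% reach the real servers, so they can carry 1 / 0.25 = 4 times the traffic. A minimal sketch of that sum (the function name is mine, not anything from squid):

```python
def capacity_multiplier(hit_rate: float) -> float:
    """How many times more traffic the servers behind the cache can absorb
    when a fraction `hit_rate` of requests is answered from the cache."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be at least 0 and less than 1")
    # Only the misses (1 - hit_rate) ever reach the back-end servers.
    return 1.0 / (1.0 - hit_rate)


# capacity_multiplier(0.75) -> 4.0, matching the quote above
```

This assumes every request costs the back end about the same, which is a simplification.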
So I did a simple test. Shut off anything that blocks ads or does other load lightening stuff. Hit WUWT and do the 1 Mississippi 2 Mississippi… 80 Mississippi and it’s done.
Turn on the proxy connection. Hit it again, another 80 count (as the squid cache loads). Hit it again… 22 Mississippi… and it is done. Yup, about 72% less time to load, right in the neighborhood of that 75%.
Now I’ve not tested on a lot of stuff, but this was a darned quick and easy “wow” factor.
Hitting Tallblokes was 24 and 8. About a 66% lighter load.
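The percentages come straight from the two counts: the saving is (cold - cached) / cold. A small sketch of that sum, using only the Mississippi counts from my tests (the function name is just for illustration):

```python
def percent_saved(cold_count: float, cached_count: float) -> float:
    """Percent reduction in load time going from a cold load to a cached one."""
    if cold_count <= 0:
        raise ValueError("cold_count must be positive")
    return 100.0 * (cold_count - cached_count) / cold_count


# WUWT: 80 cold, 22 cached -> 72.5
# Tallblokes: 24 cold, 8 cached -> about 66.7
```

Note that these are counted seconds, so they measure elapsed time, not bytes on the wire. I'm assuming the two track each other fairly well on a link like mine.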
Now I don’t know how much it will help in general browsing as you don’t usually hit ‘reload’ on sites. BUT I often revisit a site or web page a couple of times, and then ads and other “junk” are often very repetitious. I expect the speedup to rise over time as the cache builds, for general site visits. (Privoxy is a proxy that does a whole lot of ad removal, banner removal, dancing animated gif removal, etc. After I get it installed and tested I’ll report on it, too. I’m suspecting maybe chaining the two for a mix of removal and cache might be interesting. Then again, maybe it won’t do much… We’ll see.)
But the first thing I thought of was P.G. on the end of his slow link. And John on his boat… If any would benefit from a large local cache, it would be folks with a slow link.
How hard was it? Darned easy. I installed it on the CentOS box (probably the hardest one to do, as CentOS is picky about things and often not in sync on release cycle with the current trendy stuff). It was basically just install the package, change one line in the default config file, and turn on the service. Then point the browser proxy at port 3128. The site says they have versions for Windows too. I’m going to try it on the XP box “some other day” (unless folks need a ‘how to’ sooner).
There’s a long list of binary versions available here:
As this is a proxy meant to serve many clients, it is usually run on a server, so one could also put squid on a R.Pi doing routing and then have just one place where all your systems can point. I’d have done that but I had the router card out of the Pi right now and this box was easiest to get up quick and ‘try it’.
So what did I do?
yum install squid
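Then, in the default config file (/etc/squid/squid.conf on CentOS), take the # off the front of the cache_dir line so it reads:
# Uncomment and adjust the following to add a disk cache directory.
cache_dir ufs /var/spool/squid 100 16 256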
service squid start
The first time I tried to start the service, it would not start. Taking the # from in front of the cache line allowed it to start. (Hard to start with no place to put the cache, I guess.) The numbers are the cache size in MB, the number of top-level directories, and the number of second-level directories under each of those. So 100 MB of disk for the cache size. I’ll likely make all of those larger over time. Maybe.
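To make those three numbers a bit more concrete, here is a minimal sketch that picks the cache_dir line apart into named fields. The class and function names are mine, made up for illustration; squid itself has nothing to do with this code, and it only handles the simple form of the line shown above.

```python
from typing import NamedTuple


class CacheDir(NamedTuple):
    store_type: str  # storage scheme, "ufs" in the default config
    path: str        # where the cache lives on disk
    size_mb: int     # total disk space for the cache, in MB
    l1_dirs: int     # number of top-level directories
    l2_dirs: int     # number of second-level directories under each


def parse_cache_dir(line: str) -> CacheDir:
    """Split a simple 'cache_dir <type> <path> <MB> <L1> <L2>' line into fields."""
    fields = line.split()
    if len(fields) < 6 or fields[0] != "cache_dir":
        raise ValueError("not an uncommented cache_dir line: %r" % line)
    _, store_type, path, size_mb, l1, l2 = fields[:6]
    return CacheDir(store_type, path, int(size_mb), int(l1), int(l2))


cfg = parse_cache_dir("cache_dir ufs /var/spool/squid 100 16 256")
# cfg.size_mb is 100, and 16 * 256 = 4096 directories hold the cached objects
```

So the default is 100 MB spread over 4096 directories. If you bump the size, the first number is the one to change.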
Then in Firefox, under Edit : Preferences : Advanced : Network : Connection (Settings), click the ‘manual proxy’ radio button, put 127.0.0.1 in the http proxy name (since we are using the local host), put 3128 in the port number, check the ‘use this proxy for all’ box, and click OK.
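If you'd rather check the proxy without touching the browser, something like the sketch below ought to do it with nothing but the Python standard library. The assumptions are loud ones: squid is listening on 127.0.0.1:3128 as set up above, the URL you test is plain http (https goes through as a tunnel and isn't cached this way), and your squid build adds an "X-Cache" header such as "HIT from hostname" or "MISS from hostname" to replies. I believe that is the usual behavior, but not every version or config does it. The function names are my own.

```python
import time
import urllib.request
from typing import Optional, Tuple

PROXY = "http://127.0.0.1:3128"  # assumption: squid on the local host, port 3128


def make_opener(proxy: str = PROXY) -> urllib.request.OpenerDirector:
    """Build a URL opener that sends plain-http requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


def cache_status(x_cache: Optional[str]) -> str:
    """Reduce an X-Cache value like 'HIT from myhost' to HIT, MISS or UNKNOWN."""
    if not x_cache or not x_cache.strip():
        return "UNKNOWN"
    word = x_cache.split()[0].upper()
    return word if word in ("HIT", "MISS") else "UNKNOWN"


def timed_fetch(url: str, opener: urllib.request.OpenerDirector) -> Tuple[float, str]:
    """Fetch one URL through the opener; return (seconds taken, cache status)."""
    start = time.monotonic()
    with opener.open(url, timeout=60) as resp:
        resp.read()
        status = cache_status(resp.headers.get("X-Cache"))
    return time.monotonic() - start, status
```

Call timed_fetch on the same http URL twice with one opener from make_opener(); if the page is cacheable, the second call should come back faster and report HIT. This only times the one object you ask for, not all the images and ads a browser pulls in, so don't expect the numbers to match a Mississippi count.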
More trouble to set the proxy in Firefox than to get the service up, in some ways.
For non-Fedora non-CentOS boxes (like the pi) I’d expect that package manager line instead of being the yum command to be something like “apt-get install squid”. That, too, will wait for tomorrow and the R.Pi test. As for Windows, I expect it’s the usual unzip the .exe process.
The one “negative” I’ve run into is that loading the “Tech Rebound – The Musical” page has black squares where the youtube images ought to be. I suspect there is a config setting to do something special with them. Just be advised that it looks like it blocks youtube by default and one would need to either point away from the proxy, or perhaps change the config file, to have them visible in pages. For P.G. that’s likely a ‘feature’ ;-0
As it stands, I’m very very happy with squid. Anything that cuts repeat loads of things by 2/3 to 3/4 is a nice thing to have around. I often hit the same web sites every day, and I’m sure a lot of the ads, widgets, and ‘whatever’ like images and banners are constant. Just doing this posting, things are “snappier” as many of the parts of the page are cached, both on the writing / editing page and on the preview page.
I’m going to take privoxy for a test drive in the next few days too, while getting TOR to work, likely using the “other” config listings I’ve gotten. Then I’ll try chaining different combinations to see how they do. Several sites recommended different combos of squid-tor and privoxy-tor (depending on your goal) but the squid site suggested using privoxy for privacy instead of squid, so chaining them might not do much, as one removes the stuff (ads, banners, graphics) that the other one caches.
It is highly likely that I’ll leave squid installed on several of my machines, even if I have a R.Pi squid-tor server running. Just because sometimes I tear down systems to play with the parts and it would be nice to have it there just one swap of proxy settings away…
So “happy hacking” and if you try it, please let us know what you think of any experience differences you have (good or bad). It will be a while before I can give it a real ‘shakedown’ as I really really am going to bed “real soon now” ;-) and a broader test base would help others decide to try it, or not.