Web Caching server

Hi all,

Need some recommendations on what to do here.

I’m more or less the network administrator at my job in a school. My problem is giving the students a smoother experience on the free public wifi, because the bandwidth is limited and there are many of them.

My thinking is to set up a local caching server for web traffic. In my research I came across LAN Cache (for Steam games and Windows updates), Squid, Nginx as a reverse proxy, and this Cachebox by ApplianSys.

I want to implement something like this Cachebox by ApplianSys, but with a free and open source system, which I think is possible. Our school’s setup has Active Directory and a good firewall. I think Squid can do this, but what do you think is the best system to implement? Ideally without too much hassle, such as reconfiguring AD/DNS for Squid, because we have some in-house websites that we host ourselves. I don’t know if a transparent proxy is the thing to implement.

What can you say?

Since most traffic now is HTTPS, there’s little you can do about it unless you force users to accept your own certificate, so that the proxy acts as a MITM decryption proxy, which probably isn’t going to go down well. Proxies in general are quite useless except for DNS (to some extent). However, I would first try to identify the traffic before assuming it’s X, Y and Z.

If you haven’t already, I would consider looking into traffic shaping to possibly help with this, as opposed to proxies. As dizzy said, with most traffic being encrypted these days it is hard to get proxies to function in a useful manner.

By identifying the traffic, do you mean classifying it as video, general web browsing, and so on? Since it’s a school, we allow websites such as YouTube and other leisure sites (social media), and of course sites for research and study purposes.

Traffic shaping? Is that done at the firewall level? What technology can you suggest? I have some Cisco training but am not certified. We use HP products.

It’s a feature of every business-class firewall I have seen, and of some routers as well. You may be more familiar with the terms QoS or bandwidth limiting, which are both types of traffic shaping. Basically, the idea of QoS is to prioritize delay-sensitive traffic (e.g. VoIP, video conferencing, gaming) over less important traffic (e.g. email, file downloads, etc.). The idea of bandwidth limiting is just limiting the maximum connection speed of any given host. QoS in particular can get quite complicated, but if you can tell us what kind of firewall you are using, someone on here can probably point you in the right direction for some resources to look at.
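To make the bandwidth limiting idea concrete, here is a minimal sketch of a token bucket, one common way to cap a single host's rate. This is only an illustration, not how any particular firewall implements it; the class name, the parameters and the numbers are all made up for the example.

```python
class TokenBucket:
    """Per-host rate limiter: tokens are bytes the host may still send."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s    # sustained rate allowed for the host
        self.burst = burst_bytes        # largest burst allowed at once
        self.tokens = burst_bytes       # start with a full bucket
        self.last_ts = 0.0              # time of the previous decision

    def allow(self, packet_bytes: int, now: float) -> bool:
        # Refill for the time that has passed, capped at the burst size.
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_ts = now
        # Forward the packet only if the host has enough tokens left.
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# Hypothetical host limited to 1000 bytes/s with a 1500-byte burst.
bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
first = bucket.allow(1500, now=0.0)   # fits in the initial burst
second = bucket.allow(100, now=0.0)   # bucket is empty, so it is refused
third = bucket.allow(1000, now=1.0)   # one second later, 1000 bytes refilled
```

A real shaper would normally queue or delay the refused packet rather than just report it, but the accounting is the same idea.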

QoS is already enabled on our firewall. The issue I’m tackling is the sheer number of users, like >2k devices. I’m about to read up on this CDN thing, but I think it is just a reverse proxy implementation as well.

Yeah, the modern web is not cache friendly.

For a cheap but powerful QoS solution that often beats dedicated firewalls (depends on which specific one you have, of course!) you can install the x86 image of OpenWrt - I shape a 500 Mbit symmetric fiber line on an Intel 7100T with “cake”/“piece of cake”, it has never been over 20% CPU, and I’m happier for it - it’s pretty much zero configuration and maintenance.

A school setting is one of the few places where you could reasonably MITM everything (with parental consent, in order to both effectively filter bad traffic and improve performance) - if you own (or can enroll) the client machines, push your own root CA cert as trusted on the workstations, and man-in-the-middle/cache the HTTPS to your heart’s content.

You’d want to ensure that you have an acceptable use policy in place though, and exclude staff machines from it (staff may be doing internet banking, etc., or other things that you don’t want to be liable for sniffing).

But yes, the modern web in general is not cache friendly. Your real solution is to get more bandwidth, unfortunately. In addition to being encrypted, the modern web is also very dynamic, and unless you mess with caching parameters to aggressively cache things that maybe don’t want to be cached, you’re likely to only make 10-15 percent gains, if that.
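As a rough back-of-the-envelope sketch of what a gain like that means for the uplink: the function names, the 30% cacheable share, the 50% hit rate and the 100 Mbit demand below are hypothetical numbers picked for illustration, not measurements from any real network.

```python
def byte_hit_ratio(cacheable_share: float, hit_rate_on_cacheable: float) -> float:
    """Fraction of all bytes served from cache instead of the uplink."""
    return cacheable_share * hit_rate_on_cacheable


def upstream_demand_with_cache(demand_mbit: float, hit_ratio: float) -> float:
    """Uplink demand that remains after the cache absorbs its share."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return demand_mbit * (1.0 - hit_ratio)


# Hypothetical: 30% of bytes are cacheable at all, and half of those hit.
ratio = byte_hit_ratio(cacheable_share=0.30, hit_rate_on_cacheable=0.50)
remaining = upstream_demand_with_cache(demand_mbit=100.0, hit_ratio=ratio)
print(f"byte hit ratio {ratio:.0%}, uplink demand {remaining:.0f} of 100 Mbit")
```

Under those assumptions about 85 of every 100 Mbit of demand still has to cross the uplink, which is why the cache alone is unlikely to fix a congested link.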

Then again, it’s free public wifi. Such things are best effort; if it’s slow... sorry, I guess?

Yeah, but we have this ISO that I always need to answer to, and it’s very irritating. Instead of doing something significant I waste my time on this paperwork. I know it’s never-ending; someone somewhere will always complain. But I’m thinking about this as a way to gain knowledge and experience, even though the pay is not worth it.

First fix the network.
Check out Bufferbloat.net.
If you are able to put OpenWrt somewhere between the users and the internet, you can use its SQM stuff to really improve things:
https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm
The hardware required to do so will depend on how much upstream bandwidth you have. An old MIPS-based router can handle SQM up to maybe 80 Mbit; for more than that you will need a multi-core ARM device. Any recent x86 box could handle gigabit SQM, and OpenWrt will run on x86. There are probably other options that do “fq-codel”, “cake”, “smart queue management”, etc.; just search for those terms. Some commercial vendors are starting to add these things too; reply with what you are using.
But you don’t want old-school QoS or CoS: it is too static, hard to deal with, and leaves too much bandwidth unused.
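
To give a feel for why flow queuing helps on a shared link, here is a minimal sketch of deficit round robin, the kind of per-flow scheduling that fq-codel builds on. It is a toy, not the real fq-codel or cake code: it leaves out the CoDel delay management entirely, and the function name, flow names and packet sizes are made up for the example.

```python
from collections import deque


def drr_schedule(flows: dict, quantum: int = 1500) -> list:
    """Return the send order as (flow_id, packet_bytes) tuples.

    flows maps a flow id to its queued packet sizes in bytes.
    Each round, every backlogged flow earns `quantum` bytes of credit.
    """
    queues = {fid: deque(pkts) for fid, pkts in flows.items() if pkts}
    deficits = {fid: 0 for fid in queues}
    active = deque(queues)              # round-robin order of backlogged flows
    sent = []
    while active:
        fid = active.popleft()
        deficits[fid] += quantum
        queue = queues[fid]
        # Send packets while the flow has enough credit for the next one.
        while queue and queue[0] <= deficits[fid]:
            size = queue.popleft()
            deficits[fid] -= size
            sent.append((fid, size))
        if queue:
            active.append(fid)          # still backlogged, goes to the back
        else:
            deficits[fid] = 0           # idle flows do not hoard credit
    return sent


# Hypothetical: one bulk download competing with a small VoIP flow.
order = drr_schedule({"bulk": [1500] * 4, "voip": [200, 200]})
```

Here the two small VoIP packets go out right after the first bulk packet instead of waiting behind the whole download, which is the behaviour that keeps a busy link feeling responsive.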

After that you could still consider the web proxy or some other ideas:

  • Implement ad blocking and privacy protection at the network level with something like Pi-hole. It is a good idea anyway, and it reduces bandwidth usage, makes web pages load faster, etc.
  • Look at what things are consuming the most bandwidth and focus on those. Maybe it’s Windows updates or Linux package updates and you can set up a local cache for those? Maybe it’s Netflix, which you don’t want them using anyway and can disallow, etc.

Good luck!
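
For the "what is consuming the most bandwidth" step, a minimal sketch of the tally is below. It assumes you can export (destination, bytes) records from your firewall or flow logs in some form; the record format, the function name and the sample figures are all hypothetical.

```python
from collections import Counter


def top_talkers(records, n: int = 5) -> list:
    """Sum bytes per destination and return the n largest.

    records is an iterable of (destination, byte_count) tuples.
    Each result is (destination, total_bytes, share_of_all_bytes).
    """
    totals = Counter()
    for destination, byte_count in records:
        totals[destination] += byte_count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [(dest, total, total / grand_total)
            for dest, total in totals.most_common(n)]


# Hypothetical sample, as if exported from firewall logs.
sample = [
    ("video.example", 600),
    ("updates.example", 250),
    ("video.example", 100),
    ("social.example", 50),
]
for dest, total, share in top_talkers(sample, n=2):
    print(f"{dest}: {total} bytes ({share:.0%})")
```

Whatever ends up at the top of that list is where a local cache, a block rule or a shaping rule is most likely to pay off.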

Ah yes, an internet search for “cake”… No doubt you’ll find plenty of options!

Thank you greatly for linking this, I implemented it and saw a pretty big optimization with the distribution of frames.

Still got a C ranking but the distribution is now more consistent and tightly grouped. Very pleased with the results!

Before SQM

After SQM