# Filtering https with Squid, but only at a basic level?



## mariourk (Nov 2, 2015)

Hi,

I want to filter https traffic with Squid, but only on a very basic level. I'm not interested in the actual content. I don't want Squid to act as a man-in-the-middle and I don't want Squid to decrypt anything. All I want is to block certain sites for certain people and that's it. Someone in department X is allowed to visit ebay.com, while someone at department Y is not. That's it. And if someone at department X is browsing all kinds of adult content on ebay.com, I simply don't care.

The problem is that tutorials about filtering https delve much deeper than just that. They explain how to create SSL certificates, make all hosts trust this certificate, have Squid handle the https connection, decrypt everything, filter it, encrypt it again and send it to the browser. In other words, it's a hassle and far more than I need.

It should be possible to keep it much simpler, if I don't need all that. Right?


----------



## tingo (Nov 2, 2015)

And squid's FAQ doesn't help?
http://wiki.squid-cache.org/SquidFaq/SquidAcl#How_can_I_block_access_to_porn_sites.3F
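For plain http, the ACL approach in that FAQ is all you need for per-department blocking. A minimal squid.conf sketch (the subnets are placeholders, pick ones matching your departments):

```
# Department X and Y identified by source subnet (placeholder ranges)
acl dept_x src 192.168.10.0/24
acl dept_y src 192.168.20.0/24
acl ebay dstdomain .ebay.com

# dept Y may not reach ebay.com; everything else allowed for both
http_access deny dept_y ebay
http_access allow dept_x
http_access allow dept_y
http_access deny all
```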


----------



## mariourk (Nov 2, 2015)

That is what I've been doing for years. But as far as I know, that only works for http.


----------



## sidetone (Nov 2, 2015)

There's a book, "Squid Proxy Server 3.1: Beginner's Guide," published by Packt. All I know about it is what's in the book's description, and it's one of the few books on the topic. It's just a mention. If you can find your information for free, go for it. I'm not stopping the conversation, maybe someone could know and answer.


----------



## usdmatt (Nov 2, 2015)

> All I want is to block certain sites for certain people and that's it. Someone in department X is allowed to visit ebay.com, while someone at department Y is not


The problem is, the entire HTTP request is encrypted, so you've no idea what website people are going to unless you can decrypt the data. The only way to decrypt the data is to go through the hell of the MITM style config, which is probably why all the filtering HTTPS guides go down that route. There really isn't any other way of seeing what the traffic is doing.

The closest you will get is to block the IP ranges of the websites. But of course that isn't foolproof, as some will be hosted in multiple locations and may change addresses from time to time.
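If you go that route, it's a one-line firewall rule per range. A sketch with ipfw, where `EBAY_NET` and the restricted subnet are placeholders you'd have to look up yourself (and keep up to date, since the ranges change):

```
# Block the restricted department (placeholder subnet) from reaching
# a site's published address range (placeholder) on the https port.
EBAY_NET="203.0.113.0/24"   # placeholder, NOT ebay's real range
/sbin/ipfw -q add deny tcp from 192.168.20.0/24 to ${EBAY_NET} 443 out
```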


----------



## usdmatt (Nov 2, 2015)

The other alternative is to set all the machines to use your own DNS server, then configure that to return the address of some local web server that shows a 'Website blocked' message for domains you want to block. I know with Bind you can set different 'views', so different ranges of client IP addresses will see different DNS records.
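A rough named.conf sketch of that idea, with placeholder subnets and file names; the restricted view makes your DNS server authoritative for the blocked domain and serves a zone file pointing it at the local "blocked" web server:

```
view "restricted" {
    match-clients { 192.168.20.0/24; };   // placeholder: restricted dept
    recursion yes;
    zone "ebay.com" {
        type master;
        // zone file resolving everything to the local 'blocked' server
        file "/usr/local/etc/namedb/master/blocked.db";
    };
};

view "default" {
    match-clients { any; };               // everyone else: normal resolution
    recursion yes;
};
```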

Of course that's easily circumvented by anyone who knows what they are doing by changing their DNS servers, although you could block all port 53 traffic to the Internet from anything other than your own DNS server. There's still ways round it but it gets more and more awkward, and if users are jumping through hoops to get round your policies, it should be pretty clear to them that they're doing something they shouldn't.


----------



## mariourk (Nov 2, 2015)

> The problem is, the entire HTTP request is encrypted, so you've no idea what website people are going to



Digging a little deeper into the whole https thing, it seems you are correct. Which is a bummer. I was kind of hoping the URLs themselves would be unencrypted.

Well, on the other hand, I'm actually glad it works that way. It just sucks that it makes my problem a bit more complicated.


----------



## mariourk (Nov 2, 2015)

> There's still ways round it but it gets more and more awkward, and if users are jumping through hoops to get round your policies, it should be pretty clear to them that they're doing something they shouldn't.



My attitude towards those people is usually to (subtly) let them know that I know what they're doing and leave it at that. If they get around it and don't cause actual trouble, good for them. I'm not going to waste my time and energy on that.


----------



## usdmatt (Nov 2, 2015)

> I was kind of hoping the URLs themselves would be unencrypted.


Unfortunately not. HTTPS is designed to protect the entire request, and the URL is just as important as anything else.

This is why we always used to have the issue of SSL websites requiring their own IP address. The URL can contain sensitive data, so needs to be sent after encryption has been set up, but of course the web server doesn't know what website you want until you send the URL. Because of this, the only information the web server could use to know what website you wanted was to look at the IP address you were connecting to.

These days there is SNI (Server Name Indication) which allows the browser to tell the server what website it wants during the initial SSL/TLS handshake. I don't know enough about it to know how this part of the connection is encrypted, but there does seem to be some references on the Squid website to 'peeking' at the SNI name. I can't see much information about if this can actually be used to block addresses but it might be worth looking into.
http://wiki.squid-cache.org/Features/SslPeekAndSplice
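From that wiki page, it looks like it can be used exactly this way: Squid peeks at the unencrypted ClientHello to read the SNI name, then either terminates the connection or splices it through untouched, so nothing is ever decrypted. A sketch of what the squid.conf directives might look like (the port, cert path and domain list are assumptions; check the wiki page for your Squid version):

```
# Squid 3.5+ with SSL support. Even splice-only bumping requires a
# certificate on the listening port, but it is never shown to clients
# whose connections are spliced.
https_port 3129 intercept ssl-bump cert=/usr/local/etc/squid/squid.pem

acl step1 at_step SslBump1
acl blocked_https ssl::server_name .ebay.com

ssl_bump peek step1              # read the SNI name from the ClientHello
ssl_bump terminate blocked_https # drop blocked sites
ssl_bump splice all              # pass everything else through untouched
```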


----------



## obsigna (Nov 2, 2015)

I agree completely with usdmatt, if you want to do traffic filtering using squid, then you need to set up a transparent TLS proxy, and you really want to use the latest Peek and Splice feature, otherwise a lot of sites do not work with the proxy. Setting this up is after all not that complicated. For this I needed to add only a few more lines to the squid configuration file, one more line in the firewall configuration, and I re-used a self-signed certificate that I had already created for another purpose.

That said, in your use case I would at least consider doing the blocking by way of a specially configured DNS resolver. For example, in the unbound configuration on your gateway, you could add something like the following for each domain that you want to block:

```
...
local-zone: "ebay.com" static
local-zone: "playboy.com" static
local-zone: "facebook.com" static
...
```
Unbound would respond with the status code NXDOMAIN for DNS requests on the whole zone.

In the firewall you need to add a rule that prevents restricted clients from using outside name servers:

```
...
/sbin/ipfw -q add deny ip from not me to any 25,53 out xmit $WAN
...
```
Of course you need to change the name server settings on the clients to point to your local server. You may want to set up another name server in your network for the non-restricted clients.

Clients could still access some sites by directly entering the IP into the address bar of the browser; however, for large sites like ebay, facebook, youtube, etc., this doesn't work as expected anyway.


----------

