To keep this blog post relevant: there have been some improvements since it was written. Squid v3.2 was released earlier this year, making SSL interception more seamless. The new features for HTTPS interception are described in the man page for http_port:
1. The “transparent” keyword has been changed to “intercept”:
intercept Rename of old 'transparent' option to indicate proper functionality.
INTERCEPT is now better described as:
intercept Support for IP-Layer interception of outgoing requests without browser settings. NP: disables authentication and IPv6 on the port.
2. To avoid certificate errors when intercepting HTTPS sites, Squid can now dynamically generate SSL certificates, using generate-host-certificates. This means the CN of the certificate should now match that of the origin server, though the certificate will still be generated using SQUID’s private key:
SSL Bump Mode Options: In addition to these options ssl-bump requires TLS/SSL options.
generate-host-certificates[=<on|off>] Dynamically create SSL server certificates for the destination hosts of bumped CONNECT requests. When enabled, the cert and key options are used to sign generated certificates. Otherwise generated certificate will be selfsigned. If there is a CA certificate lifetime of the generated certificate equals lifetime of the CA certificate. If generated certificate is selfsigned lifetime is three years. This option is enabled by default when ssl-bump is used. See the ssl-bump option above for more information.
Looks like the above is an offshoot of the excellent work here: http://wiki.squid-cache.org/Features/DynamicSslCert
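Put together, the two options might look like the following in squid.conf. This is only a minimal sketch, assuming Squid 3.2 built with SSL support and the certificate generation helper; the port numbers, file paths, helper location and cache sizes are placeholders rather than values from this post, so check them against the man page for your build.

```
# squid.conf fragment (Squid 3.2); all paths, ports and sizes are assumptions
http_port 3128 intercept
# the below should be placed on a single line
https_port 3129 intercept ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=4MB cert=/etc/squid/ssl_cert/myCA.pem key=/etc/squid/ssl_cert/private/myCA.pem
# helper that generates and caches the per-host certificates
sslcrtd_program /usr/local/squid/libexec/ssl_crtd -s /usr/local/squid/var/lib/ssl_db -M 4MB
```

Clients will only stop seeing certificate warnings if they trust the CA certificate given in cert=.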
Make sure to use the above two features for smoother HTTPS interception – though remember, always warn users that SSL traffic is being decrypted, privacy is a highly-valued right…
We recently had a scenario where an Apache reverse proxy needed to be deployed in front of a pair of Tomcat servers. Due to security concerns, this reverse proxy was hosting mod_security and acting as a web application firewall (WAF).
However, a critical requirement was that the Tomcat applications would be able to see the original IP address of the client. This presented a problem because, unlike Squid, Apache has no configurable option to act as a fully transparent proxy. In other words, once traffic was redirected through the Apache reverse proxy, the traffic forwarded to the Tomcat server had its source IP address changed to that of the proxy, effectively hiding the public IP the client used to connect to the site.
The first solution that sprang to mind was the “X-Forwarded-For” header, an HTTP header inserted into the original HTTP request whose value is the client’s public IP. It turns out the Apache reverse proxy inserts this header by default, yet the Tomcat application still could not extract the client’s IP. We needed to instruct the Tomcat server itself to provide the application with the correct client IP.
The solution that worked in my case was the RemoteIP tomcat valve. Official documentation lives here:
It is simple to configure: add the following to Tomcat’s server.xml so that Tomcat recognises the original client IP rather than the proxy IP:
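A minimal sketch of such a valve entry, assuming the reverse proxy sends the X-Forwarded-For header and is reachable at 127.0.0.1. The attribute values are illustrative and should be checked against the Tomcat documentation for your version.

```xml
<!-- Place inside the <Engine> or <Host> element of server.xml.
     internalProxies is a regular expression matching the proxy address;
     127\.0\.0\.1 is a placeholder. -->
<Valve className="org.apache.catalina.valves.RemoteIpValve"
       internalProxies="127\.0\.0\.1"
       remoteIpHeader="x-forwarded-for" />
```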
Make sure to change 127.0.0.1 to the address of the Apache reverse proxy.
The application could now recognise the original client IP.
PS: as per the Tomcat documentation, the Apache equivalent of the above method is mod_remoteip.
To disallow users from connecting to a site by IP address rather than by hostname (and so bypassing filtering, unless you use the time-consuming forward / reverse lookup feature), uncomment the following line in the bannedsitelist:
To enable syslog, the default dansguardian.conf uses:
# Syslog logging
# Use syslog for access logging instead of logging to the file
# at the defined or built-in "loglocation"
#syslog = on
The line “syslog = on” is incorrect and should be changed to:
logsyslog = on
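The correction can also be applied with a small script. The function below is a minimal sketch, assuming a POSIX shell and sed; the function name is made up for illustration, and the path to dansguardian.conf is passed in by the caller. Take a backup of the file first.

```shell
#!/bin/sh
# enable_logsyslog (hypothetical helper): rewrite a commented or uncommented
# "syslog = on" line as "logsyslog = on".
# Usage: enable_logsyslog /path/to/dansguardian.conf
enable_logsyslog() {
    conf="$1"
    tmp="$conf.tmp.$$"
    # Only a line made of optional '#' and spaces followed by "syslog =" is
    # rewritten; the explanatory comment lines above it are left alone.
    sed 's/^#*[[:space:]]*syslog[[:space:]]*=.*/logsyslog = on/' "$conf" > "$tmp" &&
        mv "$tmp" "$conf"
}
```

A line that already reads "logsyslog = on" is not matched, so the function can be run more than once.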
The facility and priority used by dansguardian are:
In order for dansguardian to display the category when blocking a site, insert the following line at the beginning of each domain blacklist file:
A quick script to insert the above-mentioned line into each enabled blacklist (note: be careful, these statements are all one-liners):
categories=`cat /usr/local/etc/dansguardian/lists/bannedsitelist | grep -v "#" | grep "Include" | cut -d "/" -f 8`
for category in $categories
do
# write the category header, then append the existing non-comment entries
echo '#listcategory: "'$category'"' > /usr/local/etc/dansguardian/lists/blacklists/$category/domain.new
cat /usr/local/etc/dansguardian/lists/blacklists/$category/domains | grep -v "#" >> /usr/local/etc/dansguardian/lists/blacklists/$category/domain.new
# swap the new file into place
rm -f /usr/local/etc/dansguardian/lists/blacklists/$category/domains
mv /usr/local/etc/dansguardian/lists/blacklists/$category/domain.new /usr/local/etc/dansguardian/lists/blacklists/$category/domains
done
To modify the block page that is displayed, edit the following file:
When troubleshooting website issues (such as parts of a site not loading, infinite redirect loops, and so on), the web debugging tool Fiddler2 comes in handy, especially for HTTPS issues. Wireshark is difficult to use on encrypted sessions: unless you are given the server’s private keys it cannot decrypt the traffic, leaving you blind for the most part. During installation, Fiddler offers to add a certificate to the PC’s Trusted Certificates, effectively turning Fiddler into a man-in-the-middle proxy that can decrypt the sessions. This article notes some of the program’s features and how to approach troubleshooting with it.
1. I’m getting swamped with all the information that’s being captured by Fiddler. How do we trim it down to what’s really necessary?
One of the big advantages of Fiddler is that it is an independent program: it can intercept traffic from any program you want and is not limited to a single browser (unlike HttpFox, for example).
This advantage can rapidly turn into a disadvantage if too many programs are accessing the internet, leaving you with too much information to parse. Fiddler has several “levels” at which you can filter data. The first I use is the process filter. On the bottom left of the screen you will see an icon showing “capturing”.
Click on this to turn capturing off. You can click on “all processes” to change the subset of processes you’d like to capture. I personally prefer the “process filter” button on the top menu: if you click and drag this icon over the browser you are using for testing, Fiddler will capture traffic coming only from that process.
If you need further, more granular filtering, then you can use the filters tab on the right hand side. You can see several options to show only certain hosts, responses, sizes, and so on:
2. Can I modify the HTTP requests / responses as they pass by?
Definitely. There are various ways… the ways I use depend on my needs. The filter tab I show above has the ability to delete or set any request / response headers. This is perfect if you know exactly which header you need to experiment with.
You can also use the “request builder” tab. This is really useful if you have already captured some sample traffic and would like to change or fuzz it to see how the client/server responds. You simply need to click and drag the captured request, and you’re free to change it at will, and then just hit the “execute” button to send to the server:
As a side note, you can also easily change the user-agent that is sent to the server from the “rules > user-agent” menu
3. How can we tell which part of the page is taking so long to load?
Use the “timeline” tab. First, highlight the requests in the left hand pane that you’d like to monitor, then you should see how long each element takes to load:
4. Can we capture traffic generated by hosts where fiddler is not installed?
Yes. Fiddler can be used as a fully-fledged proxy, and you can allow remote PCs to use it. So you can set up a test lab where only one machine has Fiddler installed, while the other machines simply point to Fiddler in their proxy settings. This is especially useful if you need to debug Linux machines, since Fiddler is a Windows-only program.
Tools > fiddler options > connections > “Allow remote users to connect”
You can confirm this works by running the following in a CMD:
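Connectivity can also be checked from one of the remote machines. The following is a minimal sketch for a Linux client, assuming curl is installed and Fiddler is listening on its default port 8888. The function name and the hostname fiddler-pc are hypothetical, and example.com is just a test URL.

```shell
#!/bin/sh
# check_fiddler_proxy (hypothetical helper): request a page through the
# remote Fiddler instance and print the HTTP status code that comes back.
# Usage: check_fiddler_proxy <fiddler-host> [port]
check_fiddler_proxy() {
    host="$1"
    port="${2:-8888}"   # Fiddler's default listening port
    curl --silent --show-error --max-time 10 --output /dev/null \
         --write-out '%{http_code}\n' \
         --proxy "http://$host:$port" http://example.com/
}

# Example: check_fiddler_proxy fiddler-pc
```

If the request succeeds it should also show up as a new session in Fiddler’s session list; a connection error from curl suggests that remote connections are not being accepted or that a firewall is blocking the port.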
5. Miscellaneous Tips
- You’ll spend most of your time living in the “Inspectors” tab. I usually enable the “auto decode” option so the inspectors will show the captured data in cleartext:
- When testing, it’s always a good idea to clear your cache and cookies before going into another test iteration:
You can also press CTRL while pressing the above to clear cookies.
There seems to be a bit of confusion about configuring SQUID to transparently intercept SSL (read: HTTPS) connections. Some sites say it’s plain not possible:
Recent developments in SQUID have made this possible. This article explores how to set this up at a basic level. The SQUID proxy will basically act as a man in the middle. The motivation behind setting this up is to decrypt HTTPS connections in order to apply content filtering and so on.
There are concerns that transparently intercepting HTTPS traffic is unethical and can cause legal issues. True, and I agree that monitoring HTTPS connections without properly and explicitly notifying the user is bad. However, we can use technical means to ensure that the user is properly notified and is even prompted to accept monitoring or back out. More on this towards the end of the article.
So, on to the technical details of setting the proxy up. First, install the dependencies. We will need to compile SQUID from source since by default it is not built with the necessary switches. I recommend downloading the latest 3.1 version, especially if you want to notify users about the monitoring. In Ubuntu:
apt-get install build-essential libssl-dev
Note: for CentOS users, use openssl-devel rather than libssl-dev
build-essential installs the compilers, while libssl-dev provides the SSL libraries that enable SQUID to intercept the encrypted traffic. libssl-dev is needed during compilation; without it, running make produces errors similar to the following in the console:
error: ‘SSL’ was not declared in this scope
Download and extract the SQUID source code from their site. Next, configure, compile and install the source code using:
./configure --enable-icap-client --enable-ssl
Note the switches I included in the configure command:
* enable-icap-client : we’ll need this to use ICAP to provide a notification page to clients that they are being monitored.
* enable-ssl : this is a prerequisite for SslBump, which squid uses to intercept SSL traffic transparently
Once SQUID has been installed, a very important step is to create the certificate that SQUID will present to the end client. In a test environment, you can easily create a self-signed certificate using OpenSSL by using the following:
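A minimal sketch of such a command, assuming OpenSSL is installed. The function name, key size, validity period and subject are illustrative choices; the file layout simply mirrors the cert= and key= paths used in the configuration further down.

```shell
#!/bin/sh
# make_self_signed_cert (hypothetical helper): create an unencrypted private
# key and a matching self-signed certificate for the given common name.
# Usage: make_self_signed_cert <common-name> <output-dir>
make_self_signed_cert() {
    cn="$1"
    dir="$2"
    mkdir -p "$dir/private" &&
    openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 \
        -subj "/CN=$cn" \
        -keyout "$dir/private/$cn.pem" \
        -out "$dir/$cn.pem"
}

# Example: make_self_signed_cert www.sample.com /etc/squid/ssl_cert
```

The -nodes switch leaves the key without a passphrase so that the proxy can start unattended, which means the private directory should be readable only by the account running SQUID.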
This will of course cause the client browser to display an error:
In an enterprise environment you’ll probably want to generate the certificate using a CA that the clients already trust. For example, you could generate the certificate using Microsoft’s CA and use certificate auto-enrolment to push the certificate out to all the clients in your domain.
Onto the actual SQUID configuration. Edit the /etc/squid.conf file to show the following:
always_direct allow all
ssl_bump allow all
http_port 18.104.22.168:3128 transparent
#the below should be placed on a single line
https_port 22.214.171.124:3129 transparent ssl-bump cert=/etc/squid/ssl_cert/www.sample.com.pem key=/etc/squid/ssl_cert/private/www.sample.com.pem
Note that you may need to change “cert=” and “key=” to point to the correct files in your environment. You will also need to change the IP addresses to the address your proxy listens on.
The first directive (always_direct) is due to SslBump. By default ssl_bump is set to accelerator mode. In debug logs cache.log you’d see “failed to select source for”. In accelerator mode, the proxy does not know which backend server to use to retrieve the file from, so this directive instructs the proxy to ignore the accelerator mode. More details on this here:
The second directive (ssl_bump) instructs the proxy to allow all SSL connections, but this can be modified to restrict access. You can also use “sslproxy_cert_error” to deny access to sites with invalid certificates. More details on this here:
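As an illustration of both directives, a more restrictive setup might look like the fragment below. This is a sketch only: the ACL names, the address range and the domain are placeholders, and the rules would replace the plain “ssl_bump allow all” line rather than sit next to it.

```
# squid.conf fragment; ACL names, addresses and domains are assumptions
# do not bump connections to these destination addresses
acl no_bump dst 192.0.2.0/24
ssl_bump deny no_bump
ssl_bump allow all

# tolerate certificate errors only for a known internal site
acl broken_but_trusted dstdomain .intranet.example
sslproxy_cert_error allow broken_but_trusted
sslproxy_cert_error deny all
```

The first ACL matches on destination address rather than domain name because a transparently intercepted connection does not carry a CONNECT request for the proxy to read the hostname from.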
Start squid and check for any errors. If no errors are reported, run:
netstat -nap | grep 3129
to make sure the proxy is up and running. Next, configure iptables to perform destination NAT, basically to redirect the traffic to the proxy:
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 126.96.36.199:3128
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 443 -j DNAT --to-destination 188.8.131.52:3129
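Since the interface name and proxy address differ per environment, it can help to generate the two rules from parameters and review them before applying. The function below is a minimal sketch of that idea; its name is made up, 192.0.2.10 in the example is a placeholder address, and the default ports are the ones used in the squid.conf above.

```shell
#!/bin/sh
# print_intercept_rules (hypothetical helper): print the DNAT rules that
# redirect HTTP and HTTPS traffic to the proxy, without applying them.
# Usage: print_intercept_rules <interface> <proxy-ip> [http-port] [https-port]
print_intercept_rules() {
    iface="$1"
    proxy_ip="$2"
    http_port="${3:-3128}"
    https_port="${4:-3129}"
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport 80 -j DNAT --to-destination $proxy_ip:$http_port"
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport 443 -j DNAT --to-destination $proxy_ip:$https_port"
}

# Review the rules first:
#   print_intercept_rules eth0 192.0.2.10
# Then apply them as root:
#   print_intercept_rules eth0 192.0.2.10 | sh
```

The destination address and ports must match the addresses that http_port and https_port listen on, otherwise the redirected traffic never reaches SQUID.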
The last thing to be done was to either place the proxy physically in line with the traffic or redirect the traffic to the proxy using a router. Keep in mind that the proxy will change the source IP address of the requests to its own IP. In other words, by default it does not reflect the client IP.
That was it in my case. I also tried to implement something similar using explicit mode. This was my squid.conf file; note that only one port is needed for both HTTP and HTTPS, since HTTPS is tunneled over HTTP using the CONNECT method:
always_direct allow all
ssl_bump allow all
#the below should be placed on a single line
http_port 8080 ssl-bump cert=/etc/squid/ssl_cert/proxy.testdomain.deCert.pem key=/etc/squid/ssl_cert/private/proxy.testdomain.deKey_without_Pp.pem
As regards my previous discussion of notifying users that they are being monitored, consider using greasyspoon:
With this in place, you can instruct GreasySpoon to send a notify page to the clients. If they accept this notify page, a cookie (let’s say the cookie is called “NotifySSL”) is set. GreasySpoon can then check for the presence of this cookie in subsequent requests and, if it is present, allow the connection. If the cookie is not present, users get the notify page again. Due to security considerations, cookies are usually only valid for one domain, so you may end up with users having to accept the notify page for each different domain they visit. However, you can use GreasySpoon in conjunction with a backend MySQL database or similar to record source IP addresses that have been notified and perform IP-based notifications. Anything is possible.
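The cookie check itself is simple. The sketch below shows only the decision logic in plain shell; it is not GreasySpoon code, and the function names are made up for illustration. It takes the raw value of a request’s Cookie header and decides whether to let the request through or send the notify page.

```shell
#!/bin/sh
# has_notify_cookie (hypothetical): succeed if the Cookie header value in $1
# contains a cookie named NotifySSL.
has_notify_cookie() {
    # Prefix with "; " so that the first cookie is matched the same way as
    # later ones and names such as XNotifySSL are not mistaken for it.
    case "; $1" in
        *"; NotifySSL="*) return 0 ;;
        *) return 1 ;;
    esac
}

# notify_decision (hypothetical): print "allow" or "notify" for a request.
notify_decision() {
    if has_notify_cookie "$1"; then
        echo allow
    else
        echo notify
    fi
}

# Example: notify_decision 'lang=en; NotifySSL=accepted'
```

An IP-based variant would replace the cookie test with a lookup of the client’s source address in the backend database.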