Friday, January 8, 2010

DansGuardian Content Filtering with AD Integration

For our web/content filtering at work, we’ve used DansGuardian for several years with good success.  Originally, we used a small service called identD on each of our Windows 2000 and XP PC’s.  When Vista came out, the service failed to install.  We were also running into complications as we were starting to move towards multi-user Terminal Services environments (the identD service was machine-centric, not user-centric).
I toyed with porting the service over to .NET to run on Vista, but ultimately, I decided that it was time to bite the bullet and use a built-in authentication method called NTLM, which could be facilitated by the Squid proxy server.
The following steps were used to configure the system on a Debian 4.0 (Etch) server. The process was largely borrowed from an article on HowToForge.

Theory of Operation

DansGuardian is a web filtering program that watches the content of a web page, and based on a number of criteria, decides whether or not to block the page. Unfortunately, DansGuardian makes it very clear that it is NOT a proxy or transport program, and therefore needs a very close tie-in with the Squid proxy server.
Traditionally, DansGuardian is configured to listen on port 8080, and Squid is configured to listen on localhost, port 3128. A client PC is then set up to go through the proxy host on port 8080. When a request is made, DansGuardian passes the request on to Squid on port 3128, and filters the content based on Squid’s reply.
[Figure: DansGuardian flow chart, old configuration]
In order to introduce NTLM authentication into the process, we have to utilize Squid’s NTLM_Auth functionality, and therefore the PC needs to talk directly to Squid. The flow now runs as follows: the PC talks to Squid on port 8080, which handles the NTLM authentication. Squid passes the request (and the username information) to DansGuardian on port 8081, which in turn sends the request back to Squid on the secondary port 3128. That second request to Squid is what is actually passed on to the Internet for its reply.

[Figure: DansGuardian flow chart]
The solution is a bit complicated, but in general works quite well with Internet Explorer and Firefox. The problem comes into play when a browser is not NTLM-enabled. This actually happens more frequently than you might think in real life, as most of the Java runtimes don’t seem to be compliant. This causes a number of problems when it comes time to run various Java applets over the Internet, or when trying to hold a Java-based webinar.
Technically, it will work, as Squid is designed to fall back to Basic authentication and the browser will prompt the user for their domain login credentials. However, that is annoying to the user at best, and a bad security practice at worst. It really isn’t a good idea to tell staff that it is okay or even normal to provide their domain credentials whenever a web application asks for them.
Hence, we created an ACL group within the Squid configuration that is coded to look for the headers from a group of known non-compliant “browsers” such as Java and Google’s Chrome. This workaround is explained in further detail throughout the rest of this article.

A Few Assumptions

For the sake of example, I will use the following names when defining my network going forward.
  • acme.local – This is the local (internal) domain suffix.
  • ACME – The old NetBIOS name for the domain.
  • etch1.acme.local – This is the Debian (Etch) server that DansGuardian is being installed on.
  • dc1.acme.local – This is a Microsoft Server Active Directory Domain Controller.
  • / – Class C subnet the PC’s are located in.
  • – Windows DNS name server 1.
  • – Windows DNS name server 2.

Install the Necessary Packages

This article assumes that a Debian 4.0 (Etch) system is up and running, with basic network connectivity established. For the enhanced DansGuardian logging features to be used (covered in a separate post), Apache, MySQL, and PHP will need to be installed, and it is helpful to have phpMyAdmin available for configuration and testing of the database.
To install and setup the filtering proxy server, use aptitude to install the following packages.

  • squid

  • dansguardian

  • samba

  • winbind

  • krb5-user

  • ntp

  • ntpdate
During the installation, it will prompt you for the following parameters. Answer them as follows:
Please specify the workgroup you want this server to appear to be in when queried by clients. acme.local
Modify smb.conf to use WINS settings from DHCP? No
Kerberos servers for your realm: dc1.acme.local
Administrative server for your Kerberos realm: dc1.acme.local
Once all of the packages are installed, run the following command to configure Kerberos.
dpkg-reconfigure krb5-config
It will then prompt you with the following questions. Answer them as follows:
Default Kerberos version 5 realm: acme.local
Does DNS contain pointers to your realm's Kerberos Servers? Yes

Configure the Name Resolution

We aren’t using the proxy server as a direct gateway to the Internet (with multiple NIC’s and such), so at this point, we deviated from the HowToForge article. In our case, it’s important that the /etc/resolv.conf file is pointed at our internal DNS servers, and that the search domain is configured correctly. Our resolv.conf file should look like this:
search acme.local
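For reference, a complete resolv.conf has one search line followed by one nameserver line per DNS server. The sketch below just prints that layout. The function name render_resolv_conf is made up for illustration, and the 192.0.2.x addresses in the usage line are documentation placeholders, not our real name servers.

```shell
# Print resolv.conf content: a search domain, then one nameserver per argument.
# render_resolv_conf is a hypothetical helper, not part of any package.
render_resolv_conf() {
  domain=$1
  shift
  printf 'search %s\n' "$domain"
  for ns in "$@"; do
    printf 'nameserver %s\n' "$ns"
  done
}

# Usage (placeholder IPs; review the output before writing it to /etc/resolv.conf):
#   render_resolv_conf acme.local 192.0.2.10 192.0.2.11
```

Printing to standard output rather than writing the file directly makes it easy to check the result first.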

Synchronize the Time with the Windows Domain

Next we need to configure the NTP client to pull from our internal domain controller. Edit the /etc/ntp.conf file, and find the section near the top that lists the server(s). Be sure the default servers are commented out, and add a new line:
server dc1.acme.local iburst
Now that the client is configured, initiate a synchronization with the following command:
net time set -S dc1.acme.local
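The reason the clock matters is Kerberos: by default it rejects authentication when the client and domain controller clocks differ by more than about five minutes. Here is a minimal sketch of that comparison, assuming you already have both times as Unix epoch seconds. clock_skew_ok is a hypothetical helper, and the 300-second default mirrors the usual Kerberos tolerance rather than anything read from our configuration.

```shell
# Succeed if two epoch timestamps are within a tolerance of each other.
# $1 = local time, $2 = domain controller time, $3 = max skew in seconds (default 300).
# clock_skew_ok is a hypothetical helper for illustration only.
clock_skew_ok() {
  max_skew=${3:-300}
  skew=$(( $1 - $2 ))
  if [ "$skew" -lt 0 ]; then
    skew=$(( 0 - skew ))
  fi
  [ "$skew" -le "$max_skew" ]
}

# Usage, with dc_epoch obtained by whatever means you prefer:
#   clock_skew_ok "$(date +%s)" "$dc_epoch" || echo "clock skew too large" >&2
```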

Configure Samba

It’s always a good idea to make a backup copy of the original configuration file that is installed with a new package. Make a backup copy, and vi the original file.
cp /etc/samba/smb.conf /etc/samba/smb.conf.original

Make the following changes to the /etc/samba/smb.conf

  • On line 53, set the interfaces =

  • Uncomment line 59.

  • Uncomment line 91 and change to security = ads.

  • Uncomment lines 204 and 205.

  • Add the following lines before line 217:
winbind trusted domains only = yes
winbind cache time = 3600
Now restart the Samba and Winbind daemons.
/etc/init.d/samba restart
/etc/init.d/winbind restart
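Line numbers in smb.conf can shift between package versions, so it may be worth confirming the settings by name after editing. The sketch below is one way to do that, assuming a simple "name = value" layout. smb_get is a hypothetical helper written for this post, not a Samba tool.

```shell
# Print the value of an active (uncommented) parameter from an smb.conf-style file.
# $1 = config file, $2 = parameter name. Lines starting with # or ; are ignored
# because the pattern requires the parameter name to come first on the line.
# smb_get is a hypothetical helper for illustration only.
smb_get() {
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Usage:
#   smb_get /etc/samba/smb.conf security              # should print: ads
#   smb_get /etc/samba/smb.conf "winbind cache time"  # should print: 3600
```

Samba's own testparm utility is the more thorough check; this is only a quick way to eyeball a single value.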

Join the Domain

To join the domain, use the net command
net ads join -U Administrator
As I recall, it prompts you for a user’s credentials who has the authority to join a domain. This can be any Domain Admin account that has privileges to join the domain, such as Administrator.
To test the join, you can issue the following commands and see their output:
etch1:~# wbinfo -t
checking the trust secret via RPC calls succeeded
etch1:~# wbinfo -u


etch1:~# wbinfo -g
ACME\domain computers
ACME\group policy creator owners
ACME\domain guests
At this point, you can be assured that the Linux box is a member of the domain, and that it is talking to the domain controllers.
You can also launch the Active Directory Users and Computers snap-in on a Windows PC, and expand the Computers branch. You should see the name of the Linux server listed as a “Computer”.
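If the user or group list is long, a small filter saves scrolling. This sketch assumes the default winbind separator (a backslash, as in the group output above) and reads the wbinfo output on standard input. has_domain_entry is a hypothetical helper name.

```shell
# Succeed if DOMAIN\name appears as a whole line on standard input.
# $1 = NetBIOS domain, $2 = user or group name. The match is case-insensitive.
# has_domain_entry is a hypothetical helper for illustration only.
has_domain_entry() {
  grep -F -i -x -q -- "$1\\$2"
}

# Usage:
#   wbinfo -g | has_domain_entry ACME "domain guests" && echo "group found"
```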

Configure the Squid Proxy Server

This is the section that I had the most trouble with. I wound up making a copy of the original configuration file with all of the default settings and comments, and then stripped them all out for my running configuration. It just got to be a hassle scanning through 3000 lines of comments for 30 lines of configuration.
cp /etc/squid/squid.conf /etc/squid/squid.conf.original
Then I used a grep command to eliminate the comments.
grep -v "^#" squid.conf > squid.conf.clean
This left a lot of blank lines in the file, but they were easy enough to remove with UltraEdit. After the configuration changes, I was left with the following configuration file:
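In hindsight, grep can drop the blank lines in the same pass, so no separate editor step is needed. A minimal sketch, with strip_config as a made-up function name; note that unlike the command above, it also removes comments that are indented.

```shell
# Print a config file without comment lines and without blank lines.
# A line is dropped if, after optional leading whitespace, it starts
# with # or is empty. strip_config is a hypothetical helper name.
strip_config() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Usage:
#   strip_config /etc/squid/squid.conf.original > squid.conf.clean
```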
# Squid configuration file -- Stripped of comments for clarity

# There are actually two proxies running - 1 for Dansguardian
# (from localhost) and the other for the masses
# The transparent proxy is bound to the localhost IP and listens on 3128
http_port transparent

# This one is bound to all IP's, and listens on port 8080. Port 8080
# is the default Dansguardian port. In
# our case, Dans has been reconfigured to use port 8081 instead to
# avoid confusion.
http_port 8080

# This parameter tells squid to pass the login credentials through to Dans
cache_peer parent 8081 0 no-query login=*:nopassword

# The following 7 lines are default Squid configuration
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
access_log /var/log/squid/access.log squid
hosts_file /etc/hosts

# The following 3 lines configure NTLM authentication for browsers.
# This is the primary method used for proxy authentication
auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
auth_param ntlm children 5
auth_param ntlm keep_alive on

# This is a failsafe authentication in case the client application
# doesn't support NTLM. It uses Basic
# authentication and still authenticates off of the same ntlm_auth piece
auth_param basic program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-basic
auth_param basic children 5
auth_param basic realm Squid proxy-caching web server
auth_param basic credentialsttl 2 hours

# The following 25 lines are default Squid configuration
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320
acl all src
acl manager proto cache_object
acl localhost src
acl to_localhost dst
acl SSL_ports port 443 # https
acl SSL_ports port 563 # snews
acl SSL_ports port 873 # rsync
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl Safe_ports port 631 # cups
acl Safe_ports port 873 # rsync
acl Safe_ports port 901 # SWAT
acl purge method PURGE

# These are custom configurations for our environment.
# First we are creating an ACL group for people who were
# authenticated by the NTLM
acl ntlm_users proxy_auth REQUIRED

# This is a generic ACL of valid IP addresses on our network
# that have access to the proxy
acl our_networks src

# Some browsers don't support NTLM authentication. Rather
# than harass the user with pop-up's, we are excepting
# out known browser issues from the NTLM credentials.
# We know that Java generally does not support NTLM
# (although some newer versions may)
acl non_ntlm browser Java/1.4 Java/1.5 Java/1.6

# Oddly enough, Google's Chrome browser does not support NTLM
# authentication
acl non_ntlm browser Chrome

# The following 6 lines are default Squid configuration
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports

# Now we're actually allowing appropriate users access the proxy.
# The first step is to except out the non_ntlm browsers that
# were defined above. This bypasses that authentication
# scheme before it gets to the allowance of ntlm_users
http_access allow non_ntlm

# We want the localhost to be able to proxy
http_access allow localhost

# And finally, this is the line that allows anyone on
# our network, that has been authenticated by the NTLM piece to
# get through. It's not real intuitive, but it seems
# that it only authenticates the browser when it actually gets
# to this line. In other words, non_ntlm browsers that
# were allowed above don't get prompted.

# Note that any browser that bypasses the NTLM authentication
# will show up in the logs without a username.
http_access allow our_networks ntlm_users

# The following 4 lines are default Squid configuration
http_reply_access allow all
icp_access allow all
cache_effective_group proxy
coredump_dir /var/spool/squid
The important lines are near the top. I’ve included my own comments to explain what was actually changed and why I did that.
At this point, you can start the Squid proxy server, although it won’t really do anything until DansGuardian is set up.
/etc/init.d/squid restart
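Squid's browser ACL type matches a regular expression against the User-Agent header of each request. Before adding a new pattern to the non_ntlm group, it can be handy to test a User-Agent string against the existing patterns from the shell. The sketch below mirrors the patterns from the configuration above; is_non_ntlm_agent is a hypothetical helper, and the sample strings in the usage lines are only illustrative.

```shell
# Succeed if a User-Agent string matches one of the non_ntlm patterns
# used in the squid.conf above (Java/1.4, Java/1.5, Java/1.6, Chrome).
# is_non_ntlm_agent is a hypothetical helper for illustration only.
is_non_ntlm_agent() {
  printf '%s\n' "$1" | grep -E -q 'Java/1.4|Java/1.5|Java/1.6|Chrome'
}

# Usage:
#   is_non_ntlm_agent "Mozilla/4.0 (Windows XP 5.1) Java/1.6.0_17" && echo "bypasses NTLM"
#   is_non_ntlm_agent "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" || echo "must authenticate"
```

Keep in mind that anything matching these patterns gets through without a username, so the list is best kept short.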

Set the Permissions for the winbindd_privileged Directory

Squid needs access to /var/run/samba/winbindd_privileged. We can easily fix this, but the permissions will reset when we reboot. Jesse Waters posted a script that will set the permissions on every system boot.
Create a file named /etc/init.d/ and paste the following into it.
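#!/bin/sh
#set -x

# Case labels below follow the usual Debian init script skeleton
WINBINDD_PRIVILEGED=/var/run/samba/winbindd_privileged

chmodgrp() {
  chgrp proxy $WINBINDD_PRIVILEGED || return 1
  chmod g+w $WINBINDD_PRIVILEGED || return 1
}

case "$1" in
  start)
    chmodgrp
    ;;
  stop)
    ;;
  restart|reload|force-reload)
    echo "Error: argument '$1' not supported" >&2
    exit 3
    ;;
  *)
    echo "Usage: $0 start|stop" >&2
    exit 3
    ;;
esac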
Add this script to the init scripts to run at boot time by running this command:
update-rc.d start 21 2 3 4 5 .
Go ahead and execute this script to set the permissions.
/etc/init.d/ start
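To confirm that the permissions actually took, you can check the group-write bit on the directory. This is a minimal sketch assuming GNU find, which is what Debian ships. group_writable is a hypothetical helper, and it only checks the mode bit, not which group owns the directory.

```shell
# Succeed if the given file or directory has the group-write bit set.
# group_writable is a hypothetical helper for illustration only.
group_writable() {
  [ -n "$(find "$1" -maxdepth 0 -perm -020 2>/dev/null)" ]
}

# Usage:
#   group_writable /var/run/samba/winbindd_privileged || echo "group write bit is missing" >&2
#   ls -ld /var/run/samba/winbindd_privileged   # the group column should read: proxy
```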

Configure the DansGuardian Web Filter

Again, I deviated quite a bit from the original article, primarily because I already had quite a bit of experience with DansGuardian.
cp /etc/dansguardian/dansguardian.conf /etc/dansguardian/dansguardian.conf.original
Now vi the dansguardian.conf file and make the following changes:

  • On line 3, comment out the item that says “UNCONFIGURED…”

  • On line 44, change the logfileformat to 2 (CSV-style format)

  • On line 62, change the filterport to 8081

  • On line 102, I set my filtergroups to 3. We have three groups in our environment – IT, Management, and everyone else. IT gets nearly everything, such as .EXE, .ZIP, .DOC, and .XLS files. Management gets a little less, such as only .DOC and .XLS files. Everyone else doesn’t get to download many of the privileged file types, such as Microsoft Office documents or executables.
Configure the three groups as appropriate. Start by defining the groups. Vi the /etc/dansguardian/filtergroupslist file. Our list looks like this:
#filter1 = DEFAULT USERS (Everyone in this group unless otherwise noted)
#filter2 = IT Users (Can do just about anything)

#filter3 = Management (Most things except EXE's)
Now begin customizing the setup of the individual groups by vi’ing the dansguardianf1.conf file. Make the following changes:
Near the top, set the Banned Extension List and Banned MIME Type List to be specific to the filter group.
bannedextensionlist = '/etc/dansguardian/filter1/bannedextensionlist'
bannedmimetypelist = '/etc/dansguardian/filter1/bannedmimetypelist'
I have our naughtynesslimit set to 125, which seems to be appropriate for a workplace environment.
Now make duplicate dansguardianf1.conf files for each of the three groups.
cp /etc/dansguardian/dansguardianf1.conf /etc/dansguardian/dansguardianf2.conf
cp /etc/dansguardian/dansguardianf1.conf /etc/dansguardian/dansguardianf3.conf
Go in and edit both the dansguardianf2.conf and dansguardianf3.conf files to have the appropriate filter directory for the Banned Extension List and Banned MIME Type List. For example, dansguardianf2.conf should read:
bannedextensionlist = '/etc/dansguardian/filter2/bannedextensionlist'
bannedmimetypelist = '/etc/dansguardian/filter2/bannedmimetypelist'
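If you would rather not edit each copy by hand, sed can generate the per-group file and rewrite the directory in one step. This is a sketch under the assumption that dansguardianf1.conf has already been edited so its list paths contain /filter1/. clone_filter_conf is a hypothetical helper name.

```shell
# Create dansguardianfN.conf from dansguardianf1.conf, pointing the
# list paths at /filterN/ instead of /filter1/.
# $1 = DansGuardian config directory, $2 = filter group number.
# clone_filter_conf is a hypothetical helper for illustration only.
clone_filter_conf() {
  sed "s#/filter1/#/filter$2/#g" "$1/dansguardianf1.conf" > "$1/dansguardianf$2.conf"
}

# Usage:
#   clone_filter_conf /etc/dansguardian 2
#   clone_filter_conf /etc/dansguardian 3
```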
Create the three filter directories and copy the two files from the root DansGuardian directory into these new folders.
mkdir /etc/dansguardian/filter1
mkdir /etc/dansguardian/filter2
mkdir /etc/dansguardian/filter3

cp /etc/dansguardian/bannedextensionlist /etc/dansguardian/filter1
cp /etc/dansguardian/bannedmimetypelist /etc/dansguardian/filter1
cp /etc/dansguardian/bannedextensionlist /etc/dansguardian/filter2
cp /etc/dansguardian/bannedmimetypelist /etc/dansguardian/filter2
cp /etc/dansguardian/bannedextensionlist /etc/dansguardian/filter3
cp /etc/dansguardian/bannedmimetypelist /etc/dansguardian/filter3
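The same nine commands can be written as a short loop, which scales better if you add more filter groups later. A minimal sketch; make_filter_dirs is a made-up function name, and the base directory is passed in as an argument so it can be tried against a scratch directory first.

```shell
# For each group number given, create <base>/filterN and copy the two
# stock list files into it.
# $1 = DansGuardian config directory, remaining arguments = group numbers.
# make_filter_dirs is a hypothetical helper for illustration only.
make_filter_dirs() {
  base=$1
  shift
  for n in "$@"; do
    mkdir -p "$base/filter$n" || return 1
    cp "$base/bannedextensionlist" "$base/bannedmimetypelist" "$base/filter$n/" || return 1
  done
}

# Usage:
#   make_filter_dirs /etc/dansguardian 1 2 3
```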
For each of the three groups, edit the bannedextensionlist and bannedmimetypelist in that group’s directory (for example, /etc/dansguardian/filter1/bannedextensionlist) as appropriate. The list files that come with DansGuardian include virtually every possible extension. If it’s listed in the file, then those users won’t be able to download that type. Customize the list by commenting out extensions that you DO want these users to be able to download.
I recommend that management has at least .DOC, and .XLS capabilities, or else they will be calling IT on a regular basis.
IT generally has unique needs, and so they should additionally be given access to download .ZIP, .EXE, and .ISO files.
Finally, we found that some websites consistently come up as blocked, even though they’re legitimate or even directly business related. For those sites, edit the /etc/dansguardian/exceptionsitelist and add the domain name for the offending servers. We currently have about 120 domains that we are explicitly excepting out of the filter for various reasons. Some general examples are (for patches), (for Acrobat Reader) and so on.

Configuring the Client PC’s

The only configuration that we have to do on Windows 2000 or Windows XP machines is to configure the proxy server settings. There is a DNS alias A record that points the proxy.acme.local address back to the Etch1 server. The actual settings are normally done from within Desktop Authority with a KIX script named win-ie.kix.
To manually make the change, open Internet Explorer. Click on Tools and click on Internet Options. Select the Connections tab from the top of the screen. Click on LAN settings. On the bottom half of the screen, check the box to Use a proxy server… and set the address to proxy.acme.local. The port should be set to 8080.
For Windows Vista and Windows 7 machines, there is one additional step that is required. By default, Vista doesn’t allow NTLM version 1 traffic to pass. Unfortunately, the Samba project is not quite ready to support version 2, so we have to reconfigure Vista PC’s.
Go to Start-Run and type in secpol.msc. The Local Security Policy manager screen will come up. Drill down into Local Policies, and then click on Security Options. In the list on the right is Network security: LAN Manager authentication level. Change this option to Send LM & NTLM – use NTLMv2 session security if negotiated. Click OK and you can close out of the Local Security Policy editor.

Troubleshooting Domain Connectivity/Authentication

To troubleshoot the domain connectivity, refer to the section titled Join the Domain. First of all, you should see the server name listed in Active Directory Users and Computers, which indicates that your server has registered with the domain. If it isn’t listed, it’s possible that Samba wasn’t configured correctly, or that /etc/resolv.conf is not pointing to the correct internal DNS servers.
If you haven’t done so recently, it’s never a bad idea to restart the Samba and Winbind daemons.
/etc/init.d/samba restart
/etc/init.d/winbind restart
Another potential problem could occur if the server has the same name as a previous machine on the domain. It may become necessary to delete the computer’s registration from the domain by opening Active Directory Users and Computers, drilling down into Computers, and deleting the computer that has the name of this server. At that point, you would need to re-join the domain by issuing:
net ads join -U Administrator
Running the wbinfo command with the -t, -u, and -g options should verify the trust and enumerate all of the users and groups on the domain. You can also manually attempt an authentication of a Windows domain account by issuing the ntlm_auth command.
ntlm_auth --username=ituser1 --domain=ACME
It will prompt you for the password for the selected account, and assuming that the username and password were correct, it should return:
NT_STATUS_OK: Success (0x0)
If Squid suddenly stops authenticating users, but the above commands continue to work, then it is probably a permissions issue between Squid and Winbind. The /etc/init.d/ script shown in the Set the Permissions for the winbindd_privileged Directory may not be running, or may not be running at the correct point during startup. You can manually run that script at any time by issuing:
/etc/init.d/ start

Troubleshooting Squid Configuration

First of all, be sure that the server has Internet connectivity. It should be able to ping an external site by name. A successful ping tells you that DNS name resolution is working, as well as the routing out to a foreign network.
If you are having troubles with getting Squid running, it’s best to bypass all authentication schemes. To do so, make the following changes in the /etc/squid/squid.conf file.

  • Comment out all seven lines starting with auth_param to disable NTLM and Basic authentication

  • In the ACL section, replace the line http_access allow our_networks ntlm_users with the following:
http_access allow our_networks
Then restart Squid.
/etc/init.d/squid restart
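Both edits can be made with a single sed command if you prefer not to do them by hand. The sketch below writes the modified configuration to standard output so you can review it before swapping it in; disable_squid_auth is a hypothetical helper name, and it assumes the http_access line is spelled exactly as in the configuration shown earlier.

```shell
# Print a copy of a squid.conf with authentication bypassed:
#  - every line starting with auth_param is commented out
#  - the authenticated allow rule is replaced with a plain network allow
# disable_squid_auth is a hypothetical helper for illustration only.
disable_squid_auth() {
  sed -e 's/^auth_param/#&/' \
      -e 's/^http_access allow our_networks ntlm_users$/http_access allow our_networks/' \
      "$1"
}

# Usage:
#   disable_squid_auth /etc/squid/squid.conf > /tmp/squid.conf.noauth
```

Remember to put the original file back once Squid is confirmed working, or the proxy will stay open to everyone on the network without a username.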

Troubleshooting DansGuardian

DansGuardian itself has proven to be very straightforward in terms of configuration. In the past, we have seen some instability over long periods of time, so I generally have a cron job set up to restart the service in the middle of the night. I add the following line to my /etc/crontab file.
30 1 * * * root /etc/init.d/dansguardian restart

Client Connectivity Issues

Be sure that the proxy server is configured on the web browser. In Internet Explorer, it’s under Tools-Internet Options, Connections tab. Click on LAN settings. Be sure you have the Fully Qualified Domain Name listed, as well as port 8080. For some sites, especially internal intranets, it may be necessary to create exceptions under the Advanced button.
If you go to a website, and a box now pops up asking for login credentials, it means that either your browser isn’t configured to support NTLM authentication, Squid isn’t configured correctly, or your PC isn’t talking in NTLM version 1.
The incompatible browser tends to be the most common. Sometimes it’s not the browser itself, but a plug-in that gets called, such as Java. In that case, you will need to determine the User-Agent string that is being passed to Squid, and add it to the list of non_ntlm ACL exceptions. If you aren’t sure which agent name is being passed, the simplest way to find out is to use Wireshark to sniff the traffic. You will probably see something similar to this in the trace, usually in the first packet that is sent to the server.
User-Agent: Mozilla/4.0…
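If you save the request text from the trace to a file, pulling out just the header value is a one-liner. A minimal sketch that reads raw HTTP request text on standard input; extract_user_agent is a hypothetical helper, and request.txt in the usage line is a made-up file name.

```shell
# Print the value of the User-Agent header from raw HTTP request text
# read on standard input. Carriage returns are stripped first because
# HTTP headers end in CRLF.
# extract_user_agent is a hypothetical helper for illustration only.
extract_user_agent() {
  tr -d '\r' | sed -n 's/^[Uu]ser-[Aa]gent:[[:space:]]*//p'
}

# Usage:
#   extract_user_agent < request.txt
```

Whatever it prints is the string to compare against the patterns in the non_ntlm ACL.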
If all browsers are having the problem, including a modern version of Internet Explorer, you may have NTLM version 1 disabled, especially if this is a Vista PC. If this is the case, see the section titled Configuring the Client PC’s where it talks about using secpol.msc to change the network security parameter.


This configuration of DansGuardian has been in production for us for about a year now, and is working just fine.  The trickiest part of this is really understanding the two portions of the Squid proxy server, and how they interact with DansGuardian.