Learning Academy

The Cure of Ignorance is to Question. MUHAMMAD (PBUH)

Content Filtering

SquidGuard, My Favorite Content Filter

squidGuardIntroduction

squidGuard is a combined filter, redirector and access controller plugin for Squid. It is

  • free (GPLv2)
  • very flexible 
  • extremely fast *)
  • easily installed
  • portable

squidGuard can be used to

  • limit the web access for some users to a list of accepted/well known web servers and/or URLs only.
  • block access to some listed or blacklisted web servers and/or URLs for some users. **)
  • block access to URLs matching a list of regular expressions or words for some users. **)
  • enforce the use of domainnames/prohibit the use of IP address in URLs. **)
  • redirect blocked URLs to an “intelligent” CGI based info page. **)
  • redirect unregistered user to a registration form.
  • redirect popular downloads like Netscape, MSIE etc. to local copies.
  • redirect banners to an empty GIF. **)
  • have different access rules based on time of day, day of the week, date etc.
  • have different rules for different user groups.
  • and much more..

Neither squidGuard nor Squid can be used to

  • filter/censor/edit text inside documents
  • filter/censor/edit embeded scripting languages like JavaScript or VBscript inside HTML

 

*) 100,500 requests in 2 seconds on a AMD Athlon 3700+ with lists of 5643 domains 7442 urls 13780 total
**) squidGuard is not a porn or banner filter/blocker, but it is very well suited for these purposes too.

Capabilities

squidGuard has many powerful configuration options that lets you:

1)   define different time spaces based on any reasonable combination of

  • time of day (00:00-08:00 17:00-24:00)
  • day of the week (sa)
  • date (1999-05-13)
  • date range (1999-04-01-1999-04-05)
  • date wildcards (*-01-01 *-05-17 *-12-25)

 

2)   group sources (users/clients) into distinct categories like “managers”, “employees”, “teachers”, “students”, “customers”, “guests” etc. based on any reasonable combination of

  • IP address ranges with
  • prefix notation (172.16.0.0/12)
  • netmask notation (172.16.0.0/255.240.0.0)
  • first-last notation (172.16.0.11-172.16.0.35)
  • address lists (172.16.134.54 172.16.156.23 …)
  • domain lists (foo.bar.com …) *)
  • user id lists (weho sdgh dfhj asef …) **)
  • and optionally link the group to a given time space
  • positively (within business-hours)
  • negatively (outside leisure-time)

 

3)   group destinations (URLs/servers) into distinct categories like “local”, “customers”, “vendors”, “banners”, “banned” etc. based on

  • an unlimited number of unlimited lists of
  • domains, including subdomains (foo.bar.com)
  • hosts (host.foo.bar.com)
  • directory URLs, including subdirectories (foo.bar.com/some/dir)
  • file URLs (foo.bar.com/somewhere/file.html)
  • regular expressions ((expr1|expr2|…))

and optionally link the group to a given time space:

  • positively (within business-hours)
  • negatively (outside leisure-time)

 

4)   rewrite/redirect URLs based on any reasonable combination of

string/regular expression editing à la sed with

  • silent squid redirecting rewrite (s@from@to@[i])
  • visible client redirecting rewrite (s@from@to@[i]r) ***)

URL replacement with

  • silent squid redirect to a common URL (redirect “new_url”)
  • visible client redirect to a common URL (redirect “302:new_url”) ***)

activated by

  • 1-1 URL redirection
  • destination group match
  • a fallback/default for blocked URLs
  • a fallback/default for blocked/unknown clients

and optionally with

  • runtime string substitution à la strftime or printf

5)   define access control lists (acl) based on any reasonable combination of the definitions above by

giving each source (user/client) group

a pass list with any reasonable combination of

  • acceptable destination groups (good-dests …)
  • unacceptable destination groups (!bad-dests …)
  • block IP address URLs (enforce the use of domain names) (!in-addr)
  • wildcards/nothing (any|all|none)

optionally a common rewrite rule set for the source group
optionally a default replacement URL for blocked destinations for the source group

and optionally:
link the acl to a given time space

  • positively (within business-hours)
  • negatively (outside leisure-time)

defining a fallback/default ruleset

6)   have selective logging by optional log statements in the: ****)

  • source/client group declarations to log all translations for the group (log “file”)
  • destination group declarations. Typically used to log blacklist matches. (log “file”)
  • rewrite rule group declarations to log all translations for the rule set (log “file”)

and optionally anonymized to protect the individuals (log anonymous “file”)

*) Client access control based on domain name requires enabling reverse lookups (log_fqdn on) in squid.conf.
**) Client access control based on user id requires enabling RFC931/ident in squid.conf.
***) Note: Visible redirects (302:new-url) are not supported by some interim versions of Squid (presumably 1.2-2.0).
****) Note: squidGuard is smart enough to open only one filedescriptor per logfile (i.e. not necessarily one per log statement); per spawned process of course. Though logging to too many different files may exeed your system’s concurrent filedescriptor limit.

Portability

squidGuard should compile right out of the box on any modern brand of UNIX with a development environment and a recent version (4.X, older versions are supported, too) of the Berkeley DB library. squidGuard was initially developed on Sun Solaris-2.8 with gcc-2.95.3, bison-1.25, flex-2.5.4. It is now maintained and developed under Gentoo Linux with latest versions of gcc, flex and bison.

In the past users have reported success on at least, but not limited to:

  • AIX: 4.1.3, 4.3.2.0/egcs-2.91.66
  • Dec-Unix: OSF1-4.0/gcc-2.7.2.3, 3.2C/gcc-2.7.2.3
  • FreeBSD 4.x-STABLE gcc 2.95.3
  • Linux: RedHat-5.2/gcc-2.8.1 RedHat-7.x/gcc-2.8.1, Gentoo 1.12.6/gcc-3.3.6
  • Solaris: 2.6/gcc-2.7.2.3 2.6/gcc-2.95.3, 2.8/gcc-2.95.3
  • CentOS: 4.4

SOURCE: http://www.squidguard.org/about.html

Muhammad Shaukat

Content Developer at LearnAcad.com

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Pin It on Pinterest