privoxy-blocklist.sh

Summary

This script downloads, converts and installs AdblockPlus lists into Privoxy.

Main

Tracker for bugs and feature requests:
Bugtracker: privoxy-blocklist.sh

The PKGBUILD for this script can be found here:
http://aur.archlinux.org/packages.php?ID=43861
The package can also be installed from my repository.
It installs the script to '/usr/sbin/' and a cron job to '/etc/cron.weekly/'.

Warranty

There is no warranty that this setup works on every system that provides the dependencies listed below.
I'm providing this information only because it worked for me.
If you have questions you can leave a message here, but I decide whether I'll answer and help.

Security - Connection

Security issue

If you use the described setup, files are copied over an unencrypted connection.
This makes it possible for others to sniff data about you, your systems and your infrastructure.
You should only use it within a secure network and only if you trust everybody who is using that network.

Dependencies:

  • bash
  • wget
  • sed
  • privoxy
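
A missing dependency only shows up indirectly: one of the comments below reports `wget: command not found` followed by a misleading "isn't an AdblockPlus list" message. The following is a minimal sketch of a pre-flight check for the four dependencies listed above. The helper name `missing_deps` is an assumption for illustration and is not part of privoxy-blocklist.sh.

```shell
#!/bin/bash
# Hypothetical helper (not part of privoxy-blocklist.sh): print every
# command from the argument list that cannot be found in PATH.
missing_deps() {
	local cmd
	for cmd in "$@"; do
		command -v "${cmd}" >/dev/null 2>&1 || echo "${cmd}"
	done
}

# Check the dependency list from this page and only warn; do not abort.
missing=$(missing_deps bash wget sed privoxy)
if [ -n "${missing}" ]; then
	echo "Missing dependencies:" ${missing} >&2
fi
```

Note that privoxy is often installed in a directory such as /usr/sbin that may only be in root's PATH, so the check is best run as the same user that runs the script.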

Version history:
0.2:

  • implemented ## and ###
  • added debugging option and functions
  • added quiet-option
  • added help

The required filenames changed in this version. To ensure that Privoxy keeps working properly, remove all .action and .filter files except user.*, match-all.* and default.* from your Privoxy configuration directory, and remove the corresponding entries from the config file.
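
The cleanup described for the 0.2 upgrade can be previewed before anything is deleted. This is only a sketch: the function name `list_stale_lists` is hypothetical, /etc/privoxy is the default from the script, and `-maxdepth` is a find extension that GNU and BSD find support but POSIX does not require.

```shell
#!/bin/bash
# Hypothetical helper: list the .action/.filter files in a Privoxy config
# directory that are NOT user.*, match-all.* or default.*.
# It only prints the candidates; removing them is left to the reader.
list_stale_lists() {
	local confdir="${1:-/etc/privoxy}"
	find "${confdir}" -maxdepth 1 -type f \
		\( -name '*.action' -o -name '*.filter' \) \
		! -name 'user.*' ! -name 'match-all.*' ! -name 'default.*'
}
```

Running `list_stale_lists /etc/privoxy` and reviewing the output first seems safer than deleting by wildcard; the matching `actionsfile` and `filterfile` lines in the config file still have to be removed by hand.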
0.1:

  • define a list of URLs for AdblockPlus lists
  • convert AdblockPlus lists into Privoxy format
  • install lists into config directory of Privoxy
  • modify config file of Privoxy for using new lists without restarting Privoxy

Bugs/Requests

Script

The development version of this script can be found on github.com

privoxy-blocklist.sh
#!/bin/bash
#
######################################################################
#
#                  Author: Andrwe Lord Weber
#                  Mail: lord-weber-andrwe<at>renona-studios<dot>org
#                  Version: 0.2
#                  URL: http://andrwe.dyndns.org/doku.php/blog/scripting/bash/privoxy-blocklist
#
##################
#
#                  Summary:
#                   This script downloads, converts and installs
#                   AdblockPlus lists into Privoxy
#
######################################################################
 
######################################################################
#
#                 TODO:
#                  - implement:
#                     domain-based filter
#
######################################################################
 
######################################################################
#
#                  script variables and functions
#
######################################################################
 
# array of URLs for AdblockPlus lists
URLS=("https://easylist-downloads.adblockplus.org/easylistgermany.txt" "http://adblockplus.mozdev.org/easylist/easylist.txt")
# privoxy config dir (default: /etc/privoxy/)
CONFDIR=/etc/privoxy
# directory for temporary files
TMPDIR=/tmp/privoxy-blocklist
TMPNAME=$(basename ${0})
 
######################################################################
#
#                  No changes needed after this line.
#
######################################################################
 
function usage()
{
	echo "${TMPNAME} is a script to convert AdBlockPlus-lists into Privoxy-lists and install them."
	echo " "
	echo "Options:"
	echo "      -h:    Show this help."
	echo "      -q:    Don't give any output."
	echo "      -v 1:  Enable verbosity 1. Show a little bit more output."
	echo "      -v 2:  Enable verbosity 2. Show a lot more output."
	echo "      -v 3:  Enable verbosity 3. Show all possible output and don't delete temporary files. (For debugging only!)"
	echo "      -r:    Remove all lists built by this script."
}
 
[ ${UID} -ne 0 ] && echo -e "Root privileges needed. Exit.\n\n" && usage && exit 1
 
# check whether an instance is already running
[ -e ${TMPDIR}/${TMPNAME}.lock ] && echo "An instance of ${TMPNAME} is already running. Exit." && exit 1
 
DBG=0
 
function debug()
{
	[ ${DBG} -ge ${2} ] && echo -e "${1}"
}
 
function main()
{
	cpoptions=""
	[ ${DBG} -gt 0 ] && cpoptions="-v"
 
	for url in "${URLS[@]}"
	do
		debug "Processing ${url} ...\n" 0
		file=${TMPDIR}/$(basename ${url})
		actionfile=${file%\.*}.script.action
		filterfile=${file%\.*}.script.filter
		list=$(basename ${file%\.*})
 
		# download list
		debug "Downloading ${url} ..." 0
		wget -t 3 --no-check-certificate -O ${file} ${url} >${TMPDIR}/wget-${url//\//#}.log 2>&1
		debug "$(cat ${TMPDIR}/wget-${url//\//#}.log)" 2
		debug ".. downloading done." 0
		[ "$(grep -E '^\[Adblock.*\]$' ${file})" == "" ] && echo "The list received from ${url} isn't an AdblockPlus list. Skipped" && continue
 
		# convert AdblockPlus list to Privoxy list
		# blacklist of urls
		debug "Creating actionfile for ${list} ..." 1
		echo -e "{ +block{${list}} }" > ${actionfile}
		sed '/^!.*/d;1,1 d;/^@@.*/d;/\$.*/d;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' ${file} >> ${actionfile}
		debug "... creating filterfile for ${list} ..." 1
		echo "FILTER: ${list} Tag filter of ${list}" > ${filterfile}
		# set filter for html elements
		sed '/^#/!d;s/^##//g;s/^#\(.*\)\[.*\]\[.*\]*/s|<([a-zA-Z0-9]+)\\s+.*id=.?\1.*>.*<\/\\1>||g/g;s/^#\(.*\)/s|<([a-zA-Z0-9]+)\\s+.*id=.?\1.*>.*<\/\\1>||g/g;s/^\.\(.*\)/s|<([a-zA-Z0-9]+)\\s+.*class=.?\1.*>.*<\/\\1>||g/g;s/^a\[\(.*\)\]/s|<a.*\1.*>.*<\/a>||g/g;s/^\([a-zA-Z0-9]*\)\.\(.*\)\[.*\]\[.*\]*/s|<\1.*class=.?\2.*>.*<\/\1>||g/g;s/^\([a-zA-Z0-9]*\)#\(.*\):.*[:[^:]]*[^:]*/s|<\1.*id=.?\2.*>.*<\/\1>||g/g;s/^\([a-zA-Z0-9]*\)#\(.*\)/s|<\1.*id=.?\2.*>.*<\/\1>||g/g;s/^\[\([a-zA-Z]*\).=\(.*\)\]/s|\1^=\2>||g/g;s/\^/[\/\&:\?=_]/g;s/\.\([a-zA-Z0-9]\)/\\.\1/g' ${file} >> ${filterfile}
		debug "... filterfile created - adding filterfile to actionfile ..." 1
		echo "{ +filter{${list}} }" >> ${actionfile}
		echo "*" >> ${actionfile}
		debug "... filterfile added ..." 1
		debug "... creating and adding whitelist for urls ..." 1
		# whitelist of urls
		echo "{ -block }" >> ${actionfile}
		sed '/^@@.*/!d;s/^@@//g;/\$.*/d;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' ${file} >> ${actionfile}
		debug "... created and added whitelist - creating and adding image handler ..." 1
		# whitelist of image urls
		echo "{ -block +handle-as-image }" >> ${actionfile}
		sed '/^@@.*/!d;s/^@@//g;/\$.*image.*/!d;s/\$.*image.*//g;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' ${file} >> ${actionfile}
		debug "... created and added image handler ..." 1
		debug "... created actionfile for ${list}." 1
 
		# install Privoxy actionsfile
		cp ${cpoptions} ${actionfile} ${CONFDIR}
		if [ "$(grep $(basename ${actionfile}) ${CONFDIR}/config)" == "" ] 
		then
			debug "\nModifying ${CONFDIR}/config ..." 0
			sed "s/^actionsfile user\.action/actionsfile $(basename ${actionfile})\nactionsfile user.action/" ${CONFDIR}/config > ${TMPDIR}/config
			debug "... modification done.\n" 0
			debug "Installing new config ..." 0
			cp ${cpoptions} ${TMPDIR}/config ${CONFDIR}
			debug "... installation done\n" 0
		fi	
		# install Privoxy filterfile
		cp ${cpoptions} ${filterfile} ${CONFDIR}
		if [ "$(grep $(basename ${filterfile}) ${CONFDIR}/config)" == "" ] 
		then
			debug "\nModifying ${CONFDIR}/config ..." 0
			sed "s/^\(#*\)filterfile user\.filter/filterfile $(basename ${filterfile})\n\1filterfile user.filter/" ${CONFDIR}/config > ${TMPDIR}/config
			debug "... modification done.\n" 0
			debug "Installing new config ..." 0
			cp ${cpoptions} ${TMPDIR}/config ${CONFDIR}
			debug "... installation done\n" 0
		fi	
 
		debug "... ${url} installed successfully.\n" 0
	done
}
 
# create temporary directory and lock file
mkdir -p ${TMPDIR}
touch ${TMPDIR}/${TMPNAME}.lock
 
# set command to be run on exit
# (single quotes: DBG is evaluated when the trap fires, i.e. after the options were parsed)
trap '[ "${DBG}" -le 2 ] && rm -fr "${TMPDIR}"; exit' INT TERM EXIT
 
# loop for options
while getopts ":hrqv:" opt
do
	case "${opt}" in 
		"h")
			usage
			exit 0
			;;
		"v")
			DBG="${OPTARG}"
			;;
		"q")
			DBG=-1
			;;
		"r")
			echo "Do you really want to remove all built lists? (y/N)"
			read choice
			[ "${choice}" != "y" ] && exit 0
			rm -rf ${CONFDIR}/*.script.{action,filter} && \
			sed '/^actionsfile .*\.script\.action$/d;/^filterfile .*\.script\.filter$/d' -i ${CONFDIR}/config && \
			echo "Lists removed." && exit 0
			echo -e "An error occurred while removing the lists.\nPlease have a look into ${CONFDIR} whether there are .script.* files and search for *.script.* in ${CONFDIR}/config."
			exit 1
			;;
		":")
			echo "${TMPNAME}: -${OPTARG} requires an argument" >&2
			exit 1
			;;
	esac
done
 
debug "URL-List: ${URLS[*]}\nPrivoxy-Configdir: ${CONFDIR}\nTemporary directory: ${TMPDIR}" 2
main
 
# restore default exit command
trap - INT TERM EXIT
[ ${DBG} -lt 2 ] && rm -r ${TMPDIR}
[ ${DBG} -eq 2 ] && rm -vr ${TMPDIR}
exit 0
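
To see what the blacklist conversion in main() does, the sed expression can be run on a small sample. The sketch below copies that expression unchanged into a wrapper function; the name `convert_blocklist` and the sample rules are assumptions made up for illustration and were not taken from a real list.

```shell
#!/bin/bash
# Illustration only: the blacklist sed expression copied from main(),
# wrapped in a hypothetical function so it can be run on a sample file.
convert_blocklist() {
	sed '/^!.*/d;1,1 d;/^@@.*/d;/\$.*/d;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' "$1"
}

# Made-up sample: header, comment, plain rule, whitelist rule,
# wildcard rule, element-hiding rule and a rule with options.
sample=$(mktemp)
cat > "${sample}" <<'EOF'
[Adblock Plus 1.1]
! a comment line
||ads.example.com^
@@||good.example.com^
/banner/*
example.com##.ad
||tracker.example.net^$third-party
EOF

convert_blocklist "${sample}"
rm -f "${sample}"
```

With GNU sed, only the `||ads.example.com^` rule and the `/banner/*` rule should survive, as `.ads\.example\.com[/&:?=_]` and `/banner/.*`. The header, the comment, the whitelist rule, the element-hiding rule and every rule containing `$` options are dropped by this expression, which matches the TODO entry about domain-based filters.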

Comments

Hello,

I don't think this script works anymore.

1 |
Sebastian
| 2010/11/23 13:12 | reply

@Sebastian:
Could you please explain the problems you have here:
Bugtracker: privoxy-blocklist.sh

2 |
Andrwe Lord Weber
| 2010/11/23 19:39 | reply

Feature request, pls:

Support /usr/local for install prefix.

Support customizable names for action and filter file, i.e.

adblock.filter adblock.action

3 |
cb
| 2010/12/01 10:45 | reply

@cb:
I don't understand what you mean by supporting /usr/local as an install prefix.
You can just copy the script to any place you want and start it.
The only path set in the script through a variable is the config directory of Privoxy (/etc/privoxy).

4 |
Andrwe Lord Weber
| 2010/12/01 20:25 | reply

Hi Andrwe,

Awesome script! I can finally remove loads of ads while surfing with the iPad.

Is the script still maintained currently? Been a while since the last update. I'm really looking forward to domain based filters, meaning the loads of ||example.com^$third-party filters currently in easylist.

6 |
Casper
| 2011/06/23 13:23 | reply

@Casper: Hi Casper,
Yes, the script is still maintained.
For the last few months, and unfortunately for the next few, I've been very busy with my work.
But I'm definitely trying to update the script this year.

So please be a little more patient.

7 |
Andrwe Lord Weber
| 2011/08/20 13:23 | reply

Andrwe: Thanks for the great script. It's perfect for blocking ads on mobile devices.

I use a third party list to disable social media buttons. Is there any way to modify the script to allow import of a list from a source such as: https://monzta.maltekraus.de/adblock_social.txt

8 |
Wes
| 2011/08/24 02:33 | reply

G'day All

There seems to be an issue with the size of your filters: a single filter can't be more than about 4000-4990 characters. Anything beyond that Privoxy will ignore, so you might want to break up the extra-large filter into smaller, more manageable filters.

P.S. I have well over 900 ad-tracker filters, each holding up to 4500 characters, which runs extremely well, and the blocking rates are high. If you want a sample of a filter, let me know!

Cheers

9 |
Andrew
| 2011/09/20 16:27 | reply

@Wes: hi,

you just have to add this source to the URLS-array like this:

URLS=("https://easylist-downloads.adblockplus.org/easylistgermany.txt" "http://adblockplus.mozdev.org/easylist/easylist.txt" "https://monzta.maltekraus.de/adblock_social.txt")
10 |
Andrwe Lord Weber
| 2011/10/23 17:46 | reply

Regarding bug 9: On Ubuntu 11.04, bash version 4.2.8(1), lines 210, 211 and 212 in the current git version are missing && after the ... part. The error is:

privoxy-blocklist.sh: line 210: syntax error near unexpected token `echo'
privoxy-blocklist.sh: line 210: `-z "${PRIVOXY_CONF}" echo "\$PRIVOXY_CONF isn't set please either provice a valid initscript config or set it in ${SCRIPTCONF} ." >&2 && exit 1'

After changing -z "${PRIVOXY_CONF}" echo … to -z "${PRIVOXY_CONF}" && echo … the script works. It is probably not a bad idea to use a more conservative syntax if possible.

Thank You for a great script!

11 |
Mirko
| 2011/11/25 12:26 | reply

The last comment is missing square brackets, but I'm certain You understand the problem.

12 |
Mirko
| 2011/11/25 12:30 | reply

@Mirko

Thanks for reporting. Is fixed.

13 |
Andrwe Lord Weber
| 2011/11/29 18:11 | reply

Hi. I use Privoxy on a router with OpenWrt. When I execute your script I get this error message.

root@OpenWrt /opt# bash blocklist.sh
Processing https://easylist-downloads.adblockplus.org/easylistgermany.txt

Downloading https://easylist-downloads.adblockplus.org/easylistgermany.txt ...
.. downloading done.
grep: /opt/privoxy-blocklist/easylistgermany.txt: No such file or directory
The list recieved from https://easylist-downloads.adblockplus.org/easylistgermany.txt isn't an AdblockPlus list. Skipped
Processing http://adblockplus.mozdev.org/easylist/easylist.txt

Downloading http://adblockplus.mozdev.org/easylist/easylist.txt ...
Segmentation fault
.. downloading done.
grep: /opt/privoxy-blocklist/easylist.txt: No such file or directory
The list recieved from http://adblockplus.mozdev.org/easylist/easylist.txt isn't an AdblockPlus list. Skipped
root@OpenWrt /opt#

What could this be?

14 |
Thomas
| 2012/02/06 19:30 | reply

@14:

Hi,

is there enough space on /opt/ for the lists?

Can you please provide the output of bash blocklist.sh -v 2 ?

It would be best if you provided the output using something like pastebin.org because it can be a lot of text.

Andrwe Lord Weber

15 |
Andrwe Lord Weber
| 2012/02/06 20:53 | reply

Hi,

I'm glad I found this script, but I got the error below. It says that the lists downloaded from easylist-downloads.adblockplus.org are not AdblockPlus lists and that there are no such files or directories.

Also, I would like to use Privoxy with blacklists of porn sites for my home network, such as the ones used with squidguard (http://www.squidguard.org/blacklists.html). Can they simply be added to the URLS array in the script, as mentioned to Wes (message 8)? The script will probably not unzip files, is that right?

Thank you

Generated Error Message *

Processing https://easylist-downloads.adblockplus.org/easylistgermany.txt

Downloading https://easylist-downloads.adblockplus.org/easylistgermany.txt ...
.. downloading done.
grep: /tmp/privoxy-blocklist/easylistgermany.txt: No such file or directory
The list recieved from https://easylist-downloads.adblockplus.org/easylistgermany.txt isn't an AdblockPlus list. Skipped
Processing http://adblockplus.mozdev.org/easylist/easylist.txt

Downloading http://adblockplus.mozdev.org/easylist/easylist.txt ...
.. downloading done.
grep: /tmp/privoxy-blocklist/easylist.txt: No such file or directory
The list recieved from http://adblockplus.mozdev.org/easylist/easylist.txt isn't an AdblockPlus list. Skipped

16 |
Lorenzo
| 2012/04/02 22:35 | reply

Following my previous message, here is the privoxy-blocklist.sh -v2 output.

-v2 output *

URL-List: https://easylist-downloads.adblockplus.org/easylistgermany.txt
Privoxy-Configdir: /usr/local/etc/privoxy
Temporary directory: /tmp/privoxy-blocklist
Processing https://easylist-downloads.adblockplus.org/easylistgermany.txt

Downloading https://easylist-downloads.adblockplus.org/easylistgermany.txt ...
/Users/lorenzo/Desktop/privoxy-blocklist.sh: line 86: wget: command not found
.. downloading done.
grep: /tmp/privoxy-blocklist/easylistgermany.txt: No such file or directory
The list recieved from https://easylist-downloads.adblockplus.org/easylistgermany.txt isn't an AdblockPlus list. Skipped
Processing http://adblockplus.mozdev.org/easylist/easylist.txt

Downloading http://adblockplus.mozdev.org/easylist/easylist.txt ...
/Users/lorenzo/Desktop/privoxy-blocklist.sh: line 86: wget: command not found
.. downloading done.
grep: /tmp/privoxy-blocklist/easylist.txt: No such file or directory
The list recieved from http://adblockplus.mozdev.org/easylist/easylist.txt isn't an AdblockPlus list. Skipped
/tmp/privoxy-blocklist/privoxy-blocklist.sh.lock
/tmp/privoxy-blocklist/wget-http:##adblockplus.mozdev.org#easylist#easylist.txt.log
/tmp/privoxy-blocklist/wget-https:##easylist-downloads.adblockplus.org#easylistgermany.txt.log
/tmp/privoxy-blocklist
mbp-i5:~ lorenzo$

17 |
Lorenzo
| 2012/04/02 22:39 | reply

So sorry, I just noticed that I was missing the wget dependency. Installation ran successfully, but now my browser (well, Privoxy) is refusing any connection.

18 |
Lorenzo
| 2012/04/02 23:24 | reply

After running the script, the new .action line in the config file is not written properly. It ends up as "actionsfile easylist.script.actionnactionsfile user.action" and "filterfile easylist.script.filternfilterfile user.filter".

Privoxy doesn't accept connections anymore until it is reinstalled.

config
actionsfile match-all.action # Actions that are applied to all sites and maybe overruled later on.
actionsfile default.action # Main actions file
actionsfile easylist.script.actionnactionsfile user.action # User customizations

filterfile default.filter
filterfile easylist.script.filternfilterfile user.filter # User customizations

Sorry for multiple post.

19 |
Lorenzo
| 2012/04/03 00:32 | reply
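
The `actionnactionsfile` result reported above is what one would expect from a sed that does not turn `\n` in the replacement text into a newline. GNU sed does, but POSIX leaves that escape unspecified, so other implementations may emit a literal `n`. A minimal sketch of a variant that avoids the escape by using awk follows; the function name `add_actionsfile` is hypothetical and this is not how the published script does it.

```shell
#!/bin/bash
# Hypothetical helper: print CONFIG with "actionsfile NEWFILE" inserted
# in front of the "actionsfile user.action" line. awk prints the new
# line itself, so no newline escape inside a sed replacement is needed.
add_actionsfile() {
	local config="$1" newfile="$2"
	awk -v line="actionsfile ${newfile}" '
		/^actionsfile user\.action/ { print line }
		{ print }
	' "${config}"
}
```

Used like `add_actionsfile /etc/privoxy/config easylist.script.action > /tmp/config.new`, the result can be inspected before it replaces the real config. The same pattern would apply to the `filterfile` lines.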

Well this made me angry. Thanks. What Lorenzo describes above is correct.

What Lorenzo doesn't perhaps make crystal clear is that because the script is seemingly crappily coded to not work („actionsfile easylist.script.actionnactionsfile“ etc) and it is not clear how to change „config“ TO make it work that you should *steer clear of this stupid thing altogether* or face having to reinstall privoxy all over again because even deleting the silly error changes (i.e. „easylist.script.filternfilterfile“) that this wrong script adds doesn't seem to fix what ever it broke

20 |
Lebowski
| 2012/06/06 04:25 | reply

@Lebowski

The script does work; I could fix the problem by upgrading the dependencies.

I now have bash 4.2.24(2), wget 1.13.4, sed 4.2.1 and the script works correctly.

21 |
Lorenzo
| 2012/06/06 06:38 | reply

./privoxy-blocklist.sh: 33: Syntax error: "(" unexpected
Hmm, weird… This looks correct to me.

Is this still an issue?

Try "bash privoxy-blocklist.sh" instead of "sh privoxy-blocklist.sh".

22 |
Syntax Error
| 2012/10/19 22:50 | reply
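
The `Syntax error: "(" unexpected` message is consistent with the script being interpreted by a shell without array support, since `URLS=(...)` is bash syntax. The following is a sketch of a guard that could sit at the very top of a script, under the assumption that the invoking shell executes the file command by command, so the guard runs before the array line is parsed. It is not part of the published script.

```shell
#!/bin/bash
# Hypothetical guard, not part of the published script: if the
# interpreter is not bash, re-execute the script with bash.
if [ -z "${BASH_VERSION:-}" ]; then
	exec bash "$0" "$@"
fi

# From here on bash-only syntax such as arrays is safe to use.
URLS=("https://easylist-downloads.adblockplus.org/easylistgermany.txt")
echo "Running under bash ${BASH_VERSION} with ${#URLS[@]} list(s)"
```

This still requires bash to be installed and found in PATH; it only removes the need to remember to call the script with `bash` explicitly.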

Could you read more about actions files in the manual at http://www.privoxy.org/user-manual/actions-file.html and then modify your script to deal with the domain and path correctly?

23 |
westmin
| 2012/11/29 02:18 | reply

About that "./privoxy-blocklist.sh: 33: Syntax error: "(" unexpected Hmm, weird… This looks correct to me."

I've encountered that with the ksh of my OpenBSD setup. There are two things to take into account:

1. Change the path to bash from /bin/bash to wherever bash is located (/usr/local/bin/bash in my case).

2. Run the script using bash or the full path to the bash executable.

Boom it works. Even on OpenBSD, which sure isn't Arch Linux :D

24 |
Roman Geber
| 2012/12/27 06:14 | reply

Hello,

I have the same problem with "./privoxy-blocklist.sh: 33: Syntax error: "(". My server runs Debian Squeeze. Do you have any idea what causes this problem? Cordially, Guillaume

25 |
Guillaume
| 2013/01/10 16:00 | reply

Seems to work for me, no errors as far as I can tell. However, after moving from AdBlock+ to Privoxy, the pesky Facebook sponsored sidebar ads reappeared, even with the AdBlock+ patterns implemented in Privoxy. Anyone know why?

26 |
wrox
| 2013/06/14 08:47 | reply

@wrox: Facebook uses SSL-encrypted connections. That's the reason why Privoxy is not able to remove ads there (Privoxy is unable by design to decrypt SSL connections). AdBlock can do that because it's a browser plugin which has access to the unencrypted content.

27 |
amnesius
| 2013/07/06 07:23 | reply

I got this to work just fine on the latest OpenWRT trunk. You have to install bash, wget, and sed. Don't use the wget and sed that are built into BusyBox; they will fail.

28 |
Matthew M. Dean
| 2013/07/22 12:52 | reply

Hi Andrwe,

Thank you for your script. It worked great for me. But I have a question (or feature request :-) ): Since Adblock+ has an export function for all filter rules, including the ones I have "collected" over the past years, is there a way to convert and add them to Privoxy with a modified version of your script?

29 |
Ray
| 2013/09/17 06:41 | reply

Squidblacklist.org is the worlds leading publisher of native acl blacklists tailored specifically for Squid proxy, and alternative formats for all major third party plugins as well as many other filtering platforms. Including SquidGuard, DansGuardian, and ufDBGuard, as well as pfSense and more. Our adult blacklist contains over 1.2 million domains, we have unique blacklists that you will not find any other place.

There is room for better blacklists, we intend to fill that gap.

It would be our pleasure to serve you.

Signed,

Benjamin E. Nichols http://www.squidblacklist.org

30 |
Benjamin E. Nichols
| 2014/11/29 04:00 | reply


scripting/bash/privoxy-blocklist.txt · Last modified: 2012/08/30 12:09 by Andrwe Lord Weber
