update README. Remove list size TODO. Add link to permute_wordlist.
parent 3a3a3f1db1
commit 3a03dde3c0
@@ -3,8 +3,10 @@
 The 'mod_pottymouth' ejabberd module aims to fill the void left by 'mod_shit'
 which has disappeared from the net. It allows individual whole words of a
 message to be filtered against a blacklist. It allows multiple blacklists
-sharded by language. To make use of this module the client must add the xml:lang
-attribute to the message xml.
+sharded by language. The internal bloomfilter can support arbitrary blacklist
+sizes. Using a large list (say, 87M terms) will slow down the initial server
+boot time (to about 15 minutes respectively), but once loaded lookups are very
+speedy.
 
 #### Installation
 
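Review note on the hunk above: the new README text says the Bloom filter handles arbitrary blacklist sizes, loads slowly for very large lists, and answers lookups quickly. The Python sketch below illustrates why that trade-off is typical of Bloom filters in general. It is not the module's implementation (mod_pottymouth is an ejabberd module); the class `BloomFilter`, the function `filter_message`, the `****` mask, and the 0.001 error rate are all assumptions made for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter sized from the expected item count and error rate."""

    def __init__(self, expected_items: int, error_rate: float = 0.001):
        n = max(1, expected_items)
        # Standard sizing formulas: m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.
        self.num_bits = max(8, math.ceil(-n * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, word: str):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, word: str) -> None:
        # Loading costs k bit writes per term, so it grows linearly with list size.
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word: str) -> bool:
        # A lookup is k bit reads regardless of how many terms were loaded.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(word))


def filter_message(message: str, blacklist: BloomFilter, mask: str = "****") -> str:
    """Mask each whole word that the filter reports as blacklisted (assumed behavior)."""
    return " ".join(mask if word.lower() in blacklist else word for word in message.split())


# Hypothetical three-term blacklist, for illustration only.
blacklist = BloomFilter(expected_items=3)
for term in ("darn", "heck", "drat"):
    blacklist.add(term)

print(filter_message("well darn that", blacklist))
```

Blacklisted terms are always caught, because a Bloom filter has no false negatives. A clean word can occasionally be masked by a false positive, at roughly the configured error rate, which is why the filter is sized from the list length rather than fixed in advance.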
@@ -47,13 +49,12 @@ the 'default' entry in config will be used.
 For xml:lang attribute docs, see:
 [http://wiki.xmpp.org/web/Programming_XMPP_Clients#Sending_a_message](http://wiki.xmpp.org/web/Programming_XMPP_Clients#Sending_a_message)
 
-The internal bloomfilter used to ingest the blacklists currently requires about
-4,000 entries in the blacklist to ensure acceptable error probability. (We've
-gotten around this by duplicating entries in a short list)
+#### Blacklist helper
 
-#### Todo
-
-Look into acceptable error probabilities for shorter blacklists.
+Thinking of a bunch of swear words and all the permutations can be tough. We made
+a helper script to take a bare wordlist and generate permutations given a
+dictionary of substitution characters:
+[https://github.com/madglory/permute_wordlist](https://github.com/madglory/permute_wordlist)
 
 #### Tip of the hat
 
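Review note on the "Blacklist helper" paragraph added above: the idea of expanding a bare wordlist through a dictionary of substitution characters can be sketched in a few lines of Python. This is only an illustration of the general technique and makes no claim about how permute_wordlist itself works. The function `permute_word`, the `SUBSTITUTIONS` table, and the sample words are hypothetical.

```python
from itertools import product


def permute_word(word: str, substitutions: dict[str, list[str]]) -> list[str]:
    """Return every spelling of word produced by the substitution table."""
    # For each character, the choices are the character itself plus its substitutes.
    choices = [[ch, *substitutions.get(ch, [])] for ch in word]
    # The Cartesian product of the per-character choices enumerates all variants.
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical substitution table and wordlist, for illustration only.
SUBSTITUTIONS = {"a": ["@", "4"], "e": ["3"]}
wordlist = ["heck", "darn"]

for word in wordlist:
    for variant in permute_word(word, SUBSTITUTIONS):
        print(variant)
```

With this table, "heck" expands to `heck` and `h3ck`, and "darn" expands to `darn`, `d@rn`, and `d4rn`. The number of variants multiplies with every substitutable character, so a modest wordlist can grow into a very large blacklist.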