update README.txt with charmap description. remove README.md.
parent 265ff3dc70
commit 610c0e72eb

README.md (66 lines changed)
@@ -1,66 +0,0 @@
-ejabberd-contrib
-================
-
-This is a collaborative development area for ejabberd module developers
-and users.
-
-
-For users
----------
-
-To use an ejabberd module coming from this repository:
-
-- You need to have ejabberd installed.
-
-- If you have not already done it, run `ejabberdctl modules_update_specs`
-  to retrieve the list of available modules.
-
-- Run `ejabberdctl module_install <module>` to get the source code and to
-  compile and install the `beam` file into ejabberd's module search path.
-  This path is either `~/.ejabberd-modules` or defined by the
-  `CONTRIB_MODULES_PATH` setting in `ejabberdctl.cfg`.
-
-- Edit the configuration file provided in the `conf` directory of the
-  installed module and update it to your needs. Then apply the changes to
-  your main ejabberd configuration. In a future release, ejabberd will
-  automatically add this file to its runtime configuration without
-  changes.
-
-- Run `ejabberdctl module_uninstall <module>` to remove a module from
-  ejabberd.
-
-
-For developers
---------------
-
-The following organization has been set up for the development:
-
-- Development and compilation of modules is done by ejabberd. You need
-  ejabberd installed. Use `ejabberdctl module_check <module>` to ensure it
-  compiles correctly before committing your work. The sources of your
-  module must be located in `$CONTRIB_MODULES_PATH/sources/<module>`.
-
-- Compilation can be done manually (if you know what you are doing) so you
-  don't need ejabberd running:
-  ```
-  cd /path/of/module
-  mkdir ebin
-  /path/of/ejabberd's/erlc \
-     -o ebin \
-     -I include -I /path/of/ejabberd/lib/ejabberd-XX.YY/include \
-     -DLAGER -DNO_EXT_LIB \
-     src/*erl
-  ```
-
-- The module directory structure is usually the following:
-    * `README.txt`: Module description.
-    * `COPYING`: License for the module.
-    * `doc/`: Documentation directory.
-    * `src/`: Erlang source directory.
-    * `lib/`: Elixir source directory.
-    * `priv/msgs/`: Directory with translation files (pot, po and msg).
-    * `conf/<module>.yml`: Configuration for your module.
-    * `<module>.spec`: Yaml description file for your module.
-
-- Module developers should note in the `README.txt` file whether the
-  module has requirements or known incompatibilities with other modules.
README.txt
@@ -1,8 +1,10 @@
 The 'mod_pottymouth' ejabberd module aims to fill the void left by 'mod_shit'
 which has disappeared from the net. It allows individual whole words of a
 message to be filtered against a blacklist. It allows multiple blacklists
-sharded by language. To make use of this module the client must add the xml:lang
-attribute to the message xml.
+sharded by language. The internal bloomfilter can support arbitrary blacklist
+sizes. Using a large list (say, 87M terms) will slow down the initial server
+boot time (to about 15 minutes), but once loaded, lookups are very
+speedy.
 
 To install in ejabberd:
 
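The hunk above mentions an internal bloomfilter. As a rough illustration of why lookups can stay fast however large the blacklist is, here is a minimal Python sketch of a Bloom filter. It is not the module's Erlang implementation: the class name, bit-array size, and hash count are assumptions chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, a small chance of false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        # Sizes are illustrative assumptions, not the module's settings.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, word):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(word))


# Load a tiny example blacklist (one term per entry).
blacklist = BloomFilter()
for term in ["xyza", "foo", "bar"]:
    blacklist.add(term)
```

In this sketch a membership test such as `"xyza" in blacklist` costs a fixed number of hash computations regardless of how many terms were added, while loading is linear in the list size. That matches the trade-off the README describes: slow startup for very large lists, fast lookups afterwards.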
@@ -25,11 +27,31 @@ modules:
             en: /home/your_user/blacklist_en.txt
             cn: /home/your_user/blacklist_cn.txt
             fr: /home/your_user/blacklist_fr.txt
+        charmaps:
+            default: /etc/ejabberd/modules/mod_pottymouth/charmap_en.txt
+            en: /etc/ejabberd/modules/mod_pottymouth/charmap_en.txt
 
 For each language (en,cn,fr,...whatever) provide a full path to a blacklist file.
 The blacklist file is a plain text file with blacklisted words listed one per
 line.
 
+You can also provide an optional 'charmap' for each language. This allows you
+to specify simple substitutions that will be made on the fly so you don't need
+to include those permutations in the blacklist. This keeps the blacklist small
+and reduces server startup time. For example, if you included the word
+'xyza' in the blacklist, adding the following substitutions in the charmap
+would filter permutations such as 'XYZA', 'xYz4', or 'Xyz@' automatically.
+
+charmap format:
+
+[
+ {"X", "x"},
+ {"Y", "y"},
+ {"Z", "z"},
+ {"@", "a"},
+ {"4", "a"}
+].
+
 Gotchas:
 
 The language will be looked up by whatever value is passed in the xml:lang
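To make the charmap idea from the hunk above concrete, here is a small Python sketch of per-character substitution followed by a blacklist lookup. It assumes each charmap pair maps one typed character to its canonical form and that the word is checked after all substitutions; the function names are hypothetical and this is not the module's Erlang code.

```python
# Substitution pairs mirroring the README's charmap example:
# each pair maps a character as typed to its canonical form.
CHARMAP = {"X": "x", "Y": "y", "Z": "z", "@": "a", "4": "a"}

BLACKLIST = {"xyza"}  # the README's example word


def normalize(word, charmap):
    """Replace every mapped character; leave all other characters unchanged."""
    return "".join(charmap.get(ch, ch) for ch in word)


def is_blacklisted(word, charmap=CHARMAP, blacklist=BLACKLIST):
    """Check the normalized word against the blacklist."""
    return normalize(word, charmap) in blacklist
```

With only the five pairs listed, 'xYz4' and 'Xyz@' both normalize to 'xyza'. 'XYZA' does not in this sketch, because nothing maps "A" to "a"; it would match with an extra `{"A", "a"}` pair, or if the module folds case separately, which the README does not state.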
@@ -40,13 +62,11 @@ the 'default' entry in config will be used.
 For xml:lang attribute docs, see:
 http://wiki.xmpp.org/web/Programming_XMPP_Clients#Sending_a_message
 
-The internal bloomfilter used to ingest the blacklists currently requires about
-4,000 entries in the blacklist to ensure acceptable error probability. (We've
-gotten around this by duplicating entries in a short list)
+Blacklist helper
 
-Todo:
-
-Look into acceptable error probabilities for shorter blacklists.
+Thinking of a bunch of swear words and all the permutations can be tough. We made
+a helper script to take a bare wordlist and generate permutations given a
+dictionary of substitution characters: https://github.com/madglory/permute_wordlist
 
 Tip of the hat:
 
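The "Blacklist helper" paragraph in the hunk above describes generating permutations of a bare wordlist from a dictionary of substitution characters. The following Python sketch shows one way that could work. It is not the linked permute_wordlist script, and the substitution dictionary and function name are hypothetical.

```python
from itertools import product

# Hypothetical substitution dictionary: canonical character -> allowed variants.
SUBSTITUTIONS = {"a": ["a", "@", "4"], "x": ["x", "X"]}


def permutations_of(word, substitutions):
    """Yield every spelling of word obtainable by per-character substitution."""
    # Characters without an entry only ever appear as themselves.
    choices = [substitutions.get(ch, [ch]) for ch in word]
    for combo in product(*choices):
        yield "".join(combo)


variants = sorted(permutations_of("xa", SUBSTITUTIONS))
```

The number of variants is the product of the choices per character (2 × 3 = 6 for "xa" here), so pre-expanded lists grow quickly with word length. That growth is the cost the README's charmap option is meant to avoid.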