Environment
- Python 3.8
- better_profanity 0.6.1
Foreword
This article introduces better-profanity, a sensitive word (profanity) filtering tool based on the profanity library developed by Ben Friedland. Compared with the original, it replaces the regex-based matching with plain string comparison, which makes it considerably faster. It also recognizes some modified spellings, such as b*tch and p0rn. Unfortunately, the library does not currently support Chinese.
Install
Install it with pip by running the following command:
pip install better_profanity==0.6.1
Version 0.6.1 is installed here rather than 0.7.0, the latest at the time of writing, because the newer version has performance problems. For details, see https://github.com/snguyenthanh/better_profanity/issues/19
Example
Let’s look at the simplest example
from better_profanity import profanity

if __name__ == "__main__":
    # Load the default word list bundled with the library
    profanity.load_censor_words()

    text = "what a fcuk."
    censored_text = profanity.censor(text)
    print(censored_text)
Running the code prints:
what a ****.
load_censor_words reads the text file profanity_wordlist.txt, which contains the default list of sensitive words. Starting from these base words, the module uses its own algorithm (see the source file better_profanity.py) to derive common variant spellings. In real-world use, this algorithm therefore also needs to be updated continually.
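The idea of deriving variant spellings from a base word can be sketched in plain Python. This is a simplified illustration, not the library's actual code: the substitution table CHAR_VARIANTS and the function generate_variants are hypothetical names, and the real mapping in better_profanity.py may differ.

```python
from itertools import product

# Hypothetical substitution table for illustration only;
# the library's real character mapping may be larger or different.
CHAR_VARIANTS = {
    "i": ("i", "1", "*"),
    "o": ("o", "0", "*"),
}


def generate_variants(word):
    """Return every spelling obtained by swapping mapped characters."""
    # Each position contributes either its substitutes or just itself.
    options = [CHAR_VARIANTS.get(ch, (ch,)) for ch in word]
    return {"".join(combo) for combo in product(*options)}


variants = generate_variants("porn")
print(sorted(variants))  # ['p*rn', 'p0rn', 'porn']
```

Because every variant is generated up front, checking a word later only requires a set lookup rather than a regular expression match.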
When a sensitive word is found, it is replaced with four * characters by default. The replacement character can be changed:
from better_profanity import profanity

if __name__ == "__main__":
    text = "You p1ec3 of sHit."
    # The second argument is the character used for censoring
    censored_text = profanity.censor(text, '-')
    print(censored_text)
In this case, - is used instead of *, so the output is You ---- of ----.
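To make the replacement behaviour concrete, here is a minimal pure-Python sketch of word-level censoring with a configurable character. It is an assumption-laden illustration rather than the library's implementation: censor_sketch and the banned set are hypothetical, and the sketch only does exact, case-insensitive lookups on single words.

```python
def censor_sketch(text, banned, censor_char="*"):
    """Replace each banned word in `text` with four censor characters."""
    result = []
    current = []  # characters of the word currently being scanned

    def flush():
        # Emit the collected word, censored if it is in the banned set.
        word = "".join(current)
        if word.lower() in banned:
            result.append(censor_char * 4)
        else:
            result.append(word)
        current.clear()

    for ch in text:
        if ch.isalnum():
            current.append(ch)
        else:
            flush()
            result.append(ch)  # keep punctuation and spaces unchanged
    flush()
    return "".join(result)


# Hypothetical banned set; variants are listed explicitly here.
banned = {"p1ec3", "shit"}
print(censor_sketch("You p1ec3 of sHit.", banned, "-"))  # You ---- of ----.
```

Note that the replacement always has a fixed length of four characters, regardless of how long the matched word is.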
If you maintain your own sensitive word file, that is supported too, and it is just as simple to use:
from better_profanity import profanity

if __name__ == "__main__":
    # Replace the default word list with a custom one
    profanity.load_censor_words_from_file('/path/to/my/project/my_wordlist.txt')
To temporarily exclude certain words from the sensitive word list, use the whitelist mechanism:
from better_profanity import profanity

if __name__ == "__main__":
    # Words in whitelist_words are never treated as sensitive
    profanity.load_censor_words_from_file(
        '/path/to/my/project/my_wordlist.txt',
        whitelist_words=['merry'],
    )
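As a rough picture of what a custom word file and a whitelist amount to, the sketch below reads a file with one word per line and drops the whitelisted entries. It is written in plain Python under assumptions: load_words_sketch is a hypothetical helper, the sample words are made up, and the library's own loader may normalize words differently.

```python
import os
import tempfile


def load_words_sketch(path, whitelist_words=()):
    """Read one word per line from `path`, skipping whitelisted words."""
    skip = {w.lower() for w in whitelist_words}
    words = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            word = line.strip().lower()
            if word and word not in skip:  # ignore blank lines
                words.add(word)
    return words


# Write a throwaway word list so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("darn\nmerry\n\nheck\n")
    demo_path = tmp.name

try:
    loaded = load_words_sketch(demo_path, whitelist_words=["merry"])
finally:
    os.remove(demo_path)

print(sorted(loaded))  # ['darn', 'heck']
```

Filtering at load time means the whitelist costs nothing during censoring, since whitelisted words never enter the lookup set.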
For more details, see the official documentation: https://pypi.org/project/better-profanity/
Python Practical Modules series
For more useful Python modules, see https://xugaoxiang.com/category/python/modules/
This article is reprinted from https://xugaoxiang.com/2022/08/19/python-module-34-better-profanity/