Python utility module (34) better-profanity

Environment

  • Python 3.8
  • better_profanity 0.6.1

Foreword

This article introduces better-profanity, a sensitive-word filtering library based on the profanity package developed by Ben Friedland. It replaces the original regex-based matching with string comparison, which makes it considerably faster. It also recognizes some modified spellings, such as b*tch and p0rn. Unfortunately, the library does not currently support Chinese.

Install

Install with pip by running:

 pip install better_profanity==0.6.1

Version 0.6.1 is installed here rather than the latest 0.7.0, because the newer version has performance problems. For details, see https://github.com/snguyenthanh/better_profanity/issues/19
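Because the version matters here, it can be worth confirming which release actually ended up in the environment. The following is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8); the helper name installed_version is made up for this illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version string, or None if the package is missing.
    print(installed_version("better_profanity"))
```

If the pinned install succeeded, the printed value should be 0.6.1.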

Example

Let’s look at the simplest example:

 from better_profanity import profanity

 if __name__ == "__main__":
     profanity.load_censor_words()
     text = "what a fcuk."
     censored_text = profanity.censor(text)
     print(censored_text)

Running the code prints:

 what a ****.

load_censor_words loads the text file profanity_wordlist.txt, which contains the default sensitive words. Starting from these base words, the module uses its own algorithm (see the source file better_profanity.py) to derive common spelling variants. In practice, new variants keep appearing, so this algorithm has to be updated continually.
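To make the idea of deriving variants concrete, here is a simplified sketch, not the library's actual code. It assumes a small, hypothetical substitution table (CHAR_VARIANTS) and expands each base word into every spelling the table allows. The helper names spelling_variants and build_wordset are also invented for this illustration.

```python
from itertools import product

# Hypothetical substitution table for illustration only;
# the mapping inside better_profanity.py may differ.
CHAR_VARIANTS = {
    "a": ("a", "@", "*", "4"),
    "e": ("e", "*", "3"),
    "i": ("i", "*", "1"),
    "o": ("o", "*", "0"),
}


def spelling_variants(word):
    """Return the set of spellings of `word` reachable via CHAR_VARIANTS."""
    # Characters without an entry in the table only map to themselves.
    choices = [CHAR_VARIANTS.get(ch, (ch,)) for ch in word.lower()]
    return {"".join(combo) for combo in product(*choices)}


def build_wordset(base_words):
    """Expand every base word into its variants and collect them in one set."""
    wordset = set()
    for word in base_words:
        wordset |= spelling_variants(word)
    return wordset


if __name__ == "__main__":
    wordset = build_wordset(["porn"])
    print(sorted(wordset))    # ['p*rn', 'p0rn', 'porn']
    print("p0rn" in wordset)  # True: a plain set lookup, no regex involved
```

Because all variants are computed once up front, checking a word afterwards is a plain string comparison against the set. The cost is that any substitution missing from the table goes undetected.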

When a sensitive word is found, it is replaced with four * characters by default. The replacement character can be changed:

 from better_profanity import profanity

 if __name__ == "__main__":
     text = "You p1ec3 of sHit."
     censored_text = profanity.censor(text, '-')
     print(censored_text)

Here, - is used in place of *.
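The fixed-length masking described above can be illustrated with a small standalone sketch. This is not how better_profanity is implemented. It is a simplified model that splits on spaces, ignores surrounding punctuation, and uses a placeholder word list; the function name censor and its parameters are assumptions chosen to resemble the library call.

```python
import string


def censor(text, banned, censor_char="*"):
    """Replace each banned word in `text` with four copies of `censor_char`."""
    censored = []
    for token in text.split(" "):
        # Ignore surrounding punctuation when comparing, e.g. "heck." -> "heck".
        core = token.strip(string.punctuation)
        if core and core.lower() in banned:
            token = token.replace(core, censor_char * 4)
        censored.append(token)
    return " ".join(censored)


if __name__ == "__main__":
    banned = {"heck", "shoot"}  # placeholder word list for the demo
    print(censor("What the heck.", banned))             # What the ****.
    print(censor("Oh shoot, not again.", banned, "-"))  # Oh ----, not again.
```

The mask is always four characters long, whatever the length of the matched word, so the output does not reveal how long the original word was.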

You can also maintain your own sensitive-word file, and using it is just as simple:

 from better_profanity import profanity

 if __name__ == "__main__":
     profanity.load_censor_words_from_file('/path/to/my/project/my_wordlist.txt')
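A custom word file is easy to generate from code. The sketch below assumes the file format is plain text with one word per line; check this against the official documentation. It writes a cleaned list to a temporary location. The helper write_wordlist is hypothetical and uses only the standard library.

```python
import tempfile
from pathlib import Path


def write_wordlist(path, words):
    """Write `words` to `path`, one lowercase word per line, without duplicates."""
    cleaned = sorted({w.strip().lower() for w in words if w.strip()})
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wordlist = Path(tmp) / "my_wordlist.txt"
        write_wordlist(wordlist, ["Heck", "shoot", " heck ", ""])
        print(wordlist.read_text(encoding="utf-8"))
        # The file could then be handed to the library, for example:
        # profanity.load_censor_words_from_file(str(wordlist))
```

Normalizing and de-duplicating before writing keeps the file tidy when the words come from several sources.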

If you want to temporarily exclude certain words from filtering, use the whitelist mechanism:

 from better_profanity import profanity

 if __name__ == "__main__":
     profanity.load_censor_words_from_file('/path/to/my/project/my_wordlist.txt', whitelist_words=['merry'])
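Conceptually, a whitelist subtracts words from the loaded set before any matching happens. The following is a simplified model of that idea, not the library's code; effective_words is a hypothetical helper, and the case-insensitive comparison is an assumption.

```python
def effective_words(words, whitelist_words=None):
    """Return the lowercase words that remain active after whitelisting."""
    whitelist = {w.lower() for w in (whitelist_words or [])}
    return {w.lower() for w in words} - whitelist


if __name__ == "__main__":
    custom_words = ["happy", "jolly", "merry"]
    print(sorted(effective_words(custom_words, whitelist_words=["Merry"])))
    # ['happy', 'jolly']
```

In this model the whitelist is applied when the words are loaded, so loading the list again without it makes the excluded words active again.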

For more details, see the official documentation: https://pypi.org/project/better-profanity/

Topics in Python Practical Modules

For more useful Python modules, see:

https://xugaoxiang.com/category/python/modules/

This article is reprinted from https://xugaoxiang.com/2022/08/19/python-module-34-better-profanity/
This site is for inclusion only, and the copyright belongs to the original author.
