Kingname | 谢乾坤

One skill a day: The latest version of Scrapy is not compatible with scrapy_redis

Original link: https://www.kingname.info/2023/08/20/scrapy-redis-no-crawler/ Many readers build distributed crawlers with Scrapy + scrapy_redis. However, scrapy_redis has been updated less and less often in recent years and is showing its age. As Scrapy keeps releasing new versions, scrapy_redis can no longer keep up. When you install Scrapy, if you do not pin a specific version, …

One skill a day: skip the conventional route and get the list page in 1 second

Original link: https://www.kingname.info/2023/07/19/crawl-by-sitemap/ I recently needed to scrape all the documents on Docusaurus. Capturing the body text of a document is simple: with GNE Advanced Edition, a URL is all you need to extract it directly. But the question …

One skill a day: the wrong approach doubles your code. How do you retry correctly with Requests?

Original link: https://www.kingname.info/2023/06/11/retry-in-requests/ Programmers have to keep learning. If the code you write now is no different from the code you wrote 5 years ago, you have fallen behind. In Python development we often use third-party libraries, and these libraries keep adding new features …

One skill a day: Prompt reverse engineering, cracking the copy generator of Xiaohongshu

Original link: https://www.kingname.info/2023/05/17/prompt-reverse-engineer/ Many readers who follow my official account can write crawlers. But to write crawlers well, you must master some reverse engineering, both of JavaScript on web pages and of Android apps, so that you can break through signatures or bypass anti-crawler restrictions. In the past six months, large language models have …

One skill a day: the execution order of Python decorators

Original link: https://www.kingname.info/2023/04/16/order-of-decorator/ When it comes to the execution order of Python decorators, many people with only a partial understanding will say: the decorators closer to the function name are executed first, and the decorators farther from the function name are executed later. This statement is not accurate. But most of these people will still be …
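
To make the claim above concrete, here is a minimal sketch (not taken from the original article) that records when each decorator is applied and when each wrapper runs. The names `tag`, `calls`, and `greet` are hypothetical and exist only for illustration.

```python
import functools

calls = []  # records events in the order they happen


def tag(name):
    """Hypothetical decorator factory that logs application and call order."""
    def decorator(func):
        # Runs once, at definition time, when the decorator is applied.
        calls.append(f"apply {name}")

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Runs on every call to the decorated function.
            calls.append(f"enter {name}")
            result = func(*args, **kwargs)
            calls.append(f"exit {name}")
            return result

        return wrapper
    return decorator


@tag("outer")
@tag("inner")
def greet():
    calls.append("greet")


greet()
print(calls)
```

In this sketch the decorator nearest the function is applied first, but at call time the outermost wrapper is entered first. "Executed first" therefore depends on whether you mean the decoration step or the wrapper code, which is one way the common statement can be inaccurate.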

Whisper softly, Whisper, the voice-to-text model hidden in the light

Original link: https://www.kingname.info/2023/04/15/whisper/ On the day ChatGPT’s model gpt-3.5-turbo was released, OpenAI also open sourced a speech-to-text model: Whisper. But because ChatGPT itself was so dazzling, many people overlooked Whisper. So did I at the time. I once thought that Whisper was also an API, which needs to send …

Does it make sense to measure the bug rate per thousand lines of code?

Original link: https://www.kingname.info/2022/07/13/bug-rate/ My conclusion: tracking bug rates makes sense, but measuring the bug rate per thousand lines of code is meaningless. Why is the bug rate per thousand lines of code meaningless? A company recently came up with a scheme to quantify the job performance of programmers. It is called the "bug rate per thousand lines of code". In a …

One skill a day: Convert the time described in natural language into a standard format

Original link: https://www.kingname.info/2022/07/13/nlp-datetime/ If you’ve used TickTick or Todoist, you know that they have a very useful feature: they automatically recognize the time mentioned in a task, such as: Email the boss next Tuesday at 3pm It is automatically recognized as: Today, in the official account fan group, a member named NowAnti …

One skill a day: bisect_left, is binary search also useful in distributed systems?

Original link: https://www.kingname.info/2022/06/22/bisect-left/ I believe everyone knows binary search. In a sorted list, binary search can quickly determine whether a target is in the list, with O(logN) time complexity. The code for binary search is very simple; a recursive version takes only a few lines: …
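
The excerpt's code is cut off, so the following is only an illustrative sketch of a recursive binary search, not the article's own listing. The function name `binary_search` and its parameters are assumptions; `bisect_left` is the standard library function from Python's `bisect` module.

```python
from bisect import bisect_left


def binary_search(items, target, lo=0, hi=None):
    """Return True if target is in the sorted list items (hypothetical helper)."""
    if hi is None:
        hi = len(items)
    if lo >= hi:
        # Empty search range: the target is not present.
        return False
    mid = (lo + hi) // 2
    if items[mid] == target:
        return True
    if items[mid] < target:
        # Target can only be in the right half.
        return binary_search(items, target, mid + 1, hi)
    # Target can only be in the left half.
    return binary_search(items, target, lo, mid)


data = [1, 3, 3, 5, 8]
print(binary_search(data, 5))   # membership test
print(bisect_left(data, 3))     # leftmost index where 3 could be inserted
```

Unlike a plain membership test, `bisect_left` returns the leftmost insertion point, so it also gives a useful answer when the target is absent.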

One Skill a Day: Translating Text Strings in HTML Using Python

Original link: https://www.kingname.info/2022/06/20/translate-html/ I believe everyone has used the browser’s page-translation feature, for example on the English web page shown below: After one-click translation into Chinese, it looks like this: You may think this feature is very simple. Isn’t it just string replacement? Then try to translate the English below the <p> …

One skill a day: Bug analysis, how soft deletion lets an article publish successfully but fail to open

Original link: https://www.kingname.info/2022/06/20/fake-delete/ The company has an internal blog, where you can create your own account and write articles to share across the company. Yesterday this internal blog opened up its API, so I set out to write a Python program to upload all the articles from my official account. Then I found out that there …
