Gossip site’s Google indexing issues

Original link: https://blog.rxliuli.com/p/cb752f65aeee4661bf4f31fa7d2a2729/

Background

In February this year I created a GitHub project for the translation of the fan fiction Magical Girl Madoka-Flying to the Stars, and I have maintained it ever since. Initially, the project just packaged the existing translations into epubs for reading on mobile phones and e-readers, and served as an archive in case the domestic websites hosting them were deleted, delisted, or no longer maintained. Later, I also used tools to generate a website from the markdown sources.

Introduction to Magical Girl Madoka-Flying to the Stars
After centuries of turbulence, a utopian AI-human government rules the planet, heralding the dawn of a post-scarcity society and a new era of space colonization. An unexpected encounter with a hostile, more technologically advanced alien race shatters the peace, forcing the magical girls to step out of the background and save human civilization. In the midst of all this, Ryoko Shizuki, an ordinary girl, looks up at the stars, curious about her place in the universe.

“Kyubey promised that humanity would one day reach that distant starry sky. But it wisely did not say what humanity would encounter there.” – Introduction

Finding the problem

Previously, the site was built with vuepress because it looked friendly and simple, and I had used it elsewhere before. Recently, though, I tried to add a search function to the website. While researching options I learned about Google Custom Search Engine, but in actual testing it returned no useful results. Further investigation revealed that Google was not indexing the site properly, so there was nothing for it to search.

Indexing

I also looked at the output generated by vuepress. It generates index.html and 404.html under dist/. After deploying to GitHub Pages, a request for a path that does not exist as a file is answered with 404.html, which then renders the actual page content through vue-router (the site is essentially an SPA). As a result, Google Search reported the pages as Not found (404), a problem that vuepress seems to ignore.

├── 404.html
├── assets
├── CNAME
├── index.html
├── local-search.json
├── logo.png
├── logoDark.png
└── sitemap.xml
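
The situation described above can be checked locally before deploying. The following is a minimal sketch, not the author's tooling: it assumes the static host serves a path such as /a/b from one of a/b, a/b.html, or a/b/index.html inside the build directory, and reports the routes for which no such file exists. Those are the routes that would be answered by 404.html. The function names and the example routes are hypothetical.

```python
from pathlib import Path


def resolves_to_file(dist_dir: Path, route: str) -> bool:
    """Return True if `route` maps to a real file in the build output.

    Assumption: the host serves /a/b from a/b, a/b.html, or a/b/index.html.
    """
    rel = route.strip("/")
    if rel == "":
        # The site root is served from index.html.
        return (dist_dir / "index.html").is_file()
    candidates = [
        dist_dir / rel,
        dist_dir / f"{rel}.html",
        dist_dir / rel / "index.html",
    ]
    return any(path.is_file() for path in candidates)


def routes_without_files(dist_dir: Path, routes: list[str]) -> list[str]:
    """List the routes that would fall through to the host's 404 page."""
    return [route for route in routes if not resolves_to_file(dist_dir, route)]


if __name__ == "__main__":
    # Hypothetical routes; replace them with the real paths of the site.
    for route in routes_without_files(Path("dist"), ["/", "/books/01/001"]):
        print("no file for", route)
```

With an output directory like the one listed above, every route other than the root would be reported, because only index.html and 404.html exist as pages.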

Solution

So, as a last resort, I looked for an alternative to vuepress and settled on docusaurus. It has the same basic goal as vuepress but uses a different technology stack. I described some of the differences in the issue "feat: Consider migrating to docusaurus", but the most important one is that the build it generates contains an actual .html file for each page, which keeps Google's indexer happy.


At the same time, I also submitted the sitemap to Bing, so that domestic users can find the site through search.

google
bing
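
Before submitting a sitemap, it can be worth confirming that every URL in it is answered with HTTP 200, since a page served through the 404.html fallback looks fine in a browser but is still reported as 404 to a crawler. The sketch below is one possible way to do that with the Python standard library, under the assumption that the sitemap follows the standard sitemaps.org schema. The function names are hypothetical, and the status lookup is passed in as a parameter so the logic can be tried without network access.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Extract every <loc> URL from a sitemap document."""
    root = ET.fromstring(xml_text)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


def http_status(url: str) -> int:
    """Return the HTTP status code of a HEAD request to `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return err.code


def urls_not_ok(
    xml_text: str, get_status: Callable[[str], int] = http_status
) -> dict[str, int]:
    """Map each sitemap URL whose status is not 200 to that status."""
    statuses = {url: get_status(url) for url in sitemap_urls(xml_text)}
    return {url: code for url, code in statuses.items() if code != 200}
```

Calling urls_not_ok with the text of sitemap.xml returns an empty dict when every listed page is backed by a real file on the host.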

Epilogue

Existing search engines still have one problem, though: they don’t work well for content that is not divided up by headings. On a novel website, for example, large amounts of content are separated only by <p></p> tags. This is a really annoying problem, but I’ll save it for another time.

In fact, I did try to solve the indexing problem within vuepress first; see: https://github.com/liuli-moe/to-the-stars/issues/22#issuecomment-1253240061
