# Practicing Full-Text Search in Halo
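The theme side needs a full-text search API for fuzzy searching of posts, and it has strict performance requirements. A corresponding issue has already been raised; see <https://github.com/halo-dev/halo/issues/2637>.

For a local full-text search solution, the best option is [Lucene](https://lucene.apache.org/), an open-source Apache project. [Hibernate Search](https://hibernate.org/search/) also implements full-text search on top of Lucene. However, Halo 2.0's custom models are not built directly on Hibernate, which means Hibernate is only an optional component in Halo 2.0. We will therefore probably not adopt Hibernate Search in the end, despite its many advantages.

Halo can also follow Hibernate's example and adapt to multiple search engines, such as Lucene, ElasticSearch, and MeiliSearch. Lucene is the default implementation because it has the lowest deployment cost for users.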
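## Search API design
### Search parameters
The fields are as follows:
- keyword: string. The search keyword.
- sort: string[]. The fields to sort by and their sort direction.
- offset: number. The offset of the first result returned by this query.
- limit: number. The maximum number of results returned by this query.
For example: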
```bash
http://localhost:8090/apis/api.halo.run/v1alpha1/posts?keyword=halo&sort=title,asc&sort=publishTimestamp,desc&offset=20&limit=10
```
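To make the `sort` parameter concrete, here is a minimal sketch of how repeated `sort` values could be parsed on the server side. It assumes each value has the form `field,direction` (as in `publishTimestamp,desc`) and that a missing direction means ascending. `SortParamSketch` and `SortOrder` are hypothetical names used for illustration only; they are not part of Halo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper, not part of Halo: parses repeated `sort` query values. */
final class SortParamSketch {

    /** One parsed sort clause: the field to sort by and whether the order is ascending. */
    record SortOrder(String field, boolean ascending) {}

    /** Parses values such as "title,asc"; the direction defaults to ascending (an assumption). */
    static List<SortOrder> parse(List<String> rawValues) {
        List<SortOrder> orders = new ArrayList<>();
        for (String raw : rawValues) {
            if (raw == null || raw.isBlank()) {
                continue; // ignore empty values
            }
            String[] parts = raw.trim().split(",", 2);
            String field = parts[0].trim();
            if (field.isEmpty()) {
                throw new IllegalArgumentException("Missing sort field: " + raw);
            }
            String direction = parts.length > 1
                ? parts[1].trim().toLowerCase(Locale.ROOT)
                : "asc";
            switch (direction) {
                case "asc" -> orders.add(new SortOrder(field, true));
                case "desc" -> orders.add(new SortOrder(field, false));
                default -> throw new IllegalArgumentException("Unknown sort direction: " + raw);
            }
        }
        return orders;
    }
}
```

Calling `SortParamSketch.parse(List.of("title,asc", "publishTimestamp,desc"))` yields two clauses in request order, which lets the first `sort` value act as the primary sort key.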
### Search results
```yaml
hits:
  - name: halo01
    title: Halo 01
    permalink: /posts/halo01
    categories:
      - a
      - b
    tags:
      - c
      - d
  - name: halo02
    title: Halo 02
    permalink: /posts/halo02
    categories:
      - a
      - b
    tags:
      - c
      - d
query: "halo"
total: 100
limit: 20
offset: 10
processingTimeMills: 2
```
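#### Pagination of search results
Currently, for performance reasons, most search engines either do not provide pagination directly or advise against it.
See: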
- <https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html>
- <https://docs.meilisearch.com/learn/advanced/pagination.html>
- <https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html>
- <https://discourse.algolia.com/t/pagination-limit/10585>
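Based on the discussions above, we have tentatively decided not to support pagination. However, the number of records returned by a single query can be set (limit <= max_limit).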
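The `limit <= max_limit` rule can be sketched as follows. This is an illustration under stated assumptions, not Halo's implementation: the values of `MAX_LIMIT` and `DEFAULT_LIMIT` are invented here, and the in-memory `window` method only stands in for whatever the search engine does. With Lucene, for instance, one would more likely request the top `offset + limit` hits and skip the first `offset` of them.

```java
import java.util.List;

/** Hypothetical sketch, not Halo's implementation: caps `limit` and slices one result window. */
final class LimitWindowSketch {

    /** Assumed upper bound for a single query; the design only names it max_limit. */
    static final int MAX_LIMIT = 1000;

    /** Assumed default used when the caller sends no usable limit. */
    static final int DEFAULT_LIMIT = 10;

    /** Returns a limit in [1, MAX_LIMIT]; null or non-positive input falls back to the default. */
    static int clampLimit(Integer requested) {
        if (requested == null || requested <= 0) {
            return DEFAULT_LIMIT;
        }
        return Math.min(requested, MAX_LIMIT);
    }

    /** Returns at most `limit` items starting at `offset`; an offset past the end yields an empty list. */
    static <T> List<T> window(List<T> matches, int offset, Integer requestedLimit) {
        int from = Math.min(Math.max(offset, 0), matches.size());
        int to = Math.min(from + clampLimit(requestedLimit), matches.size());
        return matches.subList(from, to);
    }
}
```

Clamping rather than rejecting an oversized `limit` keeps theme-side callers simple; returning an error for `limit > max_limit` would be an equally reasonable design choice.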
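#### Optimizing Chinese search
Lucene's default analyzer does not segment Chinese text well. We need external dependencies or externally curated dictionaries to segment Chinese sentences better and thereby improve Chinese search results.
The following Java libraries provide Chinese analyzers: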
- <https://gitee.com/lionsoul/jcseg>
- <https://code.google.com/archive/p/ik-analyzer>
- <https://github.com/huaban/jieba-analysis>
- <https://github.com/medcl/elasticsearch-analysis-ik>
- <https://github.com/blueshen/ik-analyzer>
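To illustrate why segmentation matters, the sketch below splits a Chinese phrase into overlapping n-grams in plain Java. It does not use Lucene. It only mimics the shape of the tokens: Lucene's `StandardAnalyzer` emits each Han character as its own token (n = 1), and `CJKAnalyzer` emits overlapping bigrams (n = 2). `CjkNgramSketch` is a hypothetical name used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: plain-Java n-gram splitting, not a Lucene analyzer. */
final class CjkNgramSketch {

    /** Splits text into overlapping n-grams, counted in Unicode code points. */
    static List<String> ngrams(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        int[] codePoints = text.codePoints().toArray();
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= codePoints.length; i++) {
            grams.add(new String(codePoints, i, n));
        }
        return grams;
    }
}
```

For the phrase 全文搜索 ("full-text search"), `ngrams(text, 1)` gives 全, 文, 搜, 索 and `ngrams(text, 2)` gives 全文, 文搜, 搜索. Neither matches the natural word boundaries 全文 / 搜索 exactly, which is the gap that dictionary-based analyzers such as the libraries listed above aim to close.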
### Search engine examples
#### MeiliSearch
```bash
curl 'http://localhost:7700/indexes/movies/search' \
-H 'Accept: */*' \
-H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'Authorization: Bearer MASTER_KEY' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Cookie: logged_in=yes; adminer_permanent=; XSRF-TOKEN=75995791-980a-4f3e-81fb-2e199d8f3934' \
-H 'Origin: http://localhost:7700' \
-H 'Referer: http://localhost:7700/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: same-origin' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
-H 'X-Meilisearch-Client: Meilisearch mini-dashboard (v0.2.2) ; Meilisearch instant-meilisearch (v0.8.2) ; Meilisearch JavaScript (v0.27.0)' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
--data-raw '{"q":"halo","attributesToHighlight":["*"],"highlightPreTag":"<ais-highlight-0000000000>","highlightPostTag":"</ais-highlight-0000000000>","limit":21}' \
--compressed
```
```json
{
"hits": [
{
"id": 108761,
"title": "I Am... Yours: An Intimate Performance at Wynn Las Vegas",
"overview": "Filmed at the Encore Theater at Wynn Las Vegas, this extraordinary concert features performances of over 30 songs from Beyoncé’s three multi-platinum solo releases, Destiny’s Child catalog and a few surprises. This amazing concert includes the #1 hits, “Single Ladies (Put A Ring On It),” “If I Were A Boy,” “Halo,” “Sweet Dreams” and showcases a gut-wrenching performance of “That’s Why You’re Beautiful.” Included on \"I AM... YOURS An Intimate Performance At Wynn Las Vegas,\" is a biographical storytelling woven between many songs and exclusive behind-the-scenes footage.",
"genres": ["Music", "Documentary"],
"poster": "https://image.tmdb.org/t/p/w500/j8n1XQNfw874Ka7SS3HQLCVNBxb.jpg",
"release_date": 1258934400,
"_formatted": {
"id": "108761",
"title": "I Am... Yours: An Intimate Performance at Wynn Las Vegas",
"overview": "Filmed at the Encore Theater at Wynn Las Vegas, this extraordinary concert features performances of over 30 songs from Beyoncé’s three multi-platinum solo releases, Destiny’s Child catalog and a few surprises. This amazing concert includes the #1 hits, “Single Ladies (Put A Ring On It),” “If I Were A Boy,” “<ais-highlight-0000000000>Halo</ais-highlight-0000000000>,” “Sweet Dreams” and showcases a gut-wrenching performance of “That’s Why You’re Beautiful.” Included on \"I AM... YOURS An Intimate Performance At Wynn Las Vegas,\" is a biographical storytelling woven between many songs and exclusive behind-the-scenes footage.",
"genres": ["Music", "Documentary"],
"poster": "https://image.tmdb.org/t/p/w500/j8n1XQNfw874Ka7SS3HQLCVNBxb.jpg",
"release_date": "1258934400"
}
}
],
"estimatedTotalHits": 10,
"query": "halo",
"limit": 21,
"offset": 0,
"processingTimeMs": 2
}
```
![MeiliSearch UI](./meilisearch.jpg)
#### Algolia
```bash
curl 'https://og53ly1oqh-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.14.2)%3B%20Browser%20(lite)%3B%20docsearch%20(3.2.1)%3B%20docsearch-react%20(3.2.1)%3B%20docusaurus%20(2.1.0)&x-algolia-api-key=739f2a55c6d13d93af146c22a4885669&x-algolia-application-id=OG53LY1OQH' \
-H 'Accept: */*' \
-H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'Connection: keep-alive' \
-H 'Origin: https://docs.halo.run' \
-H 'Referer: https://docs.halo.run/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: cross-site' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
-H 'content-type: application/x-www-form-urlencoded' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
--data-raw '{"requests":[{"query":"halo","indexName":"docs","params":"attributesToRetrieve=%5B%22hierarchy.lvl0%22%2C%22hierarchy.lvl1%22%2C%22hierarchy.lvl2%22%2C%22hierarchy.lvl3%22%2C%22hierarchy.lvl4%22%2C%22hierarchy.lvl5%22%2C%22hierarchy.lvl6%22%2C%22content%22%2C%22type%22%2C%22url%22%5D&attributesToSnippet=%5B%22hierarchy.lvl1%3A5%22%2C%22hierarchy.lvl2%3A5%22%2C%22hierarchy.lvl3%3A5%22%2C%22hierarchy.lvl4%3A5%22%2C%22hierarchy.lvl5%3A5%22%2C%22hierarchy.lvl6%3A5%22%2C%22content%3A5%22%5D&snippetEllipsisText=%E2%80%A6&highlightPreTag=%3Cmark%3E&highlightPostTag=%3C%2Fmark%3E&hitsPerPage=20&facetFilters=%5B%22language%3Azh-Hans%22%2C%5B%22docusaurus_tag%3Adefault%22%2C%22docusaurus_tag%3Adocs-default-1.6%22%5D%5D"}]}' \
--compressed
```
```json
{
"results": [
{
"hits": [
{
"content": null,
"hierarc