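# Full-Text Search in Halo in Practice

The theme side needs a full-text search API for fuzzy-searching posts, and its performance requirements are very demanding. A corresponding issue has already been filed; see <https://github.com/halo-dev/halo/issues/2637>.

The best local option for full-text search is [Lucene](https://lucene.apache.org/), an open-source Apache project. [Hibernate Search](https://hibernate.org/search/) also implements full-text search on top of Lucene. However, Halo 2.0's custom models are not built directly on Hibernate. In other words, Hibernate is only optional in Halo 2.0, so we will probably not adopt Hibernate Search in the end, despite its many advantages.

Halo can also follow Hibernate's example and adapt to multiple search engines, such as Lucene, ElasticSearch, and MeiliSearch. The default implementation is Lucene, because it has the lowest deployment cost for users.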
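The idea of adapting to several engines behind one abstraction can be sketched roughly as follows. This is only an illustration: `PostSearchEngine` and `InMemorySearchEngine` are hypothetical names, not Halo's actual API, and the in-memory substring scan merely stands in for a real Lucene, ElasticSearch, or MeiliSearch adapter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical abstraction; the name and method are illustrative, not Halo's API. */
interface PostSearchEngine {
    /** Returns at most {@code limit} post titles that match {@code keyword}. */
    List<String> search(String keyword, int limit);
}

/** Toy stand-in for a default engine: a case-insensitive substring scan over titles. */
class InMemorySearchEngine implements PostSearchEngine {
    private final List<String> titles;

    InMemorySearchEngine(List<String> titles) {
        this.titles = List.copyOf(titles);
    }

    @Override
    public List<String> search(String keyword, int limit) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            if (hits.size() >= limit) {
                break; // stop as soon as the requested number of hits is reached
            }
            if (title.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(title);
            }
        }
        return hits;
    }
}

class SearchEngineDemo {
    public static void main(String[] args) {
        // Callers depend only on the interface, so the engine can be swapped.
        PostSearchEngine engine =
            new InMemorySearchEngine(List.of("Halo 01", "Hello", "Halo 02"));
        System.out.println(engine.search("halo", 10)); // prints [Halo 01, Halo 02]
    }
}
```

Because callers see only the interface, a Lucene-backed class could be the default while other engines are plugged in without touching the theme-facing code.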
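## Search API Design
### Search Parameters
The fields are as follows:
- keyword: string. The search keyword.
- sort: string[]. The search field and its sort direction.
- offset: number. The offset of this query's results.
- limit: number. The maximum number of results returned by this query.
For example: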
```bash
http://localhost:8090/apis/api.halo.run/v1alpha1/posts?keyword=halo&sort=title,asc&sort=publishTimestamp,desc&offset=20&limit=10
```
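As a minimal sketch of how such a query string could be mapped onto the four parameters, the snippet below parses it with the JDK only. `SearchParams` is a hypothetical type, the default `limit` of 10 is an assumption, and each `sort` value is assumed to use the `field,direction` form.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the four search parameters; not Halo's actual type. */
record SearchParams(String keyword, List<String> sort, int offset, int limit) {

    /** Parses a raw query string such as "keyword=halo&sort=title,asc&offset=20&limit=10". */
    static SearchParams parse(String query) {
        String keyword = "";
        List<String> sort = new ArrayList<>();
        int offset = 0;
        int limit = 10; // assumed default, not specified by the design
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // ignore fragments without a value
            }
            String key = pair.substring(0, eq);
            String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            switch (key) {
                case "keyword" -> keyword = value;
                case "sort" -> sort.add(value); // repeated "sort" keys accumulate in order
                case "offset" -> offset = Integer.parseInt(value);
                case "limit" -> limit = Integer.parseInt(value);
                default -> { } // unknown parameters are ignored
            }
        }
        return new SearchParams(keyword, List.copyOf(sort), offset, limit);
    }
}

class SearchParamsDemo {
    public static void main(String[] args) {
        SearchParams params = SearchParams.parse(
            "keyword=halo&sort=title,asc&sort=publishTimestamp,desc&offset=20&limit=10");
        System.out.println(params);
    }
}
```

In a real Halo endpoint the web framework would normally do this binding; the hand-written parser is only meant to make the parameter semantics concrete.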
### Search Results
```yaml
hits:
  - name: halo01
    title: Halo 01
    permalink: /posts/halo01
    categories:
      - a
      - b
    tags:
      - c
      - d
  - name: halo02
    title: Halo 02
    permalink: /posts/halo02
    categories:
      - a
      - b
    tags:
      - c
      - d
query: "halo"
total: 100
limit: 20
offset: 10
processingTimeMills: 2
```
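The result shape above could be modeled with a pair of Java records, sketched below. The type names `PostHit` and `SearchResult` are hypothetical; the field names simply mirror the YAML keys.

```java
import java.util.List;

/** One matched post; the fields mirror the keys of each entry under "hits". */
record PostHit(String name, String title, String permalink,
               List<String> categories, List<String> tags) {
}

/** The whole response; the fields mirror the top-level keys of the sample above. */
record SearchResult(List<PostHit> hits, String query, long total,
                    int limit, int offset, long processingTimeMills) {
}

class SearchResultDemo {
    public static void main(String[] args) {
        PostHit first = new PostHit("halo01", "Halo 01", "/posts/halo01",
            List.of("a", "b"), List.of("c", "d"));
        // Values copied from the sample response.
        SearchResult result = new SearchResult(List.of(first), "halo", 100, 20, 10, 2);
        System.out.println(result.hits().get(0).permalink()); // prints /posts/halo01
    }
}
```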
#### Pagination of Search Results
At present, for performance reasons, most search engines either do not provide pagination directly or discourage it.
See:
- <https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html>
- <https://docs.meilisearch.com/learn/advanced/pagination.html>
- <https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html>
- <https://discourse.algolia.com/t/pagination-limit/10585>
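Based on the discussions above, we have tentatively decided not to support pagination. However, the number of records returned by a single query can be set (limit <= max_limit).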
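The `limit <= max_limit` constraint can be enforced with a small clamp, sketched below. The class name and the values of `MAX_LIMIT` and `DEFAULT_LIMIT` are assumptions made for illustration; the design does not fix them.

```java
/** Hypothetical helper that keeps a requested limit within the allowed range. */
final class LimitPolicy {
    static final int MAX_LIMIT = 1000;    // assumed upper bound
    static final int DEFAULT_LIMIT = 100; // assumed fallback when no valid limit is given

    private LimitPolicy() {
    }

    /** Returns the limit to actually use for a query. */
    static int effectiveLimit(Integer requested) {
        if (requested == null || requested <= 0) {
            return DEFAULT_LIMIT;
        }
        return Math.min(requested, MAX_LIMIT);
    }
}

class LimitPolicyDemo {
    public static void main(String[] args) {
        System.out.println(LimitPolicy.effectiveLimit(10));   // prints 10
        System.out.println(LimitPolicy.effectiveLimit(5000)); // prints 1000
        System.out.println(LimitPolicy.effectiveLimit(null)); // prints 100
    }
}
```

Clamping silently is one option; rejecting an oversized `limit` with an error response would be an equally reasonable choice.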
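#### Chinese Search Optimization
Lucene's default analyzer does not tokenize Chinese well. We need external dependencies or externally curated dictionaries to segment Chinese sentences better and thereby improve Chinese search results.
The following Java libraries provide Chinese analyzers: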
- <https://gitee.com/lionsoul/jcseg>
- <https://code.google.com/archive/p/ik-analyzer>
- <https://github.com/huaban/jieba-analysis>
- <https://github.com/medcl/elasticsearch-analysis-ik>
- <https://github.com/blueshen/ik-analyzer>
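To illustrate why tokenization matters for Chinese, the sketch below contrasts two dictionary-free strategies using only the JDK: one token per character, and overlapping two-character tokens (bigrams). It does not use or reproduce any of the libraries listed above; `CjkTokens` is a hypothetical helper. Dictionary-based analyzers aim for a third outcome, real words, which neither strategy here guarantees.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper contrasting two naive ways of tokenizing CJK text. */
final class CjkTokens {
    private CjkTokens() {
    }

    /** One token per code point: every character is indexed on its own. */
    static List<String> unigrams(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }

    /** Overlapping two-character tokens; some are real words, some are noise. */
    static List<String> bigrams(String text) {
        int[] cps = text.codePoints().toArray();
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < cps.length; i++) {
            out.add(new String(cps, i, 2));
        }
        return out;
    }
}

class CjkTokensDemo {
    public static void main(String[] args) {
        System.out.println(CjkTokens.unigrams("全文搜索")); // prints [全, 文, 搜, 索]
        System.out.println(CjkTokens.bigrams("全文搜索"));  // prints [全文, 文搜, 搜索]
    }
}
```

For "全文搜索" the bigram output contains the meaningless token "文搜" alongside the real words "全文" and "搜索". A dictionary-backed analyzer is meant to keep only the real words, which is the motivation for the libraries above.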
### Search Engine Examples
#### MeiliSearch
```bash
curl 'http://localhost:7700/indexes/movies/search' \
-H 'Accept: */*' \
-H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'Authorization: Bearer MASTER_KEY' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Cookie: logged_in=yes; adminer_permanent=; XSRF-TOKEN=75995791-980a-4f3e-81fb-2e199d8f3934' \
-H 'Origin: http://localhost:7700' \
-H 'Referer: http://localhost:7700/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: same-origin' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
-H 'X-Meilisearch-Client: Meilisearch mini-dashboard (v0.2.2) ; Meilisearch instant-meilisearch (v0.8.2) ; Meilisearch JavaScript (v0.27.0)' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
--data-raw '{"q":"halo","attributesToHighlight":["*"],"highlightPreTag":"<ais-highlight-0000000000>","highlightPostTag":"</ais-highlight-0000000000>","limit":21}' \
--compressed
```
```json
{
  "hits": [
    {
      "id": 108761,
      "title": "I Am... Yours: An Intimate Performance at Wynn Las Vegas",
      "overview": "Filmed at the Encore Theater at Wynn Las Vegas, this extraordinary concert features performances of over 30 songs from Beyoncé’s three multi-platinum solo releases, Destiny’s Child catalog and a few surprises. This amazing concert includes the #1 hits, “Single Ladies (Put A Ring On It),” “If I Were A Boy,” “Halo,” “Sweet Dreams” and showcases a gut-wrenching performance of “That’s Why You’re Beautiful.” Included on \"I AM... YOURS An Intimate Performance At Wynn Las Vegas,\" is a biographical storytelling woven between many songs and exclusive behind-the-scenes footage.",
      "genres": ["Music", "Documentary"],
      "poster": "https://image.tmdb.org/t/p/w500/j8n1XQNfw874Ka7SS3HQLCVNBxb.jpg",
      "release_date": 1258934400,
      "_formatted": {
        "id": "108761",
        "title": "I Am... Yours: An Intimate Performance at Wynn Las Vegas",
        "overview": "Filmed at the Encore Theater at Wynn Las Vegas, this extraordinary concert features performances of over 30 songs from Beyoncé’s three multi-platinum solo releases, Destiny’s Child catalog and a few surprises. This amazing concert includes the #1 hits, “Single Ladies (Put A Ring On It),” “If I Were A Boy,” “<ais-highlight-0000000000>Halo</ais-highlight-0000000000>,” “Sweet Dreams” and showcases a gut-wrenching performance of “That’s Why You’re Beautiful.” Included on \"I AM... YOURS An Intimate Performance At Wynn Las Vegas,\" is a biographical storytelling woven between many songs and exclusive behind-the-scenes footage.",
        "genres": ["Music", "Documentary"],
        "poster": "https://image.tmdb.org/t/p/w500/j8n1XQNfw874Ka7SS3HQLCVNBxb.jpg",
        "release_date": "1258934400"
      }
    }
  ],
  "estimatedTotalHits": 10,
  "query": "halo",
  "limit": 21,
  "offset": 0,
  "processingTimeMs": 2
}
```
![MeiliSearch UI](./meilisearch.jpg)
#### Algolia
```bash
curl 'https://og53ly1oqh-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.14.2)%3B%20Browser%20(lite)%3B%20docsearch%20(3.2.1)%3B%20docsearch-react%20(3.2.1)%3B%20docusaurus%20(2.1.0)&x-algolia-api-key=739f2a55c6d13d93af146c22a4885669&x-algolia-application-id=OG53LY1OQH' \
-H 'Accept: */*' \
-H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'Connection: keep-alive' \
-H 'Origin: https://docs.halo.run' \
-H 'Referer: https://docs.halo.run/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: cross-site' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
-H 'content-type: application/x-www-form-urlencoded' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
--data-raw '{"requests":[{"query":"halo","indexName":"docs","params":"attributesToRetrieve=%5B%22hierarchy.lvl0%22%2C%22hierarchy.lvl1%22%2C%22hierarchy.lvl2%22%2C%22hierarchy.lvl3%22%2C%22hierarchy.lvl4%22%2C%22hierarchy.lvl5%22%2C%22hierarchy.lvl6%22%2C%22content%22%2C%22type%22%2C%22url%22%5D&attributesToSnippet=%5B%22hierarchy.lvl1%3A5%22%2C%22hierarchy.lvl2%3A5%22%2C%22hierarchy.lvl3%3A5%22%2C%22hierarchy.lvl4%3A5%22%2C%22hierarchy.lvl5%3A5%22%2C%22hierarchy.lvl6%3A5%22%2C%22content%3A5%22%5D&snippetEllipsisText=%E2%80%A6&highlightPreTag=%3Cmark%3E&highlightPostTag=%3C%2Fmark%3E&hitsPerPage=20&facetFilters=%5B%22language%3Azh-Hans%22%2C%5B%22docusaurus_tag%3Adefault%22%2C%22docusaurus_tag%3Adocs-default-1.6%22%5D%5D"}]}' \
--compressed
```
```json
{
"results": [
{
"hits": [
{
"content": null,
"hierarc