I have set up elasticsearch with the following settings:
[indexer]
REPO_INDEXER_ENABLED = true
ISSUE_INDEXER_TYPE: elasticsearch
ISSUE_INDEXER_CONN_STR: http://localhost:9200
ISSUE_INDEXER_NAME: gitea_issues
I have created a test issue with the text "bla bla bla mr. freeman" and I am trying to find it using the issue search. I've done the same thing on the try.gitea.io test instance:
Issue: https://try.gitea.io/thedoginthewok/test_issue_search/issues/1
Search: https://try.gitea.io/issues?type=your_repositories&repos=%5B%5D&sort=&state=open&q=freema
On the test instance, the issue is found. On my instance, I can only find the issue if I search for the complete word freeman.
Is there any way to configure fuzzy search for Elasticsearch?
My instance with search term freeman:
My instance with search term freema:
Maybe you mean
[indexer]
REPO_INDEXER_ENABLED = true
ISSUE_INDEXER_TYPE = elasticsearch
ISSUE_INDEXER_CONN_STR = http://localhost:9200
ISSUE_INDEXER_NAME = gitea_issues
I've changed it to "=", but it behaves the same.
New log gist: https://gist.github.com/thedoginthewok/eaa51d81d8a82f13145ff7be1c56888b
This part is interesting to me:
2020/06/19 15:25:41 ...elastic/v7/client.go:848:dumpRequest() [T] POST /gitea_issues/_search HTTP/1.1
Host: localhost:9200
User-Agent: elastic/7.0.9 (linux-amd64)
Transfer-Encoding: chunked
Accept: application/json
Content-Type: application/json
Accept-Encoding: gzip

b5
{"from":0,"query":{"bool":{"must":[{"multi_match":{"fields":["title","content","comments"],"query":"freema"}},{"terms":{"repo_id":[1]}}]}},"size":50,"sort":[{"id":{"order":"asc"}}]}
0

2020/06/19 15:25:41 ...elastic/v7/client.go:858:dumpResponse() [T] HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8

{"took":4,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":0,"relation":"eq"},"max_score":null,"hits":[]}}
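For anyone reading along, a likely reason this request returns zero hits: a multi_match query with no "type" compares whole analyzed tokens, so "freema" is never equal to the indexed token "freeman". Below is a minimal sketch of that behavior in plain Python. The tokenize and term_match helpers are hypothetical and only roughly approximate what the standard analyzer does; they are not Gitea or Elasticsearch code.

```python
import re

def tokenize(text):
    """Rough stand-in for a standard analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())

def term_match(query, document):
    """True if any whole query token equals a whole document token (no substring matching)."""
    doc_tokens = set(tokenize(document))
    return any(token in doc_tokens for token in tokenize(query))

issue_text = "bla bla bla mr. freeman"
print(tokenize(issue_text))
print(term_match("freeman", issue_text))  # complete word: matches a token
print(term_match("freema", issue_text))   # partial word: matches no token
```

The same reasoning would explain why a literal "freema*" does not help, since this kind of query does not interpret wildcard characters.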
Try searching freema*
Nope.
What is the try.gitea.io instance running on?
It just uses database search
So, probably with LIKE '%SEARCHTERM%'.
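To illustrate why a database-backed search would find the partial word, here is a small sketch using an in-memory SQLite database. The issue table and its content column are made up for the example and are not Gitea's actual schema; the point is only that LIKE with wildcards on both sides matches substrings.

```python
import sqlite3

# Hypothetical one-table schema, only for demonstrating substring matching.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issue (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO issue (content) VALUES (?)", ("bla bla bla mr. freeman",))

def search(term):
    """Return ids of issues whose content contains the term anywhere."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT id FROM issue WHERE content LIKE ? ORDER BY id", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]

print(search("freema"))   # partial word still matches
print(search("gordon"))   # unrelated term matches nothing
```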
This is a bug in the elasticsearch indexer, right?
Or is it supposed to work this way?
Elastic search query should be improved
That's because of how we use Elasticsearch. Below is the mapping configuration from the source:
"mappings": {
    "properties": {
        "id": {
            "type": "integer",
            "index": true
        },
        "repo_id": {
            "type": "integer",
            "index": true
        },
        "title": {
            "type": "text",
            "index": true
        },
        "content": {
            "type": "text",
            "index": true
        },
        "comments": {
            "type": "text",
            "index": true
        }
    }
}
Should we change the configuration to resolve the problem?
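One option that might avoid touching the mapping at all is changing the query side: multi_match accepts a "type" of "phrase_prefix", which treats the last word of the search text as a prefix. Below is a sketch that rebuilds the request body from the log earlier in this thread with that one extra key. The build_issue_query function is hypothetical, this has not been run against a live cluster, and it would only cover prefixes ("freema"), not arbitrary substrings ("reema").

```python
import json

def build_issue_query(keyword, repo_ids, limit=50, start=0):
    """Build a search body like the logged one, plus a phrase_prefix match type."""
    return {
        "from": start,
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "fields": ["title", "content", "comments"],
                            "query": keyword,
                            # Assumption: the only change relative to the logged request.
                            "type": "phrase_prefix",
                        }
                    },
                    {"terms": {"repo_id": repo_ids}},
                ]
            }
        },
        "size": limit,
        "sort": [{"id": {"order": "asc"}}],
    }

print(json.dumps(build_issue_query("freema", [1]), indent=2))
```

If real substring matching is needed, an n-gram analyzer in the mapping would be the other direction to look at, at the cost of a larger index and a reindex.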
This issue has been automatically marked as stale because it has not had recent activity. I am here to help clear issues left open even if solved or waiting for more insight. This issue will be closed if no further activity occurs during the next 2 weeks. If the issue is still valid just add a comment to keep it alive. Thank you for your contributions.
This issue has been automatically closed because of inactivity. You can re-open it if needed.
unstale