Add chinese translation (#772)

* Add Chinese config and i18n

* Import translation from https://bitbucket.org/mastodoncn/mastodon-zh-cn/src/master/

* Translate Using Mastodon

* Translate Running Mastodon
Add Chinese search optimization
Fix some bugs

* Substitute 长毛象 with Mastodon

* Make some modifications.
yingziwu 2020-05-21 02:54:12 +08:00 committed by GitHub
parent 02839e07ce
commit b6e011265d
38 changed files with 3176 additions and 65 deletions


@@ -32,7 +32,7 @@ apt install elasticsearch
```
{{< hint style="warning" >}}
- **Security warning:** By default, ElasticSearch is supposed to bind to localhost only, i.e. be inaccessible from the outside network. You can check which address ElasticSearch binds to by looking at `network.host` within `/etc/elasticsearch/elasticsearch.yml`. Consider that anyone who can access ElasticSearch can access and modify any data within it, as there is no authentication layer. So it's really important that access is secured. Having a firewall that only exposes ports 22, 80 and 443 is advisable, as outlined in the [main installation instructions](). If you have a multi-host setup, you must know how to secure internal traffic.
+ **Security warning:** By default, ElasticSearch is supposed to bind to localhost only, i.e. be inaccessible from the outside network. You can check which address ElasticSearch binds to by looking at `network.host` within `/etc/elasticsearch/elasticsearch.yml`. Consider that anyone who can access ElasticSearch can access and modify any data within it, as there is no authentication layer. So it's really important that access is secured. Having a firewall that only exposes ports 22, 80 and 443 is advisable, as outlined in the [main installation instructions](../../prerequisites/#install-a-firewall-and-only-whitelist-ssh-http-and-https-ports). If you have a multi-host setup, you must know how to secure internal traffic.
{{< /hint >}}
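As a quick sanity check, a minimal sketch (assuming the Debian/Ubuntu package layout used above, and `ufw` as the firewall; adjust to whatever firewall your setup actually uses):
```bash
# Verify ElasticSearch only listens on localhost; remember there is no authentication layer.
grep network.host /etc/elasticsearch/elasticsearch.yml
ss -tlnp | grep 9200

# One way to expose only SSH, HTTP and HTTPS, assuming ufw is installed:
ufw allow 22/tcp
ufw allow 80/tcp
ufw allow 443/tcp
ufw enable
```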
To start ElasticSearch:
@@ -77,3 +77,82 @@ RAILS_ENV=production bundle exec rake chewy:sync
**Compatibility note:** There is a known bug in Ruby 2.6.0 that prevents the above task from working. Other versions of Ruby, such as 2.6.1, are fine.
{{< /hint >}}
## Search optimization for other languages
### Chinese search optimization {#chinese-search-optimization}
ElasticSearch uses the standard analyzer by default, which may not be the best choice, especially for Chinese. To improve the search experience, you can install a language-specific analyzer. Before creating the indices in ElasticSearch:
Install the [elasticsearch-analysis-ik](https://github.com/medcl/elasticsearch-analysis-ik) and [elasticsearch-analysis-stconvert](https://github.com/medcl/elasticsearch-analysis-stconvert) plugins into ElasticSearch.
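For example, a minimal installation sketch; the `v7.10.2` release used below is only illustrative, and the plugin version must match your ElasticSearch version exactly:
```bash
# Check the running ElasticSearch version first; plugin and server versions must match.
curl -s localhost:9200

# Install both analysis plugins from their GitHub releases (v7.10.2 assumed here):
/usr/share/elasticsearch/bin/elasticsearch-plugin install \
  https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.10.2/elasticsearch-analysis-ik-7.10.2.zip
/usr/share/elasticsearch/bin/elasticsearch-plugin install \
  https://github.com/medcl/elasticsearch-analysis-stconvert/releases/download/v7.10.2/elasticsearch-analysis-stconvert-7.10.2.zip

# Restart ElasticSearch so the plugins are loaded:
systemctl restart elasticsearch
```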
Then modify the index definitions as follows:
```diff
diff --git a/app/chewy/accounts_index.rb b/app/chewy/accounts_index.rb
--- a/app/chewy/accounts_index.rb
+++ b/app/chewy/accounts_index.rb
@@ -4,7 +4,7 @@ class AccountsIndex < Chewy::Index
settings index: { refresh_interval: '5m' }, analysis: {
analyzer: {
content: {
- tokenizer: 'whitespace',
+ tokenizer: 'ik_max_word',
filter: %w(lowercase asciifolding cjk_width),
},
diff --git a/app/chewy/statuses_index.rb b/app/chewy/statuses_index.rb
--- a/app/chewy/statuses_index.rb
+++ b/app/chewy/statuses_index.rb
@@ -16,9 +16,17 @@ class StatusesIndex < Chewy::Index
language: 'possessive_english',
},
},
+ char_filter: {
+ tsconvert: {
+ type: 'stconvert',
+ keep_both: false,
+ delimiter: '#',
+ convert_type: 't2s',
+ },
+ },
analyzer: {
content: {
- tokenizer: 'uax_url_email',
+ tokenizer: 'ik_max_word',
filter: %w(
english_possessive_stemmer
lowercase
@@ -27,6 +35,7 @@ class StatusesIndex < Chewy::Index
english_stop
english_stemmer
),
+ char_filter: %w(tsconvert),
},
},
}
diff --git a/app/chewy/tags_index.rb b/app/chewy/tags_index.rb
--- a/app/chewy/tags_index.rb
+++ b/app/chewy/tags_index.rb
@@ -2,10 +2,19 @@
class TagsIndex < Chewy::Index
settings index: { refresh_interval: '15m' }, analysis: {
+ char_filter: {
+ tsconvert: {
+ type: 'stconvert',
+ keep_both: false,
+ delimiter: '#',
+ convert_type: 't2s',
+ },
+ },
analyzer: {
content: {
- tokenizer: 'keyword',
+ tokenizer: 'ik_max_word',
filter: %w(lowercase asciifolding cjk_width),
+ char_filter: %w(tsconvert),
},
edge_ngram: {
```
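After changing the index definitions, the existing indices must be rebuilt before the new analyzers take effect. A sketch, assuming the same production environment as the `chewy:sync` invocation above (`chewy:upgrade` is a standard Chewy task that recreates indexes whose specification changed):
```bash
# Rebuild the indices so they pick up the ik_max_word tokenizer and the
# stconvert char_filter, then reindex the data:
RAILS_ENV=production bundle exec rake chewy:upgrade

# Optional sanity check that the ik tokenizer is active, using the built-in _analyze API:
curl -s -H 'Content-Type: application/json' \
  'localhost:9200/_analyze?pretty' \
  -d '{"tokenizer": "ik_max_word", "text": "中华人民共和国"}'
```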