Commit dca00bc

Index / Add danish language. (geonetwork#7697)
Co-authored-by: Jose García <[email protected]>
1 parent 7dd47a9 commit dca00bc

2 files changed, +77 -0 lines changed

docs/manual/docs/customizing-application/configuring-search-fields.md

+41
@@ -248,4 +248,45 @@ By default, the search score is defined as (see `web-ui/src/main/resources/catal
 
 ## Language analyzer
 
+
 By default a `standard` analyzer is used. If the catalog content is english, it may make sense to change the analyzer to `english`. To customize the analyzer see `web/src/main/webResources/WEB-INF/data/config/index/records.json`
+
+To add a new language, check first if Elasticsearch provides a specific analyzer for that language (see https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html). Then configure fields that are multilingual
+in `records.json` (eg. adding Danish):
+
+* If the field is used for full text search, use the language analyzer:
+
+```json
+{
+  "textField": {
+    "match": "*Object",
+    "mapping": {
+      "type": "object",
+      "properties": {
+        "default": {},
+        ...
+        "langdan": {
+          "type": "text",
+          "analyzer": "danish"
+        },
+```
+
+* If the field is a keyword like organisation name or tag field use type `keyword` (which is required for computing aggregations)
+
+```json
+{
+  "tag": {
+    "match": "th_*",
+    "mapping": {
+      "type": "object",
+      "copy_to": ["tag"],
+      "properties": {
+        "default": {},
+        ...
+        "langdan": {
+          "type": "keyword",
+          "copy_to": [
+            "any.langdan"
+          ]
+        },
+```
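
Before reindexing with a new language analyzer, it can help to see how Elasticsearch tokenizes a sample string with it. The body below is a minimal sketch of a request for the Elasticsearch `_analyze` API (sent as `POST /_analyze`); the sample text is purely illustrative and is not part of this commit.

```json
{
  "analyzer": "danish",
  "text": "Kort over søer og vandløb"
}
```

The response lists the tokens produced by the `danish` analyzer. Sending the same text with `"analyzer": "standard"` gives a baseline to compare against.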

web/src/main/webResources/WEB-INF/data/config/index/records.json

+36
@@ -1085,6 +1085,10 @@
     "type": "keyword",
     "copy_to": ["any.langdut", "organisationName.langdut"]
   },
+  "langdan": {
+    "type": "keyword",
+    "copy_to": ["any.langdan", "organisationName.langdan"]
+  },
   "langspa": {
     "type": "keyword",
     "copy_to": ["any.langspa", "organisationName.langspa"]
@@ -1172,6 +1176,19 @@
       }
     }
   },
+  "langdan": {
+    "type": "text",
+    "analyzer": "danish",
+    "copy_to": [
+      "any.langdan"
+    ],
+    "fields": {
+      "keyword": {
+        "type": "keyword",
+        "ignore_above": ${es.index.ignore_above}
+      }
+    }
+  },
   "langita": {
     "type": "text",
     "analyzer": "italian",
@@ -1286,6 +1303,12 @@
       "any.langdut"
     ]
   },
+  "langdan": {
+    "type": "keyword",
+    "copy_to": [
+      "any.langdan"
+    ]
+  },
   "langita": {
     "type": "keyword",
     "copy_to": ["any.langita"]
@@ -1471,6 +1494,10 @@
     "type": "text",
     "analyzer": "dutch"
   },
+  "langdan": {
+    "type": "text",
+    "analyzer": "danish"
+  },
   "langita": {
     "type": "text",
     "analyzer": "italian"
@@ -1513,6 +1540,12 @@
       "any.langdut"
     ]
   },
+  "langdan": {
+    "type": "keyword",
+    "copy_to": [
+      "any.langdan"
+    ]
+  },
   "langita": {
     "type": "keyword",
     "copy_to": ["any.langita"]
@@ -2045,6 +2078,9 @@
   "langdut": {
     "type": "keyword"
   },
+  "langdan": {
+    "type": "keyword"
+  },
   "langspa": {
     "type": "keyword"
   }
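
Several of the hunks above copy Danish values into `any.langdan`. As a hedged illustration of what that enables, the following is a sketch of an Elasticsearch search body using a standard `match` query. It assumes `any.langdan` ends up mapped as a `text` field with the `danish` analyzer (which the hunk at line 1494 appears to do, though the enclosing field is not visible in the diff); the search term is made up.

```json
{
  "query": {
    "match": {
      "any.langdan": "vandløb"
    }
  }
}
```

Under that assumption, the query text is analyzed with the same Danish analyzer as the indexed values, so inflected forms of a word have a better chance of matching than with the `standard` analyzer.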
