
Commit 79f48fb

More cleanup for first phase of x-plugins 986.
1 parent d7c66d5 commit 79f48fb

32 files changed (+81, -72 lines)

010_Intro/05_What_is_it.asciidoc

+11 -4

@@ -33,14 +33,21 @@ and hides complicated search theory away from beginners. It _just works_,
 right out of the box. With minimal understanding, you can soon become
 productive.((("Elasticsearch", "installing")))
 
-Elasticsearch can be((("Apache 2 license"))) downloaded, used, and modified free of charge. It is
-available under the http://www.apache.org/licenses/LICENSE-2.0.html[Apache 2 license],
-one of the most flexible open source licenses available.
-
 As your knowledge grows, you can leverage more of Elasticsearch's advanced
 features. The entire engine is configurable and flexible. Pick and choose
 from the advanced features to tailor Elasticsearch to your problem domain.
 
+You can ((("Apache 2 license"))) download, use, and modify Elasticsearch free of charge.
+It is available under the http://www.apache.org/licenses/LICENSE-2.0.html[Apache 2 license],
+one of the most flexible open source licenses available. The source is hosted on GitHub
+at https://github.com/elastic/elasticsearch[github.com/elastic/elasticsearch]. See
+https://github.com/elastic/elasticsearch/blob/master/CONTRIBUTING.md[Contributing to
+Elasticsearch] if you would like to join our amazing community of contributors!
+
+If you have any questions related to Elasticsearch, including specific features,
+language clients and plugins, join the conversation at
+https://discuss.elastic.co[discuss.elastic.co].
+
 .The Mists of Time
 ***************************************
 
010_Intro/10_Installing_ES.asciidoc

+17 -17

@@ -7,22 +7,17 @@ The only requirement for installing Elasticsearch is a recent version of Java.
 Preferably, you should install the latest version of the((("Java", "installing"))) official Java
 from http://www.java.com[_www.java.com_].
 
-You can download the latest version of Elasticsearch from
+You can get the latest version of Elasticsearch from
 https://www.elastic.co/downloads/elasticsearch[_elasticsearch.co/downloads/elasticsearch_].
 
-[source,sh]
---------------------------------------------------
-curl -L -O http://download.elastic.co/PATH/TO/VERSION.zip <1>
-unzip elasticsearch-$VERSION.zip
-cd elasticsearch-$VERSION
---------------------------------------------------
-<1> Fill in the URL for the latest version available on
-http://www.elastic.co/downloads/elasticsearch[_elastic.co/downloads/elasticsearch_].
+To install Elasticsearch, download and extract the archive file for your platform. For
+more information, see the {ref}/_installation.html[Installation] topic in the Elasticsearch
+Reference.
 
 [TIP]
 ====
-When installing Elasticsearch in production, you can use the method
-described previously, or the Debian or RPM packages provided on the
+When installing Elasticsearch in production, you can choose to use
+the Debian or RPM packages provided on the
 http://www.elastic.co/downloads/elasticsearch[downloads page]. You can also use
 the officially supported
 https://github.com/elasticsearch/puppet-elasticsearch[Puppet module] or
@@ -43,8 +38,7 @@ You do not have to install Marvel, but it will make this book much more
 interactive by allowing you to experiment with the code samples on your local
 Elasticsearch cluster.
 
-Marvel is available as a plug-in.((("Marvel", "downloading and installing"))) To download and install it, run this command
-in the Elasticsearch directory:
+Marvel is available as a plug-in.((("Marvel", "downloading and installing"))) To download and install it, run this command in the Elasticsearch directory:
 
 [source,sh]
 --------------------------------------------------
@@ -64,14 +58,15 @@ echo 'marvel.agent.enabled: false' >> ./config/elasticsearch.yml
 
 [[running-elasticsearch]]
 === Running Elasticsearch
-Elasticsearch is now ready to run. ((("Elasticsearch", "running")))You can start it up in the foreground
-with this:
+Elasticsearch is now ready to run. ((("Elasticsearch", "running"))) To start it up in the foreground:
 
 [source,sh]
 --------------------------------------------------
-./bin/elasticsearch
+cd elasticsearch-<version>
+./bin/elasticsearch <1> <2>
 --------------------------------------------------
-Add `-d` if you want to run it in the background as a daemon.
+<1> Add `-d` if you want to run it in the background as a daemon.
+<2> If you're running Elasticsearch on Windows, simply run `bin\elasticsearch.bat` instead.
 
 Test it out by opening another terminal window and running the following:
 
@@ -80,6 +75,11 @@ Test it out by opening another terminal window and running the following:
 curl 'http://localhost:9200/?pretty'
 --------------------------------------------------
 
+TIP: If you're running Elasticsearch on Windows, you can download cURL from
+http://curl.haxx.se/download.html[`http://curl.haxx.se/download.html`]. cURL
+provides a convenient way to submit requests to Elasticsearch and
+installing cURL enables you to copy and paste many of the examples in this
+book to try them out.
 
 You should see a response like this:
 
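A representative sketch of that response follows; the node name, version number, and build details shown here are placeholders that vary by release.

[source,js]
--------------------------------------------------
{
   "name" : "Node-1",                       <1>
   "cluster_name" : "elasticsearch",
   "version" : {
      "number" : "2.1.0",
      "build_hash" : "...",
      "build_timestamp" : "...",
      "build_snapshot" : false,
      "lucene_version" : "5.3.1"
   },
   "tagline" : "You Know, for Search"
}
--------------------------------------------------
<1> A randomly assigned node name, unless you configure one yourself.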

020_Distributed_Cluster/15_Add_an_index.asciidoc

+2 -3

@@ -23,9 +23,8 @@ that you have determines the maximum amount of data that your index can hold.
 
 [NOTE]
 ====
-While there is no theoretical limit to the amount of data that a primary shard
-can hold, there is a practical limit. What constitutes the maximum shard size
-depends entirely on your use case: the hardware you have, the size and
+While a primary shard can technically contain up to Integer.MAX_VALUE - 128 documents,
+the practical limit depends on your use case: the hardware you have, the size and
 complexity of your documents, how you index and query your documents, and your
 expected response times.
 ====
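Because the number of primary shards is fixed when an index is created, it has to be chosen up front. A minimal sketch of creating an index with explicit shard settings (the index name and values are illustrative):

[source,js]
--------------------------------------------------
PUT /blogs
{
   "settings" : {
      "number_of_shards" :   3,      <1>
      "number_of_replicas" : 1       <2>
   }
}
--------------------------------------------------
<1> Fixed at index-creation time; changing it later means reindexing.
<2> Can be changed at any time on a live index.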

030_Data/10_Index.asciidoc

+2 -2

@@ -89,13 +89,13 @@ field has been generated for us:
 {
    "_index":   "website",
    "_type":    "blog",
-   "_id":      "wM0OSFhDQXGZAWDf0-drSA",
+   "_id":      "AVFgSgVHUP18jI2wRx0w",
    "_version": 1,
    "created":  true
 }
 --------------------------------------------------
 
-Autogenerated IDs are 22 character long, URL-safe, Base64-encoded string
+Autogenerated IDs are 20 character long, URL-safe, Base64-encoded string
 _universally unique identifiers_, or((("UUIDs (universally unique identifiers)"))) http://en.wikipedia.org/wiki/Uuid[UUIDs].
 
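The response above is what comes back when you let Elasticsearch generate the `_id` itself: the request uses `POST` against the type URL rather than `PUT` with an explicit ID. A minimal sketch, with an illustrative document body:

[source,js]
--------------------------------------------------
POST /website/blog/
{
   "title": "My second blog entry",
   "text":  "Still trying this out...",
   "date":  "2014/01/01"
}
--------------------------------------------------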

040_Distributed_CRUD/10_Shard_interaction.asciidoc

+1 -1

@@ -14,7 +14,7 @@ We can send our requests to any node in the cluster.((("nodes", "sending request
 capable of serving any request. Every node knows the location of every
 document in the cluster and so can forward requests directly to the required
 node. In the following examples, we will send all of our requests to `Node 1`,
-which we will refer to as the _requesting node_.
+which we will refer to as the _coordinating node_.
 
 TIP: When sending requests, it is good practice to round-robin through all the
 nodes in the cluster, in order to spread the load.

040_Distributed_CRUD/15_Create_index_delete.asciidoc

+1 -1

@@ -21,7 +21,7 @@ delete a document on both the primary and any replica shards:
 3. `Node 3` executes the request on the primary shard. If it is successful,
 it forwards the request in parallel to the replica shards on `Node 1` and
 `Node 2`. Once all of the replica shards report success, `Node 3` reports
-success to the requesting node, which reports success to the client.
+success to the coordinating node, which reports success to the client.
 
 By the time the client receives a successful response, the document change has
 been executed on the primary shard and on all replica shards. Your change is

040_Distributed_CRUD/20_Retrieving.asciidoc

+1 -1

@@ -19,7 +19,7 @@ primary or replica shard:
 3. `Node 2` returns the document to `Node 1`, which returns the document
 to the client.
 
-For read requests, the requesting node will choose a different shard copy on
+For read requests, the coordinating node will choose a different shard copy on
 every request in order to balance the load; it round-robins through all
 shard copies.

040_Distributed_CRUD/25_Partial_updates.asciidoc

+1 -1

@@ -21,7 +21,7 @@ document:
 4. If `Node 3` has managed to update the document successfully, it forwards
 the new version of the document in parallel to the replica shards on `Node 1`
 and `Node 2` to be reindexed. Once all replica shards report success,
-`Node 3` reports success to the requesting node, which reports success to
+`Node 3` reports success to the coordinating node, which reports success to
 the client.
 
 The `update` API also accepts the `routing`, `replication`, `consistency`, and

040_Distributed_CRUD/30_Bulk_requests.asciidoc

+2 -2

@@ -2,7 +2,7 @@
 === Multidocument Patterns
 
 The patterns for the `mget` and `bulk` APIs((("mget (multi-get) API", "retrieving multiple documents, process of")))((("documents", "retrieving multiple with mget"))) are similar to those for
-individual documents. The difference is that the requesting node knows in
+individual documents. The difference is that the coordinating node knows in
 which shard each document lives. It breaks up the multidocument request into
 a multidocument request _per shard_, and forwards these in parallel to each
 participating node.
@@ -44,7 +44,7 @@ The sequence of steps((("bulk API", "multiple document changes with")))((("docum
 action succeeds, the primary forwards the new document (or deletion) to its
 replica shards in parallel, and then moves on to the next action. Once all
 replica shards report success for all actions, the node reports success to
-the requesting node, which collates the responses and returns them to the
+the coordinating node, which collates the responses and returns them to the
 client.
 
 The `bulk` API also accepts((("replication request parameter", "in bulk requests")))((("consistency request parameter", "in bulk requests"))) the `replication` and `consistency` parameters

050_Search/15_Pagination.asciidoc

+2 -2

@@ -37,12 +37,12 @@ to be sorted centrally to ensure that the overall order is correct.
 To understand why ((("deep paging, problems with")))deep paging is problematic, let's imagine that we are
 searching within a single index with five primary shards. When we request the
 first page of results (results 1 to 10), each shard produces its own top 10
-results and returns them to the _requesting node_, which then sorts all 50
+results and returns them to the _coordinating node_, which then sorts all 50
 results in order to select the overall top 10.
 
 Now imagine that we ask for page 1,000--results 10,001 to 10,010. Everything
 works in the same way except that each shard has to produce its top 10,010
-results. The requesting node then sorts through all 50,050 results and
+results. The coordinating node then sorts through all 50,050 results and
 discards 50,040 of them!
 
 You can see that, in a distributed system, the cost of sorting results
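To put numbers on that, asking for results 10,001 to 10,010 is just a large `from` value in the search body (the index and type names below are illustrative); each of the five shards has to build a queue of 10,010 entries for the coordinating node to merge:

[source,js]
--------------------------------------------------
GET /my_index/my_type/_search
{
    "from": 10000,
    "size": 10
}
--------------------------------------------------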

052_Mapping_Analysis/45_Mapping.asciidoc

+4 -3

@@ -202,9 +202,10 @@ for an existing type) later, using the `/_mapping` endpoint.
 
 [NOTE]
 ================================================
-Although you can _add_ to an existing mapping, you can't _change_ it. If a field
-already exists in the mapping, the data from that
-field probably has already been indexed. If you were to change the field mapping, the already indexed data would be wrong and would not be properly searchable.
+Although you can _add_ to an existing mapping, you can't _change_ existing
+field mappings. If a mapping already exists for a field, data from that
+field has probably been indexed. If you were to change the field mapping,
+the indexed data would be wrong and would not be properly searchable.
 ================================================
 
 We can update a mapping to add a new field, but we can't change an existing
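A minimal sketch of adding a new field to an existing mapping through the `/_mapping` endpoint (the index, type, and field names are illustrative); an existing field's mapping cannot be changed this way:

[source,js]
--------------------------------------------------
PUT /gb/_mapping/tweet
{
  "properties" : {
    "tag" : {
      "type" :  "string",
      "index":  "not_analyzed"      <1>
    }
  }
}
--------------------------------------------------
<1> The new field is stored as an exact value rather than being analyzed.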

080_Structured_Search/05_term.asciidoc

+1 -1

@@ -143,7 +143,7 @@ GET /my_store/products/_search
 // SENSE: 080_Structured_Search/05_Term_text.json
 
 Except there is a little hiccup: we don't get any results back! Why is
-that? The problem isn't with the the `term` query; it is with the way
+that? The problem isn't with the `term` query; it is with the way
 the data has been indexed. ((("analyze API, using to understand tokenization"))) If we use the `analyze` API (<<analyze-api>>), we
 can see that our UPC has been tokenized into smaller tokens:
 
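A sketch of such an `analyze` call (the field name and UPC value are illustrative, and the JSON request-body form shown here is the one available on 2.x and later); it returns the tokens that were produced for that field at index time:

[source,js]
--------------------------------------------------
GET /my_store/_analyze
{
  "field": "productID",            <1>
  "text":  "XHDK-A-1293-#fJ3"
}
--------------------------------------------------
<1> Analyzes the text with whatever analyzer is mapped for this field.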

200_Language_intro/60_Mixed_language_fields.asciidoc

+1 -1

@@ -128,7 +128,7 @@ PUT /movies
 When querying the catchall `general` field, you can use
 `minimum_should_match` to reduce the number of low-quality matches. It may
 also be necessary to boost the other fields slightly more than the `general`
-field, so that matches on the the main language fields are given more weight
+field, so that matches on the main language fields are given more weight
 than those on the `general` field:
 
 [source,js]

230_Stemming/10_Algorithmic_stemmers.asciidoc

-1

@@ -87,7 +87,6 @@ documentation, which shows the following:
 --------------------------------------------------
 <1> The `keyword_marker` token filter lists words that should not be
 stemmed.((("keyword_marker token filter"))) This defaults to the empty list.
-
 <2> The `english` analyzer uses two stemmers: the `possessive_english`
 and the `english` stemmer. The ((("english stemmer")))((("possessive_english stemmer")))possessive stemmer removes `'s`
 from any words before passing them on to the `english_stop`,
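A rough sketch of rebuilding such an analyzer as a custom one (the filter names, analyzer name, and keyword list are illustrative), combining a `keyword_marker` filter with the `possessive_english` and `english` stemmers:

[source,js]
--------------------------------------------------
PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "no_stem": {
          "type":     "keyword_marker",       <1>
          "keywords": [ "skies" ]
        },
        "english_possessive": {
          "type":     "stemmer",
          "language": "possessive_english"
        },
        "english_stemmer": {
          "type":     "stemmer",
          "language": "english"
        }
      },
      "analyzer": {
        "my_english": {
          "tokenizer": "standard",
          "filter": [
            "english_possessive",
            "lowercase",
            "no_stem",                         <2>
            "english_stemmer"
          ]
        }
      }
    }
  }
}
--------------------------------------------------
<1> Words listed here are protected from stemming.
<2> Protected words must be marked before the stemmer runs.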

260_Synonyms/70_Symbol_synonyms.asciidoc

+1 -1

@@ -8,7 +8,7 @@ during tokenization.
 
 While most punctuation is seldom important for full-text search, character
 combinations like emoticons((("emoticons"))) may be very signficant, even changing the meaning
-of the the text. Compare these:
+of the text. Compare these:
 
 [role="pagebreak-before"]
 * I am thrilled to be at work on Sunday.

300_Aggregations/75_sigterms.asciidoc

+15 -4

@@ -5,11 +5,22 @@ Because the `significant_terms` aggregation((("significant_terms aggregation", "
 statistics, you need to have a certain threshold of data for it to become effective.
 That means we won't be able to index a small amount of example data for the demo.
 
-Instead, we have a pre-prepared dataset of around 80,000 documents. This is
-saved as a snapshot (for more information about snapshots and restore, see
-<<backing-up-your-cluster>>) in our public demo repository. You can "restore"
-this dataset into your cluster by using these commands:
+Instead, we prepared a dataset that contains about 80,000 documents and saved it
+as a snapshot in our public demo repository. To "restore"
+this dataset into your cluster:
 
+. Add the following setting to your `elasticsearch.yml` configuration file to
+whitelist the Elastic demo repository:
++
+[source,js]
+----
+repositories.url.allowed_urls: ["http://download.elastic.co/*"]
+----
+. Restart Elasticsearch.
+
+. Run the following snapshot commands. (For more information about using
+snapshots, see <<backing-up-your-cluster, Backing Up Your Cluster>>.)
++
 [source,js]
 ----
 PUT /_snapshot/sigterms <1>

310_Geopoints/20_Geopoints.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geopoints]]
-== Geo-Points
+== Geo Points
 
 A _geo-point_ is a single latitude/longitude point on the Earth's surface.((("geo-points"))) Geo-points
 can be used to calculate distance from a point, to determine whether a point
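A minimal sketch of putting a geo-point to work (the index, type, field name, and coordinates are illustrative): map the field explicitly as `geo_point`, then index a document with a lat/lon object:

[source,js]
--------------------------------------------------
PUT /attractions
{
  "mappings": {
    "restaurant": {
      "properties": {
        "location": {
          "type": "geo_point"       <1>
        }
      }
    }
  }
}

PUT /attractions/restaurant/1
{
  "name":     "Mini Munchies Pizza",
  "location": {
    "lat":  40.715,
    "lon": -74.011
  }
}
--------------------------------------------------
<1> geo_point fields are not picked up by dynamic mapping; they must be mapped explicitly.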

310_Geopoints/30_Filter_by_geopoint.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[filter-by-geopoint]]
-=== Filtering by Geo-Point
+=== Filtering by Geo Point
 
 Four geo-point filters ((("geo-points", "filtering by")))((("filtering", "by geo-points")))can be used to include or exclude documents by
 geolocation:

310_Geopoints/32_Bounding_box.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-bounding-box]]
-=== geo_bounding_box Filter
+=== Geo Bounding Box Filter
 
 This is by far the most efficient geo-filter because its calculation is very
 simple. ((("geo_bounding_box filter")))((("filtering", "by geo-points", "geo_bounding_box filter"))) You provide it with the `top`, `bottom`, `left`, and `right`
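A sketch of the filter in use (the index, field name, and coordinates are illustrative), wrapped in a `constant_score` query so it runs purely as a filter; the four edges can also be grouped into `top_left` and `bottom_right` corners, as shown here:

[source,js]
--------------------------------------------------
GET /attractions/restaurant/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "geo_bounding_box": {
          "location": {                 <1>
            "top_left": {
              "lat":  40.8,
              "lon": -74.0
            },
            "bottom_right": {
              "lat":  40.7,
              "lon": -73.0
            }
          }
        }
      }
    }
  }
}
--------------------------------------------------
<1> The geo_point field being filtered.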

310_Geopoints/34_Geo_distance.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-distance]]
-=== geo_distance Filter
+=== Geo Distance Filter
 
 The `geo_distance` filter draws a circle around the specified location and
 finds all documents((("geo_distance filter")))((("filtering", "by geo-points", "geo_distance filter"))) that have a geo-point within that circle:
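A sketch of the filter in use (the index, field name, radius, and coordinates are illustrative):

[source,js]
--------------------------------------------------
GET /attractions/restaurant/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "geo_distance": {
          "distance": "1km",           <1>
          "location": {                <2>
            "lat":  40.715,
            "lon": -73.988
          }
        }
      }
    }
  }
}
--------------------------------------------------
<1> The radius of the circle.
<2> The center point; the key is the name of the geo_point field.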

310_Geopoints/36_Caching_geofilters.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-caching]]
-=== Caching geo-filters
+=== Caching Geo Filters
 
 The results of geo-filters are not cached by default,((("caching", "of geo-filters")))((("filters", "caching geo-filters")))((("geo-filters, caching"))) for two reasons:

320_Geohashes/60_Geohash_cell_filter.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geohash-cell-filter]]
-=== geohash_cell Filter
+=== Geohash Cell Filter
 
 The `geohash_cell` filter simply translates a `lat/lon` location((("geohash_cell filter")))((("filters", "geohash_cell"))) into a
 geohash with the specified precision and finds all locations that contain
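A sketch of the filter in use (the index, field name, coordinates, and precision are illustrative):

[source,js]
--------------------------------------------------
GET /attractions/restaurant/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "geohash_cell": {
          "location": {
            "lat":  40.718,
            "lon": -73.983
          },
          "precision": "2km"           <1>
        }
      }
    }
  }
}
--------------------------------------------------
<1> Precision may be expressed as a distance or as a geohash length.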

330_Geo_aggs/60_Geo_aggs.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-aggs]]
-== Geo-aggregations
+== Geo Aggregations
 
 Although filtering or scoring results by geolocation is useful,((("geo-aggregations")))((("aggregations", "geo"))) it is often more
 useful to be able to present information to the user on a map. A search may

330_Geo_aggs/62_Geo_distance_agg.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-distance-agg]]
-=== geo_distance Aggregation
+=== Geo Distance Aggregation
 
 The `geo_distance` agg is useful((("geo_distance aggregation")))((("aggregations", "geo_distance"))) for searches such as
 to "find all pizza restaurants within 1km of me." The search results
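A sketch of such an aggregation (the index, field name, origin point, and ranges are illustrative):

[source,js]
--------------------------------------------------
GET /attractions/restaurant/_search
{
  "size": 0,
  "aggs": {
    "per_ring": {
      "geo_distance": {
        "field":  "location",
        "origin": {                    <1>
          "lat":  40.712,
          "lon": -73.988
        },
        "unit":   "km",
        "ranges": [
          { "from": 0, "to": 1 },      <2>
          { "from": 1, "to": 2 }
        ]
      }
    }
  }
}
--------------------------------------------------
<1> The point to measure distances from (typically the user's location).
<2> Each range becomes a bucket, for example restaurants within 1km.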

330_Geo_aggs/64_Geohash_grid_agg.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geohash-grid-agg]]
-=== geohash_grid Aggregation
+=== Geohash Grid Aggregation
 
 The number of results returned by a query may be far too many to display each
 geo-point individually on a map.((("geohash_grid aggregation")))((("aggregations", "geohash_grid"))) The `geohash_grid` aggregation buckets nearby
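A sketch of the aggregation (the index, field name, and precision are illustrative):

[source,js]
--------------------------------------------------
GET /attractions/restaurant/_search
{
  "size": 0,
  "aggs": {
    "new_york_grid": {
      "geohash_grid": {
        "field":     "location",
        "precision": 5                 <1>
      }
    }
  }
}
--------------------------------------------------
<1> The geohash length: higher values mean smaller, more precise cells, and many more buckets.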

330_Geo_aggs/66_Geo_bounds_agg.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-bounds-agg]]
-=== geo_bounds Aggregation
+=== Geo Bounds Aggregation
 
 In our <<geohash-grid-agg,previous example>>, we filtered our results by using a
 bounding box that covered the greater New York area.((("aggregations", "geo_bounds")))((("geo_bounds aggregation"))) However, our results
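A sketch of the aggregation (the index and field name are illustrative):

[source,js]
--------------------------------------------------
GET /attractions/restaurant/_search
{
  "size": 0,
  "aggs": {
    "map_zoom": {
      "geo_bounds": {
        "field": "location"            <1>
      }
    }
  }
}
--------------------------------------------------
<1> Returns the smallest bounding box that encloses all matching geo-points,
    which can be used to set the map viewport.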

340_Geoshapes/70_Geoshapes.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-shapes]]
-== Geo-shapes
+== Geo Shapes
 
 Geo-shapes use a completely different approach than geo-points.((("geo-shapes"))) A circle on a
 computer screen does not consist of a perfect continuous line. Instead it is

340_Geoshapes/72_Mapping_geo_shapes.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[mapping-geo-shapes]]
-=== Mapping geo-shapes
+=== Mapping Geo Shapes
 
 Like fields of type `geo_point`, geo-shapes((("mapping (types)", "geo-shapes")))((("geo-shapes", "mapping"))) have to be mapped explicitly
 before they can be used:
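A minimal sketch of such a mapping (the index, type, and field names are illustrative):

[source,js]
--------------------------------------------------
PUT /attractions
{
  "mappings": {
    "landmark": {
      "properties": {
        "location": {
          "type": "geo_shape"          <1>
        }
      }
    }
  }
}
--------------------------------------------------
<1> Unlike geo_point fields, geo_shape fields can hold points, lines, polygons, and other GeoJSON geometries.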

340_Geoshapes/74_Indexing_geo_shapes.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[indexing-geo-shapes]]
-=== Indexing geo-shapes
+=== Indexing Geo Shapes
 
 Shapes are represented using http://geojson.org/[GeoJSON], a simple open
 standard for encoding two-dimensional shapes in JSON.((("JSON", "shapes in (GeoJSON)")))((("shapes", see="geo-shapes")))((("GeoJSON")))((("geo-shapes", "indexing"))) Each shape definition
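A minimal sketch of indexing a GeoJSON point into a `geo_shape` field (the index, type, ID, and coordinates are illustrative):

[source,js]
--------------------------------------------------
PUT /attractions/landmark/dam_square
{
  "name":     "Dam Square, Amsterdam",
  "location": {
    "type":        "point",                  <1>
    "coordinates": [ 4.892, 52.373 ]         <2>
  }
}
--------------------------------------------------
<1> The GeoJSON shape type; other types include linestring, polygon, and envelope.
<2> GeoJSON coordinates are written longitude first, then latitude, the opposite of the usual lat/lon order.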

340_Geoshapes/76_Querying_geo_shapes.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[querying-geo-shapes]]
-=== Querying geo-shapes
+=== Querying Geo Shapes
 
 The unusual thing ((("geo-shapes", "querying")))about the {ref}/query-dsl-geo-shape-query.html[`geo_shape` query] is that it allows us to query and filter using shapes, rather than just points.
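A sketch of the query with an inline `envelope` shape (the index, field name, and coordinates are illustrative):

[source,js]
--------------------------------------------------
GET /attractions/landmark/_search
{
  "query": {
    "geo_shape": {
      "location": {                                        <1>
        "shape": {                                         <2>
          "type":        "envelope",
          "coordinates": [ [ 4.88, 52.38 ], [ 4.91, 52.36 ] ]
        }
      }
    }
  }
}
--------------------------------------------------
<1> The geo_shape field to query.
<2> An inline query shape; an envelope is given as its top-left and bottom-right corners, longitude first.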

340_Geoshapes/80_Caching_geo_shapes.asciidoc

+1 -1

@@ -1,5 +1,5 @@
 [[geo-shape-caching]]
-=== Geo-shape Filters and Caching
+=== Geo Shape Filters and Caching
 
 The `geo_shape` query and filter perform the same function.((("caching", "geo-shape filters and")))((("filters", "geo_shape")))((("geo-shapes", "geo_shape filters, caching and"))) The query simply
 acts as a filter: any matching documents receive a relevance `_score` of
