Commit 4c07a01

Extending the glossary (#7732)
* added align, broadcast,merge, concatenate, combine
* examples added
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* changes made
* add changes
* .
* .
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* changes done
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>
* Update doc/user-guide/terminology.rst Co-authored-by: Tom Nicholas <[email protected]>

---------
Co-authored-by: Tom Nicholas <[email protected]>
1 parent ab096b0 commit 4c07a01

2 files changed: +128 -0 lines changed
doc/user-guide/terminology.rst

Lines changed: 125 additions & 0 deletions
@@ -131,3 +131,128 @@ complete examples, please consult the relevant documentation.*
        ``__array_ufunc__`` and ``__array_function__`` protocols are also required.

        __ https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html

    .. ipython:: python
        :suppress:

        import numpy as np
        import xarray as xr

    Aligning
        Aligning refers to the process of ensuring that two or more DataArrays or Datasets
        have the same dimensions and coordinates, so that they can be combined or compared properly.

        .. ipython:: python

            # x and y share the same "lon" labels but differ in "lat" (40.0 vs 42.0)
            x = xr.DataArray(
                [[25, 35], [10, 24]],
                dims=("lat", "lon"),
                coords={"lat": [35.0, 40.0], "lon": [100.0, 120.0]},
            )
            y = xr.DataArray(
                [[20, 5], [7, 13]],
                dims=("lat", "lon"),
                coords={"lat": [35.0, 42.0], "lon": [100.0, 120.0]},
            )
            x
            y
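The example above only constructs two arrays whose ``lat`` labels differ; it does not align them. A minimal sketch of the alignment step, assuming ``xr.align`` with its ``join`` argument is the intended entry point (the variable names ``x_inner``, ``x_outer`` and so on are illustrative only):

```python
import xarray as xr

x = xr.DataArray(
    [[25, 35], [10, 24]],
    dims=("lat", "lon"),
    coords={"lat": [35.0, 40.0], "lon": [100.0, 120.0]},
)
y = xr.DataArray(
    [[20, 5], [7, 13]],
    dims=("lat", "lon"),
    coords={"lat": [35.0, 42.0], "lon": [100.0, 120.0]},
)

# join="inner" keeps only the labels present in both arrays (lat=35.0 here)
x_inner, y_inner = xr.align(x, y, join="inner")

# join="outer" keeps the union of labels and fills the gaps with NaN
x_outer, y_outer = xr.align(x, y, join="outer")

print(x_inner.lat.values)
print(x_outer.lat.values)
```

After either call the returned objects share identical coordinates, so element-wise operations between them line up label by label.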

    Broadcasting
        Broadcasting is a technique that allows operations to be performed on arrays with different
        shapes and dimensions. When an operation involves such arrays, xarray automatically attempts
        to broadcast them to a common shape, matching dimensions by name, before the operation is applied.

        .. ipython:: python

            # 'a' has shape (3,) and 'b' has shape (4,)
            a = xr.DataArray(np.array([1, 2, 3]), dims=["x"])
            b = xr.DataArray(np.array([4, 5, 6, 7]), dims=["y"])

            # the result is a 2D array with dims ("x", "y") and shape (3, 4)
            a + b

    Merging
        Merging is used to combine two or more Datasets or DataArrays that have different variables or
        coordinates along the same dimensions. When merging, xarray aligns the variables and coordinates
        of the different datasets along the specified dimensions and creates a new ``Dataset`` containing
        all the variables and coordinates.

        .. ipython:: python

            # create two named 1D arrays whose "x" labels only partly overlap
            arr1 = xr.DataArray(
                [1, 2, 3], dims=["x"], coords={"x": [10, 20, 30]}, name="arr1"
            )
            arr2 = xr.DataArray(
                [4, 5, 6], dims=["x"], coords={"x": [20, 30, 40]}, name="arr2"
            )

            # build a new dataset holding both arrays, aligned on "x"
            merged_ds = xr.Dataset({"arr1": arr1, "arr2": arr2})
            merged_ds
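The example above builds the result with the ``Dataset`` constructor. As a hedged alternative, the same arrays can be passed to ``xr.merge``; the sketch below states ``join="outer"`` explicitly as an assumption about the desired behaviour (keep every ``x`` label and fill the gaps with NaN) rather than relying on a default.

```python
import xarray as xr

arr1 = xr.DataArray(
    [1, 2, 3], dims=["x"], coords={"x": [10, 20, 30]}, name="arr1"
)
arr2 = xr.DataArray(
    [4, 5, 6], dims=["x"], coords={"x": [20, 30, 40]}, name="arr2"
)

# Each named DataArray becomes one variable of the merged Dataset.
# join="outer" keeps the union of the "x" labels.
merged = xr.merge([arr1, arr2], join="outer")
print(merged)
```

Here ``arr1`` has no value at ``x=40`` and ``arr2`` has none at ``x=10``, so those positions are filled with NaN.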

    Concatenating
        Concatenating is used to combine two or more Datasets or DataArrays along a dimension. When
        concatenating, xarray arranges the Datasets or DataArrays along a new or existing dimension, and
        the resulting ``Dataset`` or ``DataArray`` will have the same variables and coordinates along
        the other dimensions.

        .. ipython:: python

            a = xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))
            b = xr.DataArray([[5, 6], [7, 8]], dims=("x", "y"))
            # concatenate along a new dimension named "c"
            c = xr.concat([a, b], dim="c")
            c
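The example above concatenates along a new dimension. For contrast, here is a small sketch that also concatenates along the existing ``x`` dimension; the names ``stacked`` and ``extended`` are illustrative only.

```python
import xarray as xr

a = xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))
b = xr.DataArray([[5, 6], [7, 8]], dims=("x", "y"))

# New dimension "c": the inputs are stacked, adding one axis of length 2
stacked = xr.concat([a, b], dim="c")

# Existing dimension "x": the inputs are joined end to end along "x"
extended = xr.concat([a, b], dim="x")

print(stacked.sizes)
print(extended.sizes)
```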

    Combining
        Combining is the process of arranging two or more DataArrays or Datasets into a single ``DataArray`` or
        ``Dataset`` using some combination of merging and concatenation operations.

        .. ipython:: python

            ds1 = xr.Dataset(
                {"data": xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))},
                coords={"x": [1, 2], "y": [3, 4]},
            )
            ds2 = xr.Dataset(
                {"data": xr.DataArray([[5, 6], [7, 8]], dims=("x", "y"))},
                coords={"x": [2, 3], "y": [4, 5]},
            )

            # combine the datasets
            combined_ds = xr.combine_by_coords([ds1, ds2])
            combined_ds

    lazy
        Lazily-evaluated operations do not load data into memory until necessary. Instead of doing calculations
        right away, xarray lets you plan what calculations you want to do, like finding the
        average temperature in a dataset. This planning is called "lazy evaluation". Later, when
        you are ready to see the final result, you tell xarray to carry out those calculations.
        That is when xarray works through the steps you planned and gives you the answer you wanted. This
        lazy approach helps save time and memory because xarray only does the work when you actually need the
        results.

    labeled
        Labeled data has metadata describing the context of the data, not just the raw data values.
        This contextual information can be labels for array axes (i.e. dimension names), tick labels
        along axes (stored as Coordinate variables), or unique names for each array. These labels
        provide context and meaning to the data, making it easier to understand and work with. For example,
        if you have temperature data for different cities over time, you can use xarray to label the
        dimensions: one for cities and another for time.
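As an illustration of the cities-over-time case, the following sketch labels a small array with dimension names, coordinate labels and an array name. The city names, dates and temperature values are made up for the example.

```python
import numpy as np
import xarray as xr

temps = xr.DataArray(
    np.array([[21.5, 23.0, 19.8], [30.1, 31.4, 29.9]]),
    dims=("city", "time"),  # dimension names label the axes
    coords={
        # coordinate variables hold the tick labels along each axis
        "city": ["Oslo", "Cairo"],
        "time": np.array(
            ["2023-07-01", "2023-07-02", "2023-07-03"], dtype="datetime64[ns]"
        ),
    },
    name="temperature",  # a name for the array itself
)

# Operations can refer to a dimension by name instead of by axis number
mean_per_city = temps.mean(dim="time")
print(mean_per_city)
```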

    serialization
        Serialization is the process of converting your data into a format that makes it easy to save and share.
        When you serialize data in xarray, you take the data values (for example, temperature measurements),
        along with their labels and other information, and turn them into a format that can be stored in a
        file or sent over the internet. xarray objects can be serialized into formats which store the labels
        alongside the data. Some supported serialization formats are files that can then be stored or
        transferred (e.g. netCDF), whilst others are stores that can also be read and written over a
        network (e.g. Zarr).
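Writing netCDF or Zarr needs optional backend packages, so the sketch below uses a plain dictionary and JSON round trip instead. It is only meant to show that labels and attributes travel with the data; it assumes ``Dataset.to_dict`` and ``Dataset.from_dict`` are acceptable stand-ins for a file format, and the dataset contents are made up.

```python
import json

import xarray as xr

ds = xr.Dataset(
    {"temperature": (("city",), [21.5, 30.1])},
    coords={"city": ["Oslo", "Cairo"]},
    attrs={"units": "degC"},
)

# Serialize: values, dimension names, coordinates and attributes become text
text = json.dumps(ds.to_dict())

# Deserialize: rebuild an equivalent Dataset from that text
restored = xr.Dataset.from_dict(json.loads(text))
print(restored)

# With a netCDF backend installed, ds.to_netcdf("temps.nc") would instead
# write a file ("temps.nc" is a hypothetical path).
```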

    indexing
        :ref:`Indexing` is how you select subsets of your data which you are interested in.

        - Label-based Indexing: Selecting data by passing a specific label and comparing it to the labels
          stored in the associated coordinates. You can use labels to specify what you want, like "Give me the
          temperature for New York on July 15th."

        - Positional Indexing: You can use numbers to refer to positions in the data, like "Give me the third
          temperature value." This is useful when you know the order of your data but don't need to remember
          the exact labels.

        - Slicing: You can take a "slice" of your data; for example, you might want all temperatures from
          July 1st to July 10th. xarray supports slicing for both positional and label-based indexing.
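The three kinds of indexing listed above can be sketched with ``sel`` (labels) and ``isel`` (positions). The cities, dates and values below are made up for illustration.

```python
import numpy as np
import xarray as xr

temps = xr.DataArray(
    np.array([[28.0, 30.5, 29.1], [25.2, 27.8, 26.4]]),
    dims=("city", "time"),
    coords={
        "city": ["New York", "Boston"],
        "time": np.array(
            ["2023-07-14", "2023-07-15", "2023-07-16"], dtype="datetime64[ns]"
        ),
    },
)

# Label-based: "the temperature for New York on July 15th"
by_label = temps.sel(city="New York", time=np.datetime64("2023-07-15"))

# Positional: "the third temperature value" for the first city
by_position = temps.isel(city=0, time=2)

# Label-based slice: both end labels are included
label_slice = temps.sel(time=slice("2023-07-14", "2023-07-15"))

# Positional slice: the stop position is excluded, as in plain Python
position_slice = temps.isel(time=slice(0, 2))

print(by_label.item(), by_position.item())
```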

doc/whats-new.rst

Lines changed: 3 additions & 0 deletions
@@ -97,6 +97,9 @@ Documentation
  (:pull:`7999`) By `Tom Nicholas <https://github.com/TomNicholas>`_.
- Fixed broken links in "See also" section of :py:meth:`Dataset.count` (:issue:`8055`, :pull:`8057`)
  By `Articoking <https://github.com/Articoking>`_.
- Extended the glossary by adding terms Aligning, Broadcasting, Merging, Concatenating, Combining, lazy,
  labeled, serialization, indexing (:issue:`3355`, :pull:`7732`)
  By `Harshitha <https://github.com/harshitha1201>`_.

Internal Changes
~~~~~~~~~~~~~~~~
