docs/summary3.rst
31 additions & 10 deletions
@@ -6,8 +6,12 @@ Summary day 3
6
6
7
7
.. keypoints::
8
8
9
+
- Intro to matplotlib
10
+
- Matplotlib is the essential Python data visualization package, with nearly 40 different plot types to choose from depending on the shape of your data and which qualities you want to highlight.
11
+
- Almost every plot will start by instantiating the figure, ``fig`` (the blank canvas), and 1 or more axes objects, ``ax``, with ``fig, ax = plt.subplots(*args, **kwargs)``.
12
+
- There are several ways to tile subplots depending on how many there are, how they are shaped, and whether they require non-Cartesian coordinate systems.
13
+
- Most of the plotting and formatting commands you will use are methods of ``Axes`` objects. (A few, like ``colorbar``, are methods of the ``Figure``, and some commands are methods of both.)
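A minimal sketch of the figure/axes pattern described in the matplotlib points above. The data, titles, and figure size are made up for illustration, and the non-interactive ``Agg`` backend is an assumption so that the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (not from the course material)
x = np.linspace(0, 2 * np.pi, 50)

# One Figure (the blank canvas) holding a 1x2 grid of Axes objects
fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# Plotting and formatting commands are methods of each Axes
axes[0].plot(x, np.sin(x))
axes[0].set_title("sin(x)")
axes[0].set_xlabel("x")

# The colorbar is added through the Figure, not the Axes
image = axes[1].imshow(np.outer(np.sin(x), np.cos(x)))
axes[1].set_title("outer product")
fig.colorbar(image, ax=axes[1])
```

Note that ``fig.colorbar`` creates an extra ``Axes`` for the colorbar itself, so the figure ends up holding three axes even though only two were requested.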
9
14
- Intro to Pandas
10
-
11
15
- Lets you construct list- or table-like data structures with mixed data types, the contents of which can be indexed by arbitrary row and column labels
12
16
- The main data structures are Series (1D) and DataFrames (2D). Each column of a DataFrame is a Series
13
17
@@ -16,14 +20,31 @@ Summary day 3
16
20
17
21
- Seaborn plotting functions take in a Pandas DataFrame, sometimes the names of variables in the DataFrame to extract as x and y, and often a hue that makes different subsets of the data appear in different colors depending on the value of the given categorical variable.
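As a hedged illustration of the ``hue`` argument mentioned above, the sketch below builds a small made-up DataFrame (the column names ``height``, ``weight``, and ``species`` are hypothetical) and passes it to ``sns.scatterplot``. The ``Agg`` backend is assumed so that no display is needed:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data; column names are hypothetical
df = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "weight": [2.1, 3.9, 6.2, 8.1, 9.8, 12.3],
    "species": ["a", "a", "a", "b", "b", "b"],
})

fig, ax = plt.subplots()

# x and y name DataFrame columns; hue colors the points by the
# categorical "species" column and adds a matching legend
sns.scatterplot(data=df, x="height", y="weight", hue="species", ax=ax)

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```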
18
22
19
-
- Batch mode
20
-
- The SLURM scheduler handles allocations to the calculation nodes
21
-
- Batch jobs run without user interaction
22
-
- A batch script consists of a part with *SLURM parameters* describing the allocation and a second part describing the actual work within the job, for instance one or several Python scripts.
23
-
- Remember to include possible input arguments to the Python script in the batch script.
24
-
25
23
- Big data
24
+
- Allocate more RAM by asking for
25
+
26
+
- Several cores
27
+
- Nodes with more RAM
28
+
- Check job memory usage with ``sacct`` or ``sstat``. Check your documentation!
29
+
- File formats
30
+
31
+
- No format fits all requirements
32
+
- HDF5 and NetCDF are good for big data since they allow loading parts of a file into memory
33
+
- Store temporary data in local scratch ($SNIC_TMP).
34
+
- Packages
26
35
27
-
- allocate resources sufficient to data size
28
-
- decide on useful file formats
29
-
- use data-chunking as technique
36
+
- xarray
37
+
38
+
- can deal with 3D-data and higher dimensions
39
+
- Dask
40
+
41
+
- uses lazy execution
42
+
- Only use for processing very large amounts of data
43
+
- Chunking: Data source → Format choice → Load/Chunk → Process → Write
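The chunking pipeline above (data source → load/chunk → process → write) can be sketched with the standard library alone. This is only an illustration under stated assumptions: the in-memory CSV stands in for a large file, and ``iter_chunks`` is a hypothetical helper. In practice one would more likely use ``pandas.read_csv(..., chunksize=...)``, Dask, or chunked HDF5/NetCDF reads.

```python
import csv
import io
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from an iterator of rows."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Data source: a small in-memory CSV stands in for a file too big for RAM
source = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))
reader = csv.DictReader(source)

# Output: one summary row per chunk
output = io.StringIO()
writer = csv.writer(output, lineterminator="\n")
writer.writerow(["chunk", "sum"])

total = 0
for index, chunk in enumerate(iter_chunks(reader, chunk_size=4)):
    # Process: only the current chunk is held in memory
    chunk_sum = sum(int(row["value"]) for row in chunk)
    total += chunk_sum
    # Write: store the per-chunk result before loading the next chunk
    writer.writerow([index, chunk_sum])
```

Only one chunk of rows is in memory at a time, so peak memory depends on ``chunk_size`` rather than on the size of the whole data set.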
44
+
45
+
- Batch mode
46
+
- The SLURM scheduler handles allocations to the calculation nodes
47
+
- Batch jobs run without user interaction
48
+
- A batch script consists of a part with *SLURM parameters* describing the allocation and a second part describing the actual work within the job, for instance one or several Python scripts.
49
+
- Remember to include possible input arguments to the Python script in the batch script.
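To illustrate the last point, here is a minimal sketch of a Python script that takes its input from the command line, so the arguments can be written into the batch script. The names ``my_script.py``, ``data.csv``, and ``--repeats`` are hypothetical; in a batch script the corresponding line might read ``python my_script.py data.csv --repeats 3``.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments; argv=None means read sys.argv."""
    parser = argparse.ArgumentParser(description="Example batch workload")
    # Hypothetical arguments, chosen only for illustration
    parser.add_argument("input_file", help="path to the input data")
    parser.add_argument("--repeats", type=int, default=1,
                        help="how many times to process the input")
    return parser.parse_args(argv)


# Passing the list explicitly mimics what the batch script would supply
args = parse_args(["data.csv", "--repeats", "3"])
message = f"Processing {args.input_file} {args.repeats} time(s)"
print(message)
```

Because a batch job runs without user interaction, every value the script needs has to arrive this way (or through files or environment variables) rather than through prompts.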