Merge pull request #36 from MetOffice/develop

Develop
MetOffice · Dec 4, 2020 · d56cd08 · d56cd08
2 parents b0a9722 + 9d66188
commit d56cd08
Showing 12 changed files with 19,256 additions and 4 deletions.
diff --git a/LICENCE b/LICENCE
@@ -0,0 +1,29 @@
+BSD 3-Clause Licence
+
+Copyright (c) 2020, Met Office
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its
+   contributors may be used to endorse or promote products derived from
+   this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/README.md b/README.md
@@ -39,23 +39,32 @@ Worksheet | Aims
 [5](notebooks/worksheet5.ipynb) | <li>Have an appreciation for working with daily model data</li><li>Understand how to calculate some useful climate extremes statistics</li><li>Be aware of some coding strategies for dealing with large data sets</li></ul>  
 [6](notebooks/worksheet6.ipynb) | An extended coding exercise designed to allow you to put everything you've learned into practice  
 
+Additional tutorials specific to the CSSP 20th Century reanalysis datasets:
+
+Worksheet | Aims
+:----: | -----------
+[CSSP 1](notebooks/CSSP_20CRDS_Tutorials/Introduction.ipynb) | <li>How to use a cloud-based platform to analyse the 20CR-DS dataset</li><li>Setting up a Python environment</li>
+[CSSP 2](notebooks/CSSP_20CRDS_Tutorials/tutorial_1_data_access.ipynb) | <li>How to load data into xarray format</li><li>How to convert xarray data into Iris cube format</li><li>How to perform basic cube operations</li>
+[CSSP 3](notebooks/CSSP_20CRDS_Tutorials/tutorial_3_basic_analysis.ipynb) | <li>Calculate and visualise annual and monthly means</li><li>Calculate and visualise seasonal means</li><li>Calculate mean differences (anomalies)</li>
+[CSSP 4](notebooks/CSSP_20CRDS_Tutorials/tutorial_4_advance_analysis.ipynb) | <li>Calculate frequency of wet days</li><li>Calculate percentiles</li><li>Calculate some useful climate extremes statistics</li>
+
 Three additional worksheets are available for use by workshop instructors:
 
 * `makedata.ipynb`: Provides scripts for preparing raw model output for use in notebook exercises.
 * `worksheet_solutions.ipynb`: Solutions to worksheet exercises.
-* `worksheet6example.ipynb`: Example code for Worksheet 6.
+* `worksheet6example.ipynb`: Example code for Worksheet 6. 
 
 ## Data
-The data used in the worksheets is currently only available within the Met Office.  See the `data/README` for further details. 
+The data used in the worksheets is currently only available within the Met Office.  See the `data/README` for further details.
 
 ## Contributing
 Information on how to contribute can be found in the [Contributing guide](CONTRIBUTING.md).
 Please also consult the `CONTRIBUTING.ipynb` for information on formatting the worksheets in Jupyter Notebooks.  **Note** that we do not currently make use of Jupyter Lab as it doesn't currently support the types of html formatting we use in Jupyter Notebooks.
 
 ## Licence
-PyPRECIS is **not** currently licenced for use outside of the Met Office.
+PyPRECIS is licensed under the BSD 3-Clause Licence for use outside of the Met Office.
 
 <h5 align="center">
 <img src="notebooks/img/MO_MASTER_black_mono_for_light_backg_RBG.png" width="200" alt="Met Office"> <br>
-&copy; British Crown Copyright 2018 - 2019, Met Office
+&copy; British Crown Copyright 2018 - 2020, Met Office
 </h5>
diff --git a/notebooks/CSSP_20CRDS_Tutorials/Introduction.ipynb b/notebooks/CSSP_20CRDS_Tutorials/Introduction.ipynb
@@ -0,0 +1,329 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# CSSP 20CR dataset - Tutorials"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Contents:\n",
+    "1. [Introduction](#introduction)\n",
+    "2. [Learning objectives](#objectives)\n",
+    "3. [Description of the tutorial dataset](#dataset)\n",
+    "4. [Jupyter notebook](#notebook)\n",
+    "5. [Data format and python libraries](#libs)\n",
+    "6. [How to create an environment](#env)\n",
+    "7. [Resources](#resources)\n",
+    "\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "___"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Introduction<a id='introduction'></a>\n",
+    "\n",
+    "This short course is an introductory set of tutorials on accessing a large (~3 TB) dataset hosted on a cloud server. By putting the data and the computing resources in the same place, users no longer have to spend time downloading data, finding local storage for it and managing the software needed to analyse it. These notebooks explain how to use this cloud-based platform to analyse the 20CR-DS dataset.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "___"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Learning objectives<a id='objectives'></a>\n",
+    "\n",
+    "The high level learning objectives for these tutorials are:\n",
+    "- To access and explore variables of interest\n",
+    "- To convert data into different formats (xarray and Iris) \n",
+    "- To prepare data for analysis\n",
+    "- To carry out basic analyses\n",
+    "- To carry out advanced analysis\n",
+    "- To visualise the results  \n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Description of the tutorial dataset<a id='dataset'></a>\n",
+    "\n",
+    "A climate reanalysis gives a numerical description of the recent climate, produced by combining models with observations. The Twentieth Century Reanalysis Project (20CR-V2c) is a global reanalysis carried out by the National Oceanic and Atmospheric Administration (NOAA). The outputs from this dataset include temperature, pressure, winds, moisture, solar radiation and clouds, from the surface to the top of the atmosphere, as far back as the mid-1800s. More information is available from [climate-reanalysis](https://www.ecmwf.int/en/research/climate-reanalysis) and [20CR-V2c](https://www.esrl.noaa.gov/psd/data/gridded/data.20thC_ReanV2c.html).\n",
+    "\n",
+    "At the UK Met Office we have increased the resolution of the 20CR-V2c reanalysis dataset using a process known as dynamical downscaling. The downscaled dataset covers the whole of China for the period 1851 to 2010 at a horizontal resolution of 25 km [(Amato et al., 2019)](https://doi.org/10.1175/JAMC-D-19-0083.1); see also https://zenodo.org/record/2558135#.XJj2uaD7RWE. This work was funded through the Climate Science for Service Partnership China (CSSP-China) project.\n",
+    "\n",
+    "The Climate Science for Service Partnership China (CSSP China) is a scientific research project that is building the basis for services to support climate and weather resilient economic development and social welfare through strong, strategic partnerships harnessing UK scientific expertise. Through CSSP China (supported by the Newton Fund and the Department for Business, Energy & Industrial Strategy (BEIS) UK-China Research Innovation Partnership Fund) we are developing a strongly bilateral partnership between the Met Office, the China Meteorological Administration (CMA), the Institute of Atmospheric Physics (IAP) at the Chinese Academy of Sciences, and other key institutes within China and the UK. See the [CSSP-China](https://www.metoffice.gov.uk/research/approach/collaboration/newton/cssp-china/index) page for more information.\n",
+    "\n",
+    "The dataset used in these tutorials includes data at monthly, daily, 6-hourly, 3-hourly and hourly frequencies for the historical period 1851-2010. Details of the variables and frequencies can be found in the [supplementary material](variableslist.pdf). \n",
+    "\n",
+    "The data resides at **/data/share/cssp-data/ZARRSTORE/**\n",
+    "\n",
+    "\n",
+    "<figure>\n",
+    "  <img src=\"images/region.PNG\" alt=\"Downscaled domain of 20CR datasets\" style=\"width:60%\">\n",
+    "  <figcaption>Figure: Downscaled domain of 20CR datasets (Amato et al., 2019)</figcaption>\n",
+    "</figure>\n",
+    "\n",
+    "\n",
+    "The area of interest is divided into seven subregions, shown in the figure, which are considered for analysis ([Burke and Stott, 2017](https://journals.ametsoc.org/jcli/article/30/14/5205/97096/Impact-of-Anthropogenic-Climate-Change-on-the-East)). The coordinates of these seven regions are: \n",
+    "\n",
+    "\n",
+    "\n",
+    "North Central (NC): 104°–113°E, 32°–39°N\n",
+    "\n",
+    "North East Coast (NEC): 113°–122°E, 32°–39°N\n",
+    "\n",
+    "North East (NE): 113°–131°E, 39°–44°N\n",
+    "\n",
+    "Tibetan Plateau (TP): 77°–104°E, 26°–36°N\n",
+    "\n",
+    "South Central (SC): 104°–113°E, 26°–32°N\n",
+    "\n",
+    "South East Coast (SEC): 113°–122°E, 26°–32°N\n",
+    "\n",
+    "South East (SE): 107°–120°E, 21°–26°N"
+   ]
+  },
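+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The subregion bounds listed above can also be written down as a small lookup table. The cell below is a minimal sketch and not part of the original tutorial code: the `REGIONS` dictionary and the `regions_containing` helper are hypothetical names introduced here, with the bounds copied from the list above. Because neighbouring boxes share edges, a point on a boundary can match more than one region, so the helper returns a list."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Hypothetical lookup table: (lon_min, lon_max, lat_min, lat_max) in degrees\n",
+    "REGIONS = {\n",
+    "    'NC': (104, 113, 32, 39),\n",
+    "    'NEC': (113, 122, 32, 39),\n",
+    "    'NE': (113, 131, 39, 44),\n",
+    "    'TP': (77, 104, 26, 36),\n",
+    "    'SC': (104, 113, 26, 32),\n",
+    "    'SEC': (113, 122, 26, 32),\n",
+    "    'SE': (107, 120, 21, 26),\n",
+    "}\n",
+    "\n",
+    "\n",
+    "def regions_containing(lon, lat):\n",
+    "    # Return the codes of all regions whose box contains the point (edges inclusive)\n",
+    "    return [\n",
+    "        code\n",
+    "        for code, (lon_min, lon_max, lat_min, lat_max) in REGIONS.items()\n",
+    "        if lon_min <= lon <= lon_max and lat_min <= lat <= lat_max\n",
+    "    ]\n",
+    "\n",
+    "\n",
+    "# Example: look up an arbitrary point given as (longitude, latitude)\n",
+    "print(regions_containing(110.0, 35.0))"
+   ]
+  },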
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "___"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Jupyter notebook<a id='notebook'></a>\n",
+    "Jupyter is an open-source platform that contains a suite of tools including Jupyter Notebook: a browser-based application that allows you to create and share documents (i.e. Jupyter Notebook files such as the notebook you are reading right now!). These notebooks can contain live code, equations, visualizations and narrative text. It is an Integrated Development Environment (IDE) that allows you to write code, navigate files on the system, inspect variables and more. The Jupyter Notebook file format (.ipynb) allows you to combine descriptive text, code blocks and code output in a single file. You can then share the notebook itself with anyone who might want to run it, or convert it to PDF or HTML so that it can be viewed like a report.\n",
+    "\n",
+    "##### How to run Jupyter Notebook\n",
+    "A Jupyter Notebook file (.ipynb) has three main parts, which are highlighted in the image below:\n",
+    "\n",
+    "- Menu bar\n",
+    "- Toolbar\n",
+    "- Cells\n",
+    "\n",
+    "Cells can hold either documentation text written in Markdown or programming code such as Python. Text written using the Markdown syntax is rendered when the cell type is set to Markdown. To run code (e.g. Python), set the cell type to Code, write your code, and then either click the run button in the toolbar or use the Shift+Enter keyboard combination. When you run a Code cell, its output is displayed below the cell.\n",
+    "\n",
+    "**Example:** click on the cell below and press Shift+Enter (or Ctrl+Enter). The output will be printed below the cell. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CSSP 20CR dataset\n"
+     ]
+    }
+   ],
+   "source": [
+    "print('CSSP 20CR dataset')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also execute shell commands from a cell. For example, the cell below lists the contents of the Zarr dataset directory."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[0m\u001b[38;5;27mdaily\u001b[0m/  \u001b[48;5;10;38;5;21mmonthly\u001b[0m/\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "ls /data/users/zmaalick/cssp/data/ZARRSTORE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Notice that **Shift+Enter** moves the cursor to the next cell, whereas running the cell with **Ctrl+Enter** keeps the cursor in the same cell.\n",
+    "\n",
+    "An important component of a Jupyter Notebook is its kernel. A kernel runs your code in a specific programming language. Jupyter Notebook supports over 40 different languages. In these tutorials, we will use the Python kernel within the Jupyter Notebook IDE.\n",
+    "\n",
+    "To learn more about Jupyter Notebooks, see this [free introductory online course](https://www.earthdatascience.org/courses/intro-to-earth-data-science/open-reproducible-science/jupyter-python/)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "___"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 5. Data format and python libraries<a id='libs'></a>\n",
+    "\n",
+    "##### ZARR\n",
+    "The data used in our tutorials have been converted from the [Met Office's PP file format](https://help.ceda.ac.uk/article/4424-pp-binary-forma) to Zarr. Zarr is a [specification](https://zarr.readthedocs.io/en/stable/spec.html) for how to store gridded data in a key-value interface (such as the Amazon S3 object store), where each chunk of data is a separate value with a corresponding key indicating its position in the full dataset. This has an advantage over the NetCDF format, as it allows highly parallel data access in which many CPUs can simultaneously read different parts of the same dataset. Zarr is also a [Python library](https://zarr.readthedocs.io/en/stable/api.html) implementation of this specification that allows you to read and write data in a Zarr store.\n",
+    "\n",
+    "##### Iris\n",
+    "To explore and analyse our dataset in these tutorials we make use of a Python library called Iris. Iris is a key tool in the [SciTools](https://scitools.org.uk/) project, which is a collaborative effort to produce and maintain Python-based open-source tools for Earth scientists. Iris is a useful toolkit as it supports read/write access to a range of data formats, including (CF-)netCDF, GRIB, and PP; fundamental data manipulation operations, such as arithmetic, interpolation, and statistics; and a range of integrated plotting options.  See the [latest Iris documentation](https://scitools.org.uk/iris/docs/latest/) for more information.\n",
+    "\n",
+    "##### CATNIP\n",
+    "At the Met Office we have also developed a Python library called CATNIP (Climate Analysis Tools: Now In Python). This library is a collection of routines that make frequently used climate data analysis and visualisation tasks in Iris easier and quicker to perform. We will make use of some of CATNIP's routines in these tutorials. See the [CATNIP documentation](https://metoffice.github.io/CATNIP/#) for more information.\n",
+    "\n",
+    "##### CONDA\n",
+    "For these tutorials we have used the [CONDA](https://docs.conda.io/en/latest/) package management system to install the packages for our development environment. The next section contains the instructions you need to create a conda environment and install these packages."
+   ]
+  },
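+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To make the key-value idea in the Zarr description above more concrete, the cell below is a minimal sketch in plain Python. It does not use the Zarr library, and the array shape, chunk shape and `chunk_key` helper are hypothetical. It follows the Zarr version 2 convention of naming each chunk by its grid indices joined with a dot (for example `0.1`), and uses an ordinary dictionary to stand in for the key-value store."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Hypothetical array of shape (time, lat, lon) split into fixed-size chunks\n",
+    "shape = (120, 200, 300)\n",
+    "chunks = (12, 100, 100)\n",
+    "\n",
+    "# Number of chunks along each dimension (ceiling division)\n",
+    "n_chunks = [-(-s // c) for s, c in zip(shape, chunks)]\n",
+    "\n",
+    "\n",
+    "def chunk_key(index):\n",
+    "    # Map an element index to the key of the chunk that holds it\n",
+    "    return '.'.join(str(i // c) for i, c in zip(index, chunks))\n",
+    "\n",
+    "\n",
+    "# A plain dict stands in for the key-value store; each value would be one chunk's bytes\n",
+    "store = {}\n",
+    "key = chunk_key((30, 150, 250))\n",
+    "store[key] = b'chunk bytes would go here'\n",
+    "\n",
+    "print(n_chunks)\n",
+    "print(key)"
+   ]
+  },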
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "___"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 6. How to create an environment<a id='env'></a>\n",
+    "Run the following highlighted commands in your terminal to create the required conda environment and install the packages.\n",
+    "\n",
+    "- Set up conda for your shell (this only needs doing once) \n",
+    "    - *`conda init bash`*\n",
+    "    \n",
+    "    \n",
+    "- Create a new environment from the shared environment file (the commands below assume it is named cssp) \n",
+    "    - *`conda env create -f /data/share/cssp-shared/environment.yml`*   \n",
+    "    \n",
+    "    \n",
+    "- Activate the environment. This will make all the required Python libraries available for you to use.\n",
+    "    - *`conda activate cssp`*\n",
+    "\n",
+    "\n",
+    "- Install your conda environment as an ipykernel (make sure your env is activated first; the kernel name will be cssp). This makes your Python environment available in the notebooks we will use for these tutorials.\n",
+    "    - *`python -m ipykernel install --name cssp --display-name cssp --user`*\n",
+    "     \n",
+    "**Initiate ipykernel in notebook**\n",
+    "\n",
+    "To start the \"cssp\" kernel created above, follow these steps:\n",
+    "\n",
+    "1. Open the notebook\n",
+    "2. Click on \"No Kernel\"\n",
+    "3. A drop-down box will appear for selecting the kernel. Select \"cssp\" and click \"ok\"\n",
+    "4. \"cssp\" will appear in the right corner instead of \"No Kernel\". This means that the kernel has started and the notebook is now ready to use.\n",
+    "\n",
+    "\n",
+    "**Useful commands**\n",
+    "\n",
+    "- List the packages to see if all are installed \n",
+    "    - *`conda list`*\n",
+    "\n",
+    "\n",
+    "- See the available environments \n",
+    "    - *`conda env list`*\n",
+    "\n",
+    "\n",
+    "- To deactivate the active environment  \n",
+    "    - *`conda deactivate`*\n",
+    "\n",
+    "\n",
+    "- To remove an environment completely\n",
+    "    - *`conda remove --name env_name --all`*\n",
+    "\n",
+    "\n",
+    "- To list jupyter kernels\n",
+    "    - *`jupyter kernelspec list`*\n",
+    "\n",
+    "\n",
+    "- To uninstall a kernel \n",
+    "    - *`jupyter kernelspec uninstall [unwanted-kernel]`*\n",
+    "\n",
+    "\n",
+    "- To rename an env, clone it under the new name (then remove the old env with the command above) \n",
+    "    - *`conda create --name new_name --copy --clone old_name`*\n",
+    "    \n",
+    "\n",
+    "    "
+   ]
+  },
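+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Once the environment is active and the kernel is selected, one quick way to confirm that the libraries are visible from a notebook is to ask Python whether each one can be found. The cell below is a minimal sketch: the package list is an assumption based on the libraries named in this notebook, not on the contents of `environment.yml`, so adjust it as needed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import importlib.util\n",
+    "\n",
+    "# Assumed package list; edit it to match your environment\n",
+    "packages = ['zarr', 'iris', 'numpy', 'matplotlib', 'xarray']\n",
+    "\n",
+    "for name in packages:\n",
+    "    # find_spec returns None when a top-level package cannot be located\n",
+    "    found = importlib.util.find_spec(name) is not None\n",
+    "    print(name, 'found' if found else 'MISSING')"
+   ]
+  },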
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "___"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 7. Resources<a id='resources'></a>\n",
+    "\n",
+    "Follow the links below for further information on the packages that we have installed and use in these tutorials.\n",
+    "\n",
+    "- [python](https://docs.python.org/3/library/)\n",
+    "- [zarr](https://zarr.readthedocs.io/en/stable/)\n",
+    "- [iris](https://scitools.org.uk/iris/docs/latest/)\n",
+    "- [numpy](https://numpy.org/)\n",
+    "- [matplotlib](https://matplotlib.org/)\n",
+    "- [xarray](http://xarray.pydata.org/en/stable/)\n",
+    "- [jupyterlab](https://jupyterlab.readthedocs.io/en/stable/)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}