Materials and IPython notebooks for "Python for Data Analysis, 3rd Edition" by Wes McKinney, published by O'Reilly Media. Book content including updates and errata fixes can be found for free on my website.
uv is a fast Python package installer and resolver. To get started:
- Install uv if you haven't already:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Start Jupyter Notebook with all dependencies:
uv run jupyter notebook
That's it! uv will automatically create a virtual environment and install all required packages from pyproject.toml.
Alternatively, you can set up the environment with conda:
- Create a new conda environment:
conda create -n pydata-book python=3.11
conda activate pydata-book
- Install dependencies:
pip install -r requirements.txt
- Start Jupyter Notebook:
jupyter notebook
Note: This project uses pandas 2.0.3 to ensure compatibility with the notebooks.
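To confirm that the environment picked up the pandas version mentioned in the note, a small check along these lines may help. This is a minimal sketch, not part of the repository: the helper names `installed_version` and `matches_pin` are hypothetical, and it assumes the pin is exactly the `2.0.3` stated above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version taken from the note above; adjust if the project's pin changes.
EXPECTED_PANDAS = "2.0.3"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(installed: Optional[str], expected: str = EXPECTED_PANDAS) -> bool:
    """Return True only when the installed version string equals the expected pin."""
    return installed == expected


if __name__ == "__main__":
    found = installed_version("pandas")
    if matches_pin(found):
        print(f"pandas {found} matches the expected version")
    else:
        print(f"expected pandas {EXPECTED_PANDAS}, found {found}")
```

Run it from inside the activated environment (for example with `uv run python` or after `conda activate pydata-book`) so that it inspects the packages the notebooks will actually use.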
If you are reading the 2nd Edition (published in 2017), please find the reorganized book materials on the 2nd-edition branch.
If you are reading the 1st Edition (published in 2012), please find the reorganized book materials on the 1st-edition branch.
- Chapter 2: Python Language Basics, IPython, and Jupyter Notebooks
- Chapter 3: Built-in Data Structures, Functions, and Files
- Chapter 4: NumPy Basics: Arrays and Vectorized Computation
- Chapter 5: Getting Started with pandas
- Chapter 6: Data Loading, Storage, and File Formats
- Chapter 7: Data Cleaning and Preparation
- Chapter 8: Data Wrangling: Join, Combine, and Reshape
- Chapter 9: Plotting and Visualization
- Chapter 10: Data Aggregation and Group Operations
- Chapter 11: Time Series
- Chapter 12: Introduction to Modeling Libraries in Python
- Chapter 13: Data Analysis Examples
- Appendix A: Advanced NumPy
The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.