Welcome to the Data Science repository! It collects the basics of data science: Python, NumPy, Pandas, and Matplotlib. All the Jupyter notebooks and program implementations are based solely on examples from the book Python for Data Analysis by Wes McKinney.
Feel free to explore the repository and reach out in case of any issues.

NumPy, short for Numerical Python, is the fundamental package for numerical computation and data analysis in Python.
Here you will find topic-wise notes on understanding NumPy, from beginner to advanced.
| No. | Topic | Description | Link |
|---|---|---|---|
| 1. | NumPy Basics | An introduction to ndarrays and their vectorized computation. | 🔗 |
| 2. | Data Types | Understanding NumPy data types, or dtypes. | 🔗 |
| 3. | Indexing and Slicing | Selecting a subset of the data or individual elements using indexing and slicing. | 🔗 |
| 4. | Boolean Indexing | Selecting elements of an array using a boolean array as the index. | 🔗 |
| 5. | Fancy Indexing | Selecting elements in a desired order by indexing with integer ndarrays. | 🔗 |
| 6. | Transposing and Swapping | Operations to flip the dimensions of an array. | 🔗 |
| 7. | Universal Functions | Functions that perform elementwise operations on data in ndarrays. | 🔗 |
| 8. | Data Processing | Expressing data processing tasks as concise array expressions that might otherwise require loops. | 🔗 |
| 9. | Conditional Logic | Expressing conditional logic as array operations. | 🔗 |
| 10. | Statistical Methods | Mathematical functions that compute statistics about an entire array or along an axis. | 🔗 |
| 11. | Boolean Methods | Boolean values are coerced to 1 (True) and 0 (False), so methods such as `sum` can count True values. | 🔗 |
| 12. | Sorting | Sorting arrays using sort methods. | 🔗 |
| 13. | Unique and Set Logic | Some basic set operations for one-dimensional ndarrays. | 🔗 |
| 14. | File Input and Output | Save and load data to and from disk either in text or binary format. | 🔗 |
| 15. | Linear Algebra | Useful built-in linear algebra functions in the numpy namespace. | 🔗 |
| 16. | Random Number Generation | Generating whole arrays of sample values from many kinds of probability distributions. | 🔗 |
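
As a quick taste of several topics from the table above, here is a minimal sketch. The array and its values are arbitrary and chosen only for illustration; they are not taken from the notebooks.

```python
import numpy as np

# Small 2-D array used only for illustration (values are arbitrary).
arr = np.arange(12).reshape(3, 4)

# Vectorized computation: arithmetic applies elementwise, no Python loop.
doubled = arr * 2

# Boolean indexing: a boolean mask selects the matching elements.
evens = arr[arr % 2 == 0]

# Fancy indexing: a list of integers picks rows in the requested order.
reordered = arr[[2, 0]]

# Conditional logic as an array operation: keep values above 5, else 0.
clipped = np.where(arr > 5, arr, 0)

# Statistical methods over the whole array or along an axis.
total = arr.sum()
col_means = arr.mean(axis=0)

print(evens)
print(reordered)
print(total, col_means)
```

Each line corresponds to one topic (vectorized computation, boolean indexing, fancy indexing, conditional logic, statistical methods); the linked notes cover them in more depth.
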
Pandas is a powerful and popular open-source library in Python used for data manipulation and analysis. It provides data structures and functions that make it easy to work with structured data, particularly for tasks like data cleaning, transformation, and exploration.
| No. | Topic | Description | Link |
|---|---|---|---|
| 1. | Introduction | Introduction to Pandas and its data structures. | 🔗 |
| 2. | Series | A Series is a one-dimensional, array-like object containing an array of data (of any NumPy data type) and an associated array of data labels, called its index. | 🔗 |
| 3. | DataFrame | A DataFrame represents a tabular, spreadsheet-like data structure containing an ordered collection of columns. | 🔗 |
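
The two data structures described above can be sketched roughly as follows. The labels, column names, and values here are hypothetical, picked only to show the shape of each object.

```python
import pandas as pd

# A Series: one-dimensional data plus an index of labels.
# (Values and labels are illustrative only.)
s = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])
value_at_a = s["a"]      # label-based lookup
positives = s[s > 0]     # boolean filtering keeps the matching labels

# A DataFrame: an ordered collection of columns sharing one index.
data = {
    "city": ["Alpha", "Alpha", "Beta"],
    "year": [2000, 2001, 2001],
    "score": [1.5, 1.7, 2.4],
}
# The columns argument fixes the column order explicitly.
frame = pd.DataFrame(data, columns=["year", "city", "score"])
years = frame["year"]    # selecting one column returns a Series

print(positives)
print(frame)
```

Selecting a single column from the DataFrame returns a Series, which is one way to see how the two structures relate.
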

