abhicodes07/Data-Science


Welcome to the Data Science repository! It collects the basics of data science: Python fundamentals, NumPy, Pandas, and Matplotlib. All the Jupyter notebooks and program implementations are based on examples from the book Python for Data Analysis by Wes McKinney.
Feel free to explore the repository and reach out in case of any issues.

☘️ Contents

NumPy

NumPy, short for Numerical Python, is the fundamental package for numerical computation and data analysis in Python. Here you will find topic-wise notes for understanding NumPy, from beginner to advanced.

| Topic | Description | Link |
| --- | --- | --- |
| 1. NumPy Basics | An introduction to ndarrays and their vectorized computation. | 🔗 |
| 2. Data Types | Understanding NumPy data types (dtype). | 🔗 |
| 3. Indexing and Slicing | Selecting a subset of data or individual elements using indexing and slicing. | 🔗 |
| 4. Boolean Indexing | Accessing elements of an array using a Boolean array as the index. | 🔗 |
| 5. Fancy Indexing | Accessing elements in a desired order by indexing with integer ndarrays. | 🔗 |
| 6. Transposing and Swapping | Operations to transpose an array and swap its axes. | 🔗 |
| 7. Universal Functions | Functions that perform element-wise operations on data in ndarrays. | 🔗 |
| 8. Data Processing | Expressing data processing tasks as concise array expressions that would otherwise require loops. | 🔗 |
| 9. Conditional Logic | Expressing conditional logic as array operations. | 🔗 |
| 10. Statistical Methods | Mathematical functions that compute statistics about an entire array. | 🔗 |
| 11. Boolean Methods | Methods for Boolean arrays, where values are treated as 1 (True) and 0 (False). | 🔗 |
| 12. Sorting | Sorting arrays using sort methods. | 🔗 |
| 13. Unique and Set Logic | Some basic set operations for one-dimensional ndarrays. | 🔗 |
| 14. File Input and Output | Saving and loading data to and from disk in either text or binary format. | 🔗 |
| 15. Linear Algebra | Useful built-in linear algebra functions in the NumPy namespace. | 🔗 |
| 16. Random Number Generation | Generating whole arrays of sample values from many kinds of probability distributions. | 🔗 |
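
As a quick taste of a few of the topics listed above (Boolean indexing, fancy indexing, conditional logic, and Boolean methods), here is a minimal sketch. The array values are made up for illustration and are not taken from the notebooks in this repository.

```python
import numpy as np

# Illustrative data only (an assumption, not from the repository's notebooks).
arr = np.array([[1, -2, 3],
                [-4, 5, -6],
                [7, -8, 9]])

# Boolean indexing: a Boolean array of the same shape selects matching elements.
mask = arr > 0
positives = arr[mask]  # 1-D array of the positive values, in row-major order

# Fancy indexing: a list of integers picks rows in the requested order.
reordered = arr[[2, 0]]  # row 2 first, then row 0

# Conditional logic as an array operation: replace negative values with 0.
clipped = np.where(arr < 0, 0, arr)

# Boolean methods: True counts as 1, so sum() counts the matches.
n_positive = int(mask.sum())

print(positives)
print(reordered)
print(clipped)
print(n_positive)
```

Each of these expressions replaces what would otherwise be an explicit Python loop over the array.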

Pandas

Pandas is a powerful and popular open-source library in Python used for data manipulation and analysis. It provides data structures and functions that make it easy to work with structured data, particularly for tasks like data cleaning, transformation, and exploration.

| Topic | Description | Link |
| --- | --- | --- |
| 1. Introduction | Introduction to Pandas and its data structures. | 🔗 |
| 2. Series | A Series is a one-dimensional array-like object containing an array of data (of any NumPy data type) and an associated array of data labels, called its index. | 🔗 |
| 3. DataFrame | A DataFrame represents a tabular, spreadsheet-like data structure containing an ordered collection of columns. | 🔗 |
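
To make the two data structures described above concrete, here is a small sketch of a Series and a DataFrame. The labels and values are illustrative assumptions, not data from the notebooks in this repository.

```python
import pandas as pd

# Series: one-dimensional data with an associated index of labels.
# Values and labels here are illustrative only.
s = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])
value_at_a = s["a"]  # look up a value by its label

# DataFrame: a table built from an ordered collection of columns.
df = pd.DataFrame({
    "state": ["Ohio", "Ohio", "Nevada"],
    "year": [2000, 2001, 2001],
    "pop": [1.5, 1.7, 2.4],
})
columns = list(df.columns)            # column order follows the dict order
nevada = df[df["state"] == "Nevada"]  # Boolean filtering on rows

print(value_at_a)
print(columns)
print(nevada)
```

Note that the Series index does not have to be sorted or numeric; lookups use the labels, not positions.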

About

Jupyter notebooks and necessary resources for data science and machine learning.
