Description
Assignment: Data Analysis Using Pandas
Objective
The objective of this assignment is to test your ability to clean, manipulate, analyze, and visualize data using the Pandas library in Python.
Dataset Description
You are provided with a dataset named ecommerce_data.csv containing the following columns:
- Order ID: Unique identifier for each order.
- Product: Name of the product purchased.
- Category: Product category (e.g., Electronics, Apparel, etc.).
- Quantity Ordered: Quantity of the product ordered.
- Price Each: Price per unit of the product.
- Order Date: Date and time when the order was placed.
- City: City where the order was placed.
- Customer ID: Unique identifier for the customer.
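To make the column layout concrete, here is a minimal sketch that parses a few rows in this schema. The three sample rows are invented for illustration and are not taken from ecommerce_data.csv; for the assignment you would pass the file path to pd.read_csv instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical sample rows that follow the documented column layout.
# These values are made up; they are not from ecommerce_data.csv.
SAMPLE_CSV = """Order ID,Product,Category,Quantity Ordered,Price Each,Order Date,City,Customer ID
1001,USB-C Cable,Electronics,2,11.95,2023-01-15 09:30:00,Austin,C001
1002,T-Shirt,Apparel,1,19.99,2023-01-15 14:05:00,Boston,C002
1003,Headphones,Electronics,1,99.99,2023-02-03 20:45:00,Austin,C001
"""

# parse_dates converts "Order Date" to a datetime column while reading.
sample_df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Order Date"])

print(sample_df.head())   # first rows of the frame
print(sample_df.dtypes)   # confirm the inferred column types
```

Checking dtypes right after loading is a quick way to see which columns still need conversion.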
Assignment Questions
- Data Cleaning and Preparation
a. Load the dataset into a Pandas DataFrame and display the first five rows.
b. Check for missing or null values in the dataset. Handle these appropriately.
c. Convert columns to appropriate data types (e.g., Order Date to datetime).
d. Create a new column, Total Price, which is the product of Quantity Ordered and Price Each.
- Sales Analysis
a. Calculate the total revenue generated by each city.
b. Identify the top 5 products based on total sales revenue.
c. Find the month with the highest sales and plot a graph to show monthly revenue.
- Category and City Analysis
a. Group the data by Category and calculate the total revenue for each category.
b. Find the city with the highest number of orders.
c. Plot a bar chart to visualize revenue across different cities.
- Customer Behavior Analysis
a. Identify the customer who spent the most money and calculate their total spending.
b. Find the average order value (AOV) for all customers.
c. Determine the product most frequently purchased by customers.
- Bonus (Optional)
a. Extract the hour from the Order Date column and determine the hour with the highest sales.
b. Create a visualization to show the distribution of sales by hour.
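The questions above can be approached with a handful of groupby aggregations. The following is a minimal sketch, not a reference solution: the five rows are invented stand-ins for ecommerce_data.csv, dropping rows with missing values is only one possible way to handle them, and "most frequently purchased" is interpreted here as the largest total quantity (counting order lines with value_counts is an equally valid reading).

```python
import pandas as pd

# Hypothetical rows in the documented schema; one quantity is missing.
raw = pd.DataFrame({
    "Order ID": [1, 2, 3, 4, 5],
    "Product": ["Cable", "T-Shirt", "Cable", "Headphones", "T-Shirt"],
    "Category": ["Electronics", "Apparel", "Electronics", "Electronics", "Apparel"],
    "Quantity Ordered": [2, None, 3, 1, 1],
    "Price Each": [10.0, 20.0, 10.0, 100.0, 20.0],
    "Order Date": ["2023-01-15 09:30", "2023-01-20 14:05", "2023-02-03 09:10",
                   "2023-02-10 20:45", "2023-02-11 11:00"],
    "City": ["Austin", "Boston", "Austin", "Boston", "Austin"],
    "Customer ID": ["C1", "C2", "C1", "C3", "C2"],
})


def clean(df):
    """Drop incomplete rows, fix dtypes, and add the Total Price column."""
    out = df.dropna().copy()  # assumption: dropping incomplete rows is acceptable
    out["Order Date"] = pd.to_datetime(out["Order Date"])
    out["Quantity Ordered"] = out["Quantity Ordered"].astype(int)
    out["Total Price"] = out["Quantity Ordered"] * out["Price Each"]
    return out


df = clean(raw)

# Sales analysis
revenue_by_city = df.groupby("City")["Total Price"].sum()
top_products = df.groupby("Product")["Total Price"].sum().nlargest(5)
monthly_revenue = df.groupby(df["Order Date"].dt.to_period("M"))["Total Price"].sum()
best_month = monthly_revenue.idxmax()

# Category and city analysis
revenue_by_category = df.groupby("Category")["Total Price"].sum()
busiest_city = df.groupby("City")["Order ID"].nunique().idxmax()

# Customer behavior analysis
spend_by_customer = df.groupby("Customer ID")["Total Price"].sum()
top_customer = spend_by_customer.idxmax()
top_customer_spend = spend_by_customer.max()
aov = df.groupby("Order ID")["Total Price"].sum().mean()  # mean revenue per order
most_purchased = df.groupby("Product")["Quantity Ordered"].sum().idxmax()

# Bonus: revenue by hour of day
hourly_revenue = df.groupby(df["Order Date"].dt.hour)["Total Price"].sum()
peak_hour = hourly_revenue.idxmax()

print(revenue_by_city)
print(best_month, top_customer, top_customer_spend, aov, peak_hour)
```

For the plotting tasks, each aggregated Series can be passed to its plot method, for example monthly_revenue.plot(kind="bar") or hourly_revenue.plot(kind="bar"), provided matplotlib is installed.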
Submission Guidelines
- Submit your Python code in a .py or .ipynb file.
- Include a brief report (1-2 paragraphs) summarizing your findings.
- Use comments in your code to explain your logic.