📌 Case Study: Credit Card Transactions Analysis
Objective:
This case study will help you develop expertise in data cleaning, manipulation, analysis, and visualization using Pandas. You will analyze a dataset of credit card transactions to detect trends, find anomalies, and extract insights that could help in fraud detection and customer behavior analysis.
📂 Problem Statement:
You work as a data analyst for a financial company that provides credit cards to customers. Your task is to analyze credit card transactions and provide insights that can help in decision-making.
You have been provided with a dataset named `credit_card_transactions.csv`, which contains details about transactions made by various customers.
📊 Dataset Overview (`credit_card_transactions.csv`)
Download Dataset
Column Name | Data Type | Description |
---|---|---|
Transaction_ID | int | Unique transaction identifier |
Customer_ID | int | Unique customer identifier |
Transaction_Date | str | Date of transaction (YYYY-MM-DD) |
Transaction_Type | str | Type of transaction (Online, POS, ATM, etc.) |
Merchant | str | Name of the merchant |
Category | str | Purchase category (Groceries, Electronics, Travel, etc.) |
Amount | float | Transaction amount (in USD) |
Payment_Mode | str | Payment method (Credit Card, Debit Card, etc.) |
Transaction_Status | str | Status (Approved, Declined, Pending) |
Location | str | City where transaction occurred |
📝 Assignment Questions:
1️⃣ Data Exploration and Cleaning
- Load the dataset into a Pandas DataFrame and display the first 5 rows.
- Check the shape, column names, and summary statistics of the dataset.
- Identify and handle missing values (fill or drop based on the data type).
- Convert `Transaction_Date` into datetime format and extract year, month, and day as new columns.
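The exploration and cleaning steps above can be sketched as follows. This is a minimal sketch, not a reference solution: the inline CSV is a made-up stand-in for `credit_card_transactions.csv`, and the fill policy (median for numeric columns, `"Unknown"` for text columns) is one assumed choice among several reasonable ones.

```python
import io

import pandas as pd

# Hypothetical stand-in for credit_card_transactions.csv (all values are made up).
csv_text = """Transaction_ID,Customer_ID,Transaction_Date,Transaction_Type,Merchant,Category,Amount,Payment_Mode,Transaction_Status,Location
1,101,2024-01-05,Online,ShopA,Electronics,1200.50,Credit Card,Approved,Austin
2,102,2024-01-17,POS,,Groceries,,Debit Card,Declined,Boston
3,101,2024-02-02,ATM,BankB,Travel,80.00,Credit Card,Approved,
"""

# With the real file this would be pd.read_csv("credit_card_transactions.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())            # first 5 rows
print(df.shape)             # (rows, columns)
print(df.columns.tolist())  # column names
print(df.describe())        # summary statistics for numeric columns
print(df.isna().sum())      # missing values per column

# Convert the date column first so it is not treated as text below.
df["Transaction_Date"] = pd.to_datetime(
    df["Transaction_Date"], format="%Y-%m-%d", errors="coerce"
)

# Assumed fill policy: median for numeric columns, "Unknown" for text columns.
numeric_cols = df.select_dtypes(include="number").columns
text_cols = df.select_dtypes(exclude=["number", "datetime"]).columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
df[text_cols] = df[text_cols].fillna("Unknown")

# Extract date parts as new columns.
df["Year"] = df["Transaction_Date"].dt.year
df["Month"] = df["Transaction_Date"].dt.month
df["Day"] = df["Transaction_Date"].dt.day
```

Dropping rows with `dropna` is an equally valid policy when a missing `Amount` makes the record unusable; which one fits depends on how much data is missing.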
2️⃣ Data Selection and Indexing
- Retrieve all transactions made in January 2024.
- Find transactions where `Amount` > 1000 and `Transaction_Type` is `"Online"`.
- Select only Approved transactions from the dataset.
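The three selections above are boolean-mask filters. A small sketch on made-up rows, assuming `Transaction_Date` has already been converted to datetime:

```python
import pandas as pd

# Made-up sample rows; only the columns needed for the filters are included.
df = pd.DataFrame({
    "Transaction_ID": [1, 2, 3, 4],
    "Transaction_Date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-02", "2023-01-10"]
    ),
    "Transaction_Type": ["Online", "POS", "Online", "Online"],
    "Amount": [1500.0, 200.0, 999.0, 2500.0],
    "Transaction_Status": ["Approved", "Declined", "Approved", "Pending"],
})

# Transactions made in January 2024 (both year and month must match).
dates = df["Transaction_Date"]
jan_2024 = df[(dates.dt.year == 2024) & (dates.dt.month == 1)]

# Amount > 1000 and Transaction_Type is "Online".
# Each condition needs its own parentheses because & binds tighter than > and ==.
large_online = df[(df["Amount"] > 1000) & (df["Transaction_Type"] == "Online")]

# Approved transactions only.
approved = df[df["Transaction_Status"] == "Approved"]
```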
3️⃣ Data Manipulation and Feature Engineering
- Create a new column `Discounted_Amount`, assuming a 5% discount on all transactions above `$500`.
- Categorize `Amount` into `Low`, `Medium`, and `High` based on:
  - `Low`: Below `$100`
  - `Medium`: Between `$100` and `$500`
  - `High`: Above `$500`
- Drop the `Merchant` column if more than 30% of values are missing.
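One way to approach these three tasks is sketched below on made-up data. The boundary handling is an assumption: amounts of exactly 100 and exactly 500 are treated as `Medium`, since the brief says "below 100" and "above 500". The new column name `Amount_Category` is also hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; half of the Merchant values are missing on purpose.
df = pd.DataFrame({
    "Amount": [50.0, 100.0, 500.0, 1000.0],
    "Merchant": ["ShopA", None, None, "ShopB"],
})

# 5% discount only where Amount is above 500; other rows keep the original amount.
df["Discounted_Amount"] = df["Amount"].where(df["Amount"] <= 500, df["Amount"] * 0.95)

# Low: below 100, High: above 500, everything else Medium (assumed boundaries).
# Note that a missing Amount would also fall through to "Medium" here.
df["Amount_Category"] = np.select(
    [df["Amount"] < 100, df["Amount"] > 500],
    ["Low", "High"],
    default="Medium",
)

# Drop Merchant only if more than 30% of its values are missing.
missing_share = df["Merchant"].isna().mean()
if missing_share > 0.30:
    df = df.drop(columns="Merchant")
```

`pd.cut` is a common alternative for the categorization, but its bins are closed on one side only, so the two boundary values would not both land in `Medium` without extra care.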
4️⃣ Aggregation and Insights
- Find the total transaction amount per `Category`.
- Determine the number of declined transactions per `Payment_Mode`.
- Identify the top 5 most frequent merchants based on transaction count.
- Find the average transaction amount per `Location`.
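The aggregations above map onto `groupby` and `value_counts`. A brief sketch on made-up rows:

```python
import pandas as pd

# Made-up sample rows.
df = pd.DataFrame({
    "Merchant": ["ShopA", "ShopB", "ShopA", "ShopC", "ShopA", "ShopB"],
    "Category": ["Groceries", "Travel", "Groceries", "Electronics", "Travel", "Travel"],
    "Amount": [50.0, 300.0, 70.0, 900.0, 120.0, 80.0],
    "Payment_Mode": ["Credit Card", "Debit Card", "Credit Card",
                     "Credit Card", "Debit Card", "Debit Card"],
    "Transaction_Status": ["Approved", "Declined", "Declined",
                           "Approved", "Declined", "Approved"],
    "Location": ["Austin", "Boston", "Austin", "Boston", "Austin", "Boston"],
})

# Total transaction amount per Category.
total_per_category = df.groupby("Category")["Amount"].sum()

# Number of declined transactions per Payment_Mode.
declined = df[df["Transaction_Status"] == "Declined"]
declined_per_mode = declined["Payment_Mode"].value_counts()

# Top 5 merchants by transaction count (value_counts sorts descending).
top_merchants = df["Merchant"].value_counts().head(5)

# Average transaction amount per Location.
avg_per_location = df.groupby("Location")["Amount"].mean()
```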
5️⃣ Fraud Detection Indicators
- Find customers who made more than 10 transactions in a single day (potential fraud).
- Identify transactions that have the same Customer_ID but occurred in different locations within 5 minutes.
- Find transactions where Amount > $5000 and Transaction_Type is Online (flag as high-risk).
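The three indicators above could be sketched as below. Two assumptions are worth stating: the 5-minute rule needs a time of day, so this sample gives `Transaction_Date` full timestamps even though the dataset overview lists only `YYYY-MM-DD`; and the location check compares each transaction only with the same customer's previous one. The `High_Risk` column name is hypothetical.

```python
import pandas as pd

# Customer 101: 11 made-up transactions on one day, one per hour, same city.
busy = pd.DataFrame({
    "Customer_ID": 101,
    "Transaction_Date": pd.date_range(
        "2024-01-05 08:00", periods=11, freq=pd.Timedelta(hours=1)
    ),
    "Location": "Austin",
    "Transaction_Type": "POS",
    "Amount": 20.0,
})
# Customer 202 changes city within 3 minutes; customer 303 makes a large online purchase.
others = pd.DataFrame({
    "Customer_ID": [202, 202, 303],
    "Transaction_Date": pd.to_datetime(
        ["2024-01-06 10:00", "2024-01-06 10:03", "2024-01-07 12:00"]
    ),
    "Location": ["Boston", "Denver", "Miami"],
    "Transaction_Type": ["POS", "POS", "Online"],
    "Amount": [40.0, 60.0, 7500.0],
})
df = pd.concat([busy, others], ignore_index=True)

# 1. Customers with more than 10 transactions on a single calendar day.
day = df["Transaction_Date"].dt.normalize()
daily_counts = df.groupby(["Customer_ID", day]).size()
frequent = daily_counts[daily_counts > 10]

# 2. Same customer, different location, within 5 minutes of the previous transaction.
df = df.sort_values(["Customer_ID", "Transaction_Date"]).reset_index(drop=True)
prev_time = df.groupby("Customer_ID")["Transaction_Date"].shift()
prev_location = df.groupby("Customer_ID")["Location"].shift()
gap = df["Transaction_Date"] - prev_time  # NaT for each customer's first row
location_jump = df[(gap <= pd.Timedelta(minutes=5)) & (df["Location"] != prev_location)]

# 3. Flag online transactions above $5000 as high-risk.
df["High_Risk"] = (df["Amount"] > 5000) & (df["Transaction_Type"] == "Online")
```

`location_jump` holds the later transaction of each suspicious pair. Catching non-consecutive pairs inside the same 5-minute window would need a self-join or a rolling window instead of `shift`.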
6️⃣ Data Merging and Joining
- Suppose you have another dataset (`customer_info.csv`) containing `Customer_ID`, `Age`, `Gender`, and `Account_Status`.
  - Merge it with `credit_card_transactions.csv` using an appropriate join operation.
  - Find the average transaction amount per `Age` group.
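A possible shape for the merge is sketched below with made-up frames standing in for the two CSV files. The left join is an assumed choice that keeps every transaction even when the customer is missing from `customer_info.csv`. The age bins and the `Age_Group` column name are hypothetical, since the brief does not define the groups.

```python
import pandas as pd

# Made-up stand-ins for credit_card_transactions.csv and customer_info.csv.
transactions = pd.DataFrame({
    "Transaction_ID": [1, 2, 3, 4],
    "Customer_ID": [101, 102, 101, 104],
    "Amount": [100.0, 250.0, 300.0, 50.0],
})
customers = pd.DataFrame({
    "Customer_ID": [101, 102, 103],
    "Age": [24, 41, 67],
    "Gender": ["F", "M", "F"],
    "Account_Status": ["Active", "Active", "Closed"],
})

# Left join: keep all transactions; customer 104 has no profile, so its Age is NaN.
# validate raises if Customer_ID is not unique on the customer side.
merged = transactions.merge(
    customers, on="Customer_ID", how="left", validate="many_to_one"
)

# Hypothetical age bins, closed on the left: [18, 30), [30, 45), [45, 60), [60, 120).
merged["Age_Group"] = pd.cut(
    merged["Age"],
    bins=[18, 30, 45, 60, 120],
    labels=["18-29", "30-44", "45-59", "60+"],
    right=False,
)

# Average amount per age group; observed=True skips groups with no transactions.
avg_per_age_group = merged.groupby("Age_Group", observed=True)["Amount"].mean()
```

An inner join would instead drop transactions without a matching customer, which may be preferable if those rows are considered invalid.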
📊 Bonus Challenge (Optional)
- Create a bar chart showing the total transaction amount per `Category` using Matplotlib or Seaborn.
- Generate a heatmap showing correlation between `Amount`, `Transaction_Status`, and `Payment_Mode`.
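Both charts can be drawn with Matplotlib alone, as in the rough sketch below on made-up rows. `Transaction_Status` and `Payment_Mode` are text columns, so they are one-hot encoded before computing correlations. That encoding is an assumption, as the brief does not say how to make them numeric. The output file name is also hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows.
df = pd.DataFrame({
    "Category": ["Groceries", "Electronics", "Travel",
                 "Groceries", "Electronics", "Travel"],
    "Amount": [50.0, 900.0, 300.0, 70.0, 1200.0, 150.0],
    "Transaction_Status": ["Approved", "Approved", "Declined",
                           "Approved", "Declined", "Pending"],
    "Payment_Mode": ["Credit Card", "Debit Card", "Credit Card",
                     "Debit Card", "Credit Card", "Credit Card"],
})

fig, (ax_bar, ax_heat) = plt.subplots(1, 2, figsize=(12, 5))

# Bar chart: total transaction amount per Category, largest first.
totals = df.groupby("Category")["Amount"].sum().sort_values(ascending=False)
ax_bar.bar(totals.index, totals.to_numpy())
ax_bar.set_xlabel("Category")
ax_bar.set_ylabel("Total amount (USD)")
ax_bar.set_title("Total transaction amount per Category")

# Heatmap: one-hot encode the text columns, then correlate all numeric columns.
encoded = pd.get_dummies(
    df[["Amount", "Transaction_Status", "Payment_Mode"]], dtype=float
)
corr = encoded.corr()
image = ax_heat.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns, rotation=90)
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
ax_heat.set_title("Correlation heatmap")
fig.colorbar(image, ax=ax_heat)

fig.tight_layout()
fig.savefig("bonus_charts.png")  # hypothetical output file name
```

In a Jupyter Notebook the `matplotlib.use("Agg")` line and `savefig` call can be dropped, since figures render inline.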
📌 Submission Guidelines:
- Write your Pandas code with comments explaining each step.
- Provide brief answers to descriptive questions.
- Submit your Jupyter Notebook (.ipynb) or Python script (.py) along with a short summary report (if needed).
✅ Evaluation Criteria:
✔ Correctness: Ensure your solutions are accurate and functional.
✔ Code Readability: Use proper variable names and comments.
✔ Data Handling: Efficiently handle missing values and incorrect data types.
✔ Fraud Detection & Insights: Extract meaningful business insights.
🚀 Good luck! Happy Coding!