Add QuickSight dashboard, improve propensity signal, update README use case section
- Add scripts/create_dashboard.py: fully automated QuickSight dashboard (4 sheets)
with Athena/Glue data layer, DIRECT_QUERY datasets, chart subtitles
- Update inference_handler.py: pass through contextual columns (campaign, device,
category, purchase_amount, impressions, clicks) for dashboard segmentation
- Update generate_synthetic_data.py: embed campaign and category propensity
multipliers so dashboard charts show meaningful variation across segments
- Update undeploy/undeploy.py: add QuickSight resource cleanup (dashboard,
analysis, dataset, datasource, IAM inline policy)
- Update config.py: add QS_NOTIFICATION_EMAIL, make hardcoded values authoritative
over env vars, improve validate() with require_qs_email flag
- Update README: rewrite opening and use case section to lead with business value;
add Step 6 (QuickSight dashboard) to timing table and project structure;
update output format table to reflect passthrough columns; add QuickSight
region prerequisite note
-Self-contained, reusable demo for **Customer Propensity Scoring** using AWS Clean Rooms ML with custom training and inference containers.
+Self-contained, reusable, and customizable demo showing how an **advertiser** and a **retailer** can jointly predict which customers are most likely to make a purchase, without either party ever sharing their raw data with the other.
+
+The advertiser contributes **ad engagement data** (impressions, clicks, time spent, device type, campaign) and the retailer contributes **purchase behavior data** (product categories, purchase amounts, site visits, conversion history). AWS Clean Rooms ML joins these datasets inside a secure collaboration, trains a propensity model on the combined signal, and scores every customer, all without exposing either party's underlying records.
+
+The output is a ranked list of customers by purchase propensity, visualized in an Amazon QuickSight dashboard that shows which campaigns, categories, and segments drive the highest conversion intent.

This repo is a sample, to quickly get started with AWS Clean Rooms Custom ML models analysis; it's not meant for production usage AS-IS.

@@ -32,11 +36,15 @@ This repo is a sample, to quickly get started with AWS Clean Rooms Custom ML mod

## Use Case: Customer Propensity Scoring

-**Scenario:** An advertiser and a retailer want to collaborate on predicting which customers are most likely to convert (make a purchase) based on combined ad engagement and purchase behavior data. Neither party wants to share their raw data with the other.
+An **advertiser** knows which users engaged with their ads, but not whether those users actually bought anything. A **retailer** knows which users purchased, but not which ads influenced them. Neither party is willing to share their raw customer data with the other.
+
+By combining both datasets inside an AWS Clean Rooms collaboration, the model learns from the full picture: ad engagement signals from the advertiser and purchase behavior signals from the retailer. The result is a propensity score for every customer that neither party could have produced alone.
+
+**What the advertiser gains:** a ranked list of users to prioritise for ad targeting, based on actual purchase signals, not just clicks.

-**Solution:** AWS Clean Rooms ML enables both parties to contribute their data to a secure collaboration. AWS Clean Rooms joins the datasets on a shared key (`user_id`), trains a propensity model on the combined features, and runs inference, all without either party seeing the other's raw data.
+**What the retailer gains:** insight into which ad-exposed customers are most likely to buy, enabling smarter inventory planning and personalised offers.

-**Business Value:** The advertiser can identify high-propensity users to target with ad campaigns, while the retailer gains insight into which ad-exposed customers are most likely to purchase, enabling better ad spend allocation and personalized marketing.
+**What neither party gives up:** their raw customer data. AWS Clean Rooms enforces that the join happens inside the secure collaboration; no raw records cross the boundary.
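
To make the join-then-score idea concrete, here is a toy, plain-Python sketch of what happens conceptually inside the collaboration. It is not the AWS Clean Rooms API and not this repo's model: the sample records, the inner-join behavior, and the logistic weights are all assumptions made up for illustration. Only the `user_id` join key and the feature names come from the description above.

```python
import math

# Hypothetical toy records. In the real setup each party's rows stay in
# its own account and are only combined inside the collaboration.
advertiser_rows = [
    {"user_id": "u1", "impressions": 12, "clicks": 3},
    {"user_id": "u2", "impressions": 4, "clicks": 0},
    {"user_id": "u3", "impressions": 7, "clicks": 1},
]
retailer_rows = [
    {"user_id": "u1", "site_visits": 9, "purchase_amount": 120.0},
    {"user_id": "u2", "site_visits": 1, "purchase_amount": 0.0},
]


def join_on_user_id(left, right):
    """Inner-join two lists of dicts on user_id, then drop the key."""
    right_by_id = {row["user_id"]: row for row in right}
    joined = []
    for row in left:
        match = right_by_id.get(row["user_id"])
        if match is None:
            continue  # assumption: users known to only one party are skipped
        merged = {**row, **match}
        merged.pop("user_id")  # the join key is not used as a model feature
        joined.append(merged)
    return joined


def toy_propensity(features):
    """Logistic score with made-up weights; stands in for a trained model."""
    z = -2.0 + 0.4 * features["clicks"] + 0.2 * features["site_visits"]
    return 1.0 / (1.0 + math.exp(-z))


joined_rows = join_on_user_id(advertiser_rows, retailer_rows)
scored = [{**row, "propensity_score": toy_propensity(row)} for row in joined_rows]
scored.sort(key=lambda row: row["propensity_score"], reverse=True)

for row in scored:
    print(row)
```

The ranked `scored` list mirrors the "ranked list of customers" described above: it combines signals from both sides, and the join key is gone before scoring.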

---

@@ -51,6 +59,7 @@ This repo is a sample, to quickly get started with AWS Clean Rooms Custom ML mod
| 6 | Create QuickSight Dashboard (optional) | ~3 min |
| **Total** | **End-to-end** | **~42 min** |

### Prerequisites
@@ -59,13 +68,16 @@ This repo is a sample, to quickly get started with AWS Clean Rooms Custom ML mod
- AWS CLI configured with valid credentials
- AWS account with AWS Clean Rooms ML access enabled

+> **Optional (QuickSight Dashboard, Step 6):** If you plan to run `scripts/create_dashboard.py`, your `AWS_REGION` must be a region where Amazon QuickSight is available. QuickSight, Athena, Glue, and S3 must all be in the same region; cross-region Athena connections are not supported by QuickSight. Supported regions include `us-east-1`, `us-east-2`, `us-west-2`, `eu-west-1`, `eu-west-2`, `eu-west-3`, `eu-central-1`, `eu-north-1`, `ap-northeast-1`, `ap-northeast-2`, `ap-southeast-1`, `ap-southeast-2`, `ap-south-1`, `ca-central-1`, and others. See the [full list](https://docs.aws.amazon.com/quicksight/latest/user/regions-qs.html). Also set `QS_NOTIFICATION_EMAIL` in `config.py` to a valid email address. It is used for QuickSight account registration and is validated at script startup.
+
### Step 0: Configure Your Account

Edit `config.py` and set your values:

```python
AWS_ACCOUNT_ID = "123456789012"  # Your 12-digit AWS account ID
AWS_REGION = "eu-north-1"  # Your preferred region
QS_NOTIFICATION_EMAIL = "your@email.com"  # Optional: only needed for Step 6 (QuickSight)
```
All scripts read from this single file — no other hardcoded values to change.
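
The commit message mentions a `validate()` check with a `require_qs_email` flag. The sketch below shows one way such a check could look. The signature, the regular expressions, and the error messages are assumptions for illustration only and are not taken from the repo's `config.py`.

```python
import re


def validate(account_id, region, qs_email, require_qs_email=False):
    """Return a list of human-readable problems (an empty list means OK).

    Hypothetical sketch: the real validate() in config.py may differ.
    """
    problems = []
    if not re.fullmatch(r"\d{12}", account_id):
        problems.append("AWS_ACCOUNT_ID must be exactly 12 digits")
    # Loose shape check only (e.g. "eu-north-1"); it does not confirm
    # that the region exists or that QuickSight is available there.
    if not re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d", region):
        problems.append("AWS_REGION does not look like a region name")
    # The email is only required when the dashboard step will run.
    if require_qs_email and not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", qs_email):
        problems.append("QS_NOTIFICATION_EMAIL must be a valid email address")
    return problems


if __name__ == "__main__":
    for problem in validate("123456789012", "eu-north-1", "", require_qs_email=True):
        print("config error:", problem)
```

Keeping the email check behind a flag means Steps 0 to 5 can run without any QuickSight settings, which matches the "optional" wording above.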
@@ -416,6 +428,14 @@ After successful inference, AWS Clean Rooms ML writes the output to the configur
|--------|------|-------------|
| propensity_score | float (0–1) | Predicted probability of conversion |
| purchase_amount | float | Total purchase amount |
+| impressions | int | Number of ad impressions |
+| clicks | int | Number of ad clicks |
+
+> **Note:** `user_id` is never present in the output; it is the Clean Rooms join key and is excluded from the ML input channel by design. The passthrough contextual columns (`ad_campaign_id`, `device_type`, etc.) come from the pre-joined data already approved for the inference channel and are used to power the QuickSight dashboard segmentation.
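
As a rough illustration of how the passthrough columns support segmentation, the sketch below averages `propensity_score` per `ad_campaign_id` using only the standard library. The CSV rows and campaign names are invented sample data, and the helper `mean_score_by` is a hypothetical name; the dashboard itself does this through Athena and QuickSight rather than Python.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows using column names from the output table above.
sample_csv = """propensity_score,ad_campaign_id,device_type,impressions,clicks
0.82,camp_a,mobile,12,3
0.35,camp_b,desktop,4,0
0.64,camp_a,desktop,7,1
0.21,camp_b,mobile,2,0
"""


def mean_score_by(rows, column):
    """Average propensity_score for each distinct value of `column`."""
    totals = defaultdict(lambda: [0.0, 0])  # value -> [sum of scores, row count]
    for row in rows:
        bucket = totals[row[column]]
        bucket[0] += float(row["propensity_score"])
        bucket[1] += 1
    return {value: total / count for value, (total, count) in totals.items()}


rows = list(csv.DictReader(io.StringIO(sample_csv)))
by_campaign = mean_score_by(rows, "ad_campaign_id")
by_device = mean_score_by(rows, "device_type")

for campaign, score in sorted(by_campaign.items()):
    print(f"{campaign}: {score:.2f}")
```

The same helper works for any passthrough column, for example `mean_score_by(rows, "device_type")`.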

Example output rows:

@@ -478,6 +498,7 @@ scripts/
build_and_push.py ← Build containers via local Docker
setup_cleanrooms.py ← Create Glue, IAM, collaboration, ML config
run_cleanrooms_ml.py ← Create channels, train model, run inference
test_training_local.py ← Test training locally (no AWS needed)
sagemaker_training_job.py ← Optional: run training via SageMaker directly
update_requirements.sh ← Regenerate container requirements.txt from lockfile
@@ -612,12 +633,13 @@ The undeploy script removes all resources in reverse dependency order:

1. **Clean Rooms ML**: inference jobs, trained models, ML input channels, algorithm associations, configured model algorithms
2. **Clean Rooms**: ML configuration, table association analysis rules, table associations, configured tables, analysis rules, collaboration
-3. **AWS Glue**: tables and database (`cleanrooms_ml_demo`)
+3. **AWS Glue**: tables and database (`cleanrooms_ml_demo`), including dashboard tables (`inference_output`, `model_metrics`, `feature_importance`) if `create_dashboard.py` was run
4. **Lake Formation**: permission grants for the data provider role
-5. **Amazon S3**: source and output buckets (empties all objects and versions first)
+5. **Amazon S3**: source and output buckets (empties all objects and versions first, including `dashboard-data/` CSVs)
6. **Amazon ECR**: training and inference container repositories (including all images)
8. **CodeBuild**: build project and associated CloudWatch log groups
+9. **Amazon QuickSight**: dashboard, analysis, datasets, and Athena data source (if `create_dashboard.py` was run). The QuickSight account subscription itself is **not** deleted because it is account-wide.

> **Note:** IAM roles are global (not region-scoped), so they only need to be deleted once regardless of how many regions were used. The script handles this gracefully: if a role was already deleted by a previous region's undeploy run, it skips it.