diff --git a/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README.md b/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README.md
index 999616c5a1702..6b6c3c932be5a 100644
--- a/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README.md
+++ b/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README.md
@@ -36,7 +36,7 @@ tags:
 输入:grid = [[2,4],[6,8]], x = 2
 输出:4
-解释:可以执行下述操作使所有元素都等于 4 :
+解释:可以执行下述操作使所有元素都等于 4 : 
 - 2 加 x 一次。
 - 6 减 x 一次。
 - 8 减 x 两次。
diff --git a/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README_EN.md b/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README_EN.md
index 69ac6cc530a98..557513f032941 100644
--- a/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README_EN.md	
+++ b/solution/2000-2099/2033.Minimum Operations to Make a Uni-Value Grid/README_EN.md	
@@ -33,7 +33,7 @@ tags:
 
 Input: grid = [[2,4],[6,8]], x = 2
 Output: 4
-Explanation: We can make every element equal to 4 by doing the following:
+Explanation: We can make every element equal to 4 by doing the following: 
 - Add x to 2 once.
 - Subtract x from 6 once.
 - Subtract x from 8 twice.
diff --git a/solution/3400-3499/3497.Analyze Subscription Conversion/README.md b/solution/3400-3499/3497.Analyze Subscription Conversion/README.md
new file mode 100644
index 0000000000000..2318cffe94d0c
--- /dev/null
+++ b/solution/3400-3499/3497.Analyze Subscription Conversion/README.md	
@@ -0,0 +1,220 @@
+---
+comments: true
+difficulty: 中等
+edit_url: https://github.com/doocs/leetcode/edit/main/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README.md
+tags:
+    - 数据库
+---
+
+
+
+# [3497. 分析订阅转化](https://leetcode.cn/problems/analyze-subscription-conversion)
+
+[English Version](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README_EN.md)
+
+## 题目描述
+
+
+
+

表:UserActivity

+ +
++------------------+---------+
+| Column Name      | Type    | 
++------------------+---------+
+| user_id          | int     |
+| activity_date    | date    |
+| activity_type    | varchar |
+| activity_duration| int     |
++------------------+---------+
+(user_id, activity_date, activity_type) 是这张表的唯一主键。
+activity_type 是('free_trial', 'paid', 'cancelled')中的一个。
+activity_duration 是用户当天在平台上花费的分钟数。
+每一行表示一个用户在特定日期的活动。
+
+ +

订阅服务想要分析用户行为模式。公司提供 7 天免费试用,试用结束后,用户可以选择订阅付费计划或取消。编写解决方案:

+ +
1. 查找从免费试用转为付费订阅的用户
2. 计算每位用户在 免费试用 期间的 平均每日活动时长(四舍五入至小数点后 2 位)
3. 计算每位用户在 付费 订阅期间的 平均每日活动时长(四舍五入到小数点后 2 位)
+ +

返回结果表以 user_id 升序 排序。

+ +

结果格式如下所示。

+ +

 

+ +

示例:

+ +
+

输入:

+ +

UserActivity 表:

+ +
++---------+---------------+---------------+-------------------+
+| user_id | activity_date | activity_type | activity_duration |
++---------+---------------+---------------+-------------------+
+| 1       | 2023-01-01    | free_trial    | 45                |
+| 1       | 2023-01-02    | free_trial    | 30                |
+| 1       | 2023-01-05    | free_trial    | 60                |
+| 1       | 2023-01-10    | paid          | 75                |
+| 1       | 2023-01-12    | paid          | 90                |
+| 1       | 2023-01-15    | paid          | 65                |
+| 2       | 2023-02-01    | free_trial    | 55                |
+| 2       | 2023-02-03    | free_trial    | 25                |
+| 2       | 2023-02-07    | free_trial    | 50                |
+| 2       | 2023-02-10    | cancelled     | 0                 |
+| 3       | 2023-03-05    | free_trial    | 70                |
+| 3       | 2023-03-06    | free_trial    | 60                |
+| 3       | 2023-03-08    | free_trial    | 80                |
+| 3       | 2023-03-12    | paid          | 50                |
+| 3       | 2023-03-15    | paid          | 55                |
+| 3       | 2023-03-20    | paid          | 85                |
+| 4       | 2023-04-01    | free_trial    | 40                |
+| 4       | 2023-04-03    | free_trial    | 35                |
+| 4       | 2023-04-05    | paid          | 45                |
+| 4       | 2023-04-07    | cancelled     | 0                 |
++---------+---------------+---------------+-------------------+
+
+ +

输出:

+ +
++---------+--------------------+-------------------+
+| user_id | trial_avg_duration | paid_avg_duration |
++---------+--------------------+-------------------+
+| 1       | 45.00              | 76.67             |
+| 3       | 70.00              | 63.33             |
+| 4       | 37.50              | 45.00             |
++---------+--------------------+-------------------+
+
+ +

解释:

+ +
- 用户 1:
    - 体验了 3 天免费试用,时长分别为 45,30 和 60 分钟。
    - 平均试用时长:(45 + 30 + 60) / 3 = 45.00 分钟。
    - 拥有 3 天付费订阅,时长分别为 75,90 和 65 分钟。
    - 平均花费时长:(75 + 90 + 65) / 3 = 76.67 分钟。
- 用户 2:
    - 体验了 3 天免费试用,时长分别为 55,25 和 50 分钟。
    - 平均试用时长:(55 + 25 + 50) / 3 = 43.33 分钟。
    - 没有转为付费订阅(只有 free_trial 和 cancelled 活动)。
    - 未包含在输出中,因为他未转换为付费用户。
- 用户 3:
    - 体验了 3 天免费试用,时长分别为 70,60 和 80 分钟。
    - 平均试用时长:(70 + 60 + 80) / 3 = 70.00 分钟。
    - 拥有 3 天付费订阅,时长分别为 50,55 和 85 分钟。
    - 平均花费时长:(50 + 55 + 85) / 3 = 63.33 分钟。
- 用户 4:
    - 体验了 2 天免费试用,时长分别为 40 和 35 分钟。
    - 平均试用时长:(40 + 35) / 2 = 37.50 分钟。
    - 在取消前有 1 天的付费订阅,时长为 45 分钟。
    - 平均花费时长:45.00 分钟。
+ +

结果表仅包括从免费试用转为付费订阅的用户(用户 1,3 和 4),并且以 user_id 升序排序。

+
+
+## 解法
+
+### 方法一:分组 + 条件筛选 + 等值连接
+
+我们首先将表中的数据进行筛选,找出所有 `activity_type` 不等于 `cancelled` 的数据,将数据按照 `user_id` 和 `activity_type` 进行分组,求得每组的平均时长 `duration`(四舍五入到小数点后 2 位),记录在表 `T` 中。
+
+接下来,我们从表 `T` 中筛选出 `activity_type` 为 `free_trial` 和 `paid` 的记录,分别记录在表 `F` 和 `P` 中,最后将这两张表按照 `user_id` 进行等值连接,并按照题目要求筛选出对应的字段并排序,得到最终结果。
+
+#### MySQL
+
+```sql
+# Write your MySQL query statement below
+WITH
+    T AS (
+        SELECT user_id, activity_type, ROUND(SUM(activity_duration) / COUNT(1), 2) duration
+        FROM UserActivity
+        WHERE activity_type != 'cancelled'
+        GROUP BY user_id, activity_type
+    ),
+    F AS (
+        SELECT user_id, duration trial_avg_duration
+        FROM T
+        WHERE activity_type = 'free_trial'
+    ),
+    P AS (
+        SELECT user_id, duration paid_avg_duration
+        FROM T
+        WHERE activity_type = 'paid'
+    )
+SELECT user_id, trial_avg_duration, paid_avg_duration
+FROM
+    F
+    JOIN P USING (user_id)
+ORDER BY 1;
+```
+
+#### Pandas
+
+```python
+import pandas as pd
+
+
+def analyze_subscription_conversion(user_activity: pd.DataFrame) -> pd.DataFrame:
+    df = user_activity[user_activity["activity_type"] != "cancelled"]
+
+    df_grouped = (
+        df.groupby(["user_id", "activity_type"])["activity_duration"]
+        .mean()
+        .add(0.0001)
+        .round(2)
+        .reset_index()
+    )
+
+    df_free_trial = (
+        df_grouped[df_grouped["activity_type"] == "free_trial"]
+        .rename(columns={"activity_duration": "trial_avg_duration"})
+        .drop(columns=["activity_type"])
+    )
+
+    df_paid = (
+        df_grouped[df_grouped["activity_type"] == "paid"]
+        .rename(columns={"activity_duration": "paid_avg_duration"})
+        .drop(columns=["activity_type"])
+    )
+
+    result = df_free_trial.merge(df_paid, on="user_id", how="inner").sort_values(
+        "user_id"
+    )
+
+    return result
+```
+
diff --git a/solution/3400-3499/3497.Analyze Subscription Conversion/README_EN.md b/solution/3400-3499/3497.Analyze Subscription Conversion/README_EN.md
new file mode 100644
index 0000000000000..3b84ac5beb349
--- /dev/null
+++ b/solution/3400-3499/3497.Analyze Subscription Conversion/README_EN.md
@@ -0,0 +1,219 @@
+---
+comments: true
+difficulty: Medium
+edit_url: https://github.com/doocs/leetcode/edit/main/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README_EN.md
+tags:
+    - Database
+---
+
+# [3497. Analyze Subscription Conversion](https://leetcode.com/problems/analyze-subscription-conversion)
+
+[中文文档](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README.md)
+
+## Description
+

Table: UserActivity

+ +
++------------------+---------+
+| Column Name      | Type    | 
++------------------+---------+
+| user_id          | int     |
+| activity_date    | date    |
+| activity_type    | varchar |
+| activity_duration| int     |
++------------------+---------+
+(user_id, activity_date, activity_type) is the unique key for this table.
+activity_type is one of ('free_trial', 'paid', 'cancelled').
+activity_duration is the number of minutes the user spent on the platform that day.
+Each row represents a user's activity on a specific date.
+
+ +

A subscription service wants to analyze user behavior patterns. The company offers a 7-day free trial, after which users can subscribe to a paid plan or cancel. Write a solution to:

+ +
1. Find users who converted from free trial to paid subscription
2. Calculate each user's average daily activity duration during their free trial period (rounded to 2 decimal places)
3. Calculate each user's average daily activity duration during their paid subscription period (rounded to 2 decimal places)
+ +

Return the result table ordered by user_id in ascending order.
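
The "rounded to 2 decimal places" requirement hides one detail worth a quick illustration. MySQL's `ROUND` rounds exact-value ties away from zero, whereas Python's built-in `round` rounds ties to the nearest even digit, so the two can disagree on an average that lands exactly on a half. The sketch below uses only the standard library; `round_half_up` is a hypothetical helper name, and the 1-minute-over-8-days input is an invented tie case, not data from the problem.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(total_minutes: int, days: int) -> Decimal:
    """Average minutes per day, rounded half-up to 2 decimal places."""
    mean = Decimal(total_minutes) / Decimal(days)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(round_half_up(230, 3))  # 76.67 (user 1's paid average from the example)
print(round_half_up(1, 8))    # 0.13  (exact tie 0.125 rounds up)
print(round(1 / 8, 2))        # 0.12  (built-in round sends the tie to the even digit)
```

None of the averages in the example below falls on such a tie, so either rounding rule reproduces the expected output there.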

+ +

The result format is in the following example.

+ +

 

+

Example:

+ +
+

Input:

+ +

UserActivity table:

+ +
++---------+---------------+---------------+-------------------+
+| user_id | activity_date | activity_type | activity_duration |
++---------+---------------+---------------+-------------------+
+| 1       | 2023-01-01    | free_trial    | 45                |
+| 1       | 2023-01-02    | free_trial    | 30                |
+| 1       | 2023-01-05    | free_trial    | 60                |
+| 1       | 2023-01-10    | paid          | 75                |
+| 1       | 2023-01-12    | paid          | 90                |
+| 1       | 2023-01-15    | paid          | 65                |
+| 2       | 2023-02-01    | free_trial    | 55                |
+| 2       | 2023-02-03    | free_trial    | 25                |
+| 2       | 2023-02-07    | free_trial    | 50                |
+| 2       | 2023-02-10    | cancelled     | 0                 |
+| 3       | 2023-03-05    | free_trial    | 70                |
+| 3       | 2023-03-06    | free_trial    | 60                |
+| 3       | 2023-03-08    | free_trial    | 80                |
+| 3       | 2023-03-12    | paid          | 50                |
+| 3       | 2023-03-15    | paid          | 55                |
+| 3       | 2023-03-20    | paid          | 85                |
+| 4       | 2023-04-01    | free_trial    | 40                |
+| 4       | 2023-04-03    | free_trial    | 35                |
+| 4       | 2023-04-05    | paid          | 45                |
+| 4       | 2023-04-07    | cancelled     | 0                 |
++---------+---------------+---------------+-------------------+
+
+ +

Output:

+ +
++---------+--------------------+-------------------+
+| user_id | trial_avg_duration | paid_avg_duration |
++---------+--------------------+-------------------+
+| 1       | 45.00              | 76.67             |
+| 3       | 70.00              | 63.33             |
+| 4       | 37.50              | 45.00             |
++---------+--------------------+-------------------+
+
+ +

Explanation:

+ +
- User 1:
    - Had 3 days of free trial with durations of 45, 30, and 60 minutes.
    - Average trial duration: (45 + 30 + 60) / 3 = 45.00 minutes.
    - Had 3 days of paid subscription with durations of 75, 90, and 65 minutes.
    - Average paid duration: (75 + 90 + 65) / 3 = 76.67 minutes.
- User 2:
    - Had 3 days of free trial with durations of 55, 25, and 50 minutes.
    - Average trial duration: (55 + 25 + 50) / 3 = 43.33 minutes.
    - Did not convert to a paid subscription (only had free_trial and cancelled activities).
    - Not included in the output because they didn't convert to paid.
- User 3:
    - Had 3 days of free trial with durations of 70, 60, and 80 minutes.
    - Average trial duration: (70 + 60 + 80) / 3 = 70.00 minutes.
    - Had 3 days of paid subscription with durations of 50, 55, and 85 minutes.
    - Average paid duration: (50 + 55 + 85) / 3 = 63.33 minutes.
- User 4:
    - Had 2 days of free trial with durations of 40 and 35 minutes.
    - Average trial duration: (40 + 35) / 2 = 37.50 minutes.
    - Had 1 day of paid subscription with duration of 45 minutes before cancelling.
    - Average paid duration: 45.00 minutes.
+ +

The result table only includes users who converted from free trial to paid subscription (users 1, 3, and 4), and is ordered by user_id in ascending order.
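
The walkthrough above can be checked with a short plain-Python sketch. It is not the solution from this patch: it uses no database or pandas, `ROWS` is the example table with the date column dropped, and `conversion_report` is a hypothetical helper name introduced only for illustration.

```python
from collections import defaultdict

# Rows copied from the example table: (user_id, activity_type, activity_duration).
ROWS = [
    (1, "free_trial", 45), (1, "free_trial", 30), (1, "free_trial", 60),
    (1, "paid", 75), (1, "paid", 90), (1, "paid", 65),
    (2, "free_trial", 55), (2, "free_trial", 25), (2, "free_trial", 50),
    (2, "cancelled", 0),
    (3, "free_trial", 70), (3, "free_trial", 60), (3, "free_trial", 80),
    (3, "paid", 50), (3, "paid", 55), (3, "paid", 85),
    (4, "free_trial", 40), (4, "free_trial", 35),
    (4, "paid", 45), (4, "cancelled", 0),
]


def conversion_report(rows):
    """Return (user_id, trial_avg, paid_avg) for users with both trial and paid rows."""
    minutes_by_group = defaultdict(list)  # (user_id, activity_type) -> durations
    for user_id, activity_type, minutes in rows:
        if activity_type != "cancelled":  # cancellations carry no usage time
            minutes_by_group[(user_id, activity_type)].append(minutes)

    report = []
    for user_id in sorted({uid for uid, _ in minutes_by_group}):
        trial = minutes_by_group.get((user_id, "free_trial"))
        paid = minutes_by_group.get((user_id, "paid"))
        if trial and paid:  # converted users only
            report.append((
                user_id,
                f"{sum(trial) / len(trial):.2f}",
                f"{sum(paid) / len(paid):.2f}",
            ))
    return report


for row in conversion_report(ROWS):
    print(*row)
```

The averages are formatted as two-decimal strings purely so the printed rows line up with the expected output table; user 2 is dropped because no `paid` group exists for them.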

+
+
+## Solutions
+
+### Solution 1: Grouping + Conditional Filtering + Equi-Join
+
+First, we filter the table to exclude all records whose `activity_type` is `cancelled`. Then we group the remaining rows by `user_id` and `activity_type`, compute each group's average duration rounded to 2 decimal places as `duration`, and store the results in table `T`.
+
+Next, we extract from table `T` the records whose `activity_type` is `free_trial` and `paid`, storing them in tables `F` and `P` respectively. Finally, we equi-join these two tables on `user_id`, select the columns the problem asks for, and sort the rows to produce the final output.
+
+#### MySQL
+
+```sql
+# Write your MySQL query statement below
+WITH
+    T AS (
+        SELECT user_id, activity_type, ROUND(SUM(activity_duration) / COUNT(1), 2) duration
+        FROM UserActivity
+        WHERE activity_type != 'cancelled'
+        GROUP BY user_id, activity_type
+    ),
+    F AS (
+        SELECT user_id, duration trial_avg_duration
+        FROM T
+        WHERE activity_type = 'free_trial'
+    ),
+    P AS (
+        SELECT user_id, duration paid_avg_duration
+        FROM T
+        WHERE activity_type = 'paid'
+    )
+SELECT user_id, trial_avg_duration, paid_avg_duration
+FROM
+    F
+    JOIN P USING (user_id)
+ORDER BY 1;
+```
+
+#### Pandas
+
+```python
+import pandas as pd
+
+
+def analyze_subscription_conversion(user_activity: pd.DataFrame) -> pd.DataFrame:
+    df = user_activity[user_activity["activity_type"] != "cancelled"]
+
+    df_grouped = (
+        df.groupby(["user_id", "activity_type"])["activity_duration"]
+        .mean()
+        .add(0.0001)
+        .round(2)
+        .reset_index()
+    )
+
+    df_free_trial = (
+        df_grouped[df_grouped["activity_type"] == "free_trial"]
+        .rename(columns={"activity_duration": "trial_avg_duration"})
+        .drop(columns=["activity_type"])
+    )
+
+    df_paid = (
+        df_grouped[df_grouped["activity_type"] == "paid"]
+        .rename(columns={"activity_duration": "paid_avg_duration"})
+        .drop(columns=["activity_type"])
+    )
+
+    result = df_free_trial.merge(df_paid, on="user_id", how="inner").sort_values(
+        "user_id"
+    )
+
+    return result
+```
+
diff --git a/solution/3400-3499/3497.Analyze Subscription Conversion/Solution.py b/solution/3400-3499/3497.Analyze Subscription Conversion/Solution.py
new file mode 100644
index 0000000000000..bb7de1f7e0247
--- /dev/null
+++ b/solution/3400-3499/3497.Analyze Subscription Conversion/Solution.py
@@ -0,0 +1,31 @@
+import pandas as pd
+
+
+def analyze_subscription_conversion(user_activity: pd.DataFrame) -> pd.DataFrame:
+    df = user_activity[user_activity["activity_type"] != "cancelled"]
+
+    df_grouped = (
+        df.groupby(["user_id", "activity_type"])["activity_duration"]
+        .mean()
+        .add(0.0001)
+        .round(2)
+        .reset_index()
+    )
+
+    df_free_trial = (
+        df_grouped[df_grouped["activity_type"] == "free_trial"]
+        .rename(columns={"activity_duration": "trial_avg_duration"})
+        .drop(columns=["activity_type"])
+    )
+
+    df_paid = (
+        df_grouped[df_grouped["activity_type"] == "paid"]
+        .rename(columns={"activity_duration": "paid_avg_duration"})
+        .drop(columns=["activity_type"])
+    )
+
+    result = df_free_trial.merge(df_paid, on="user_id", how="inner").sort_values(
+        "user_id"
+    )
+
+    return result
diff --git a/solution/3400-3499/3497.Analyze Subscription Conversion/Solution.sql b/solution/3400-3499/3497.Analyze Subscription Conversion/Solution.sql
new file mode 100644
index 0000000000000..0de6666e94df7
--- /dev/null
+++ b/solution/3400-3499/3497.Analyze Subscription Conversion/Solution.sql
@@ -0,0 +1,23 @@
+# Write your MySQL query statement below
+WITH
+    T AS (
+        SELECT user_id, activity_type, ROUND(SUM(activity_duration) / COUNT(1), 2) duration
+        FROM UserActivity
+        WHERE activity_type != 'cancelled'
+        GROUP BY user_id, activity_type
+    ),
+    F AS (
+        SELECT user_id, duration trial_avg_duration
+        FROM T
+        WHERE activity_type = 'free_trial'
+    ),
+    P AS (
+        SELECT user_id, duration paid_avg_duration
+        FROM T
+        WHERE activity_type = 'paid'
+    )
+SELECT user_id, trial_avg_duration, paid_avg_duration
+FROM
+    F
+    JOIN P USING (user_id)
+ORDER BY 1;
diff --git a/solution/DATABASE_README.md b/solution/DATABASE_README.md
index 287497d6574af..6d84fcdfa02dc 100644
--- a/solution/DATABASE_README.md
+++ b/solution/DATABASE_README.md
@@ -313,6 +313,7 @@
 | 3465 | [查找具有有效序列号的产品](/solution/3400-3499/3465.Find%20Products%20with%20Valid%20Serial%20Numbers/README.md) | `数据库` | 简单 | |
 | 3475 | [DNA 模式识别](/solution/3400-3499/3475.DNA%20Pattern%20Recognition/README.md) | | 中等 | |
 | 3482 | [分析组织层级](/solution/3400-3499/3482.Analyze%20Organization%20Hierarchy/README.md) | `数据库` | 困难 | |
+| 3497 | [分析订阅转化](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README.md) | | 中等 | |
 
 ## 版权
 
diff --git a/solution/DATABASE_README_EN.md b/solution/DATABASE_README_EN.md
index fce0e86019893..e2e8dff2e2d8e 100644
--- a/solution/DATABASE_README_EN.md
+++ b/solution/DATABASE_README_EN.md
@@ -311,6 +311,7 @@ Press Control + F(or Command + F on
 | 3465 | [Find Products with Valid Serial Numbers](/solution/3400-3499/3465.Find%20Products%20with%20Valid%20Serial%20Numbers/README_EN.md) | `Database` | Easy | |
 | 3475 | [DNA Pattern Recognition](/solution/3400-3499/3475.DNA%20Pattern%20Recognition/README_EN.md) | | Medium | |
 | 3482 | [Analyze Organization Hierarchy](/solution/3400-3499/3482.Analyze%20Organization%20Hierarchy/README_EN.md) | `Database` | Hard | |
+| 3497 | [Analyze Subscription Conversion](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README_EN.md) | | Medium | |
 
 ## Copyright
 
diff --git a/solution/README.md b/solution/README.md
index 23f5850dc2508..0c768406719c6 100644
--- a/solution/README.md
+++ b/solution/README.md
@@ -3507,6 +3507,7 @@
 | 3494 | [酿造药水需要的最少总时间](/solution/3400-3499/3494.Find%20the%20Minimum%20Amount%20of%20Time%20to%20Brew%20Potions/README.md) | `数组`,`前缀和`,`模拟` | 中等 | 第 442 场周赛 |
 | 3495 | [使数组元素都变为零的最少操作次数](/solution/3400-3499/3495.Minimum%20Operations%20to%20Make%20Array%20Elements%20Zero/README.md) | `位运算`,`数组`,`数学` | 困难 | 第 442 场周赛 |
 | 3496 | [最大化配对删除后的得分](/solution/3400-3499/3496.Maximize%20Score%20After%20Pair%20Deletions/README.md) | | 中等 | 🔒 |
+| 3497 | [分析订阅转化](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README.md) | | 中等 | |
 
 ## 版权
 
diff --git a/solution/README_EN.md b/solution/README_EN.md
index 1eb98bfc53173..9b14fd4ca08df 100644
--- a/solution/README_EN.md
+++ b/solution/README_EN.md
@@ -3505,6 +3505,7 @@ Press Control + F(or Command + F on
 | 3494 | [Find the Minimum Amount of Time to Brew Potions](/solution/3400-3499/3494.Find%20the%20Minimum%20Amount%20of%20Time%20to%20Brew%20Potions/README_EN.md) | `Array`,`Prefix Sum`,`Simulation` | Medium | Weekly Contest 442 |
 | 3495 | [Minimum Operations to Make Array Elements Zero](/solution/3400-3499/3495.Minimum%20Operations%20to%20Make%20Array%20Elements%20Zero/README_EN.md) | `Bit Manipulation`,`Array`,`Math` | Hard | Weekly Contest 442 |
 | 3496 | [Maximize Score After Pair Deletions](/solution/3400-3499/3496.Maximize%20Score%20After%20Pair%20Deletions/README_EN.md) | | Medium | 🔒 |
+| 3497 | [Analyze Subscription Conversion](/solution/3400-3499/3497.Analyze%20Subscription%20Conversion/README_EN.md) | | Medium | |
 
 ## Copyright
 