Comprehensive analysis of BTCUSDT perpetual trades data #424

@terbed

Hello there!

I've analyzed the raw monthly trades data for the BTCUSDT perpetual contract from 2020 through 2025-05-31. I checked the continuity of the trade IDs and flagged every discontinuity that spans more than 1 minute. Please be aware that there are significant gaps:

Found 9 months with data integrity issues:

===================================================================
Month: 2021-02
=====================================================================
Missing data: 0.04%
First timestamp: 2021-02-01 00:00:00.867000
Last timestamp: 2021-02-28 23:59:59.994000
Number of discontinuities: 2

Discontinuities:
                                            0                           1
start_id                            517439800                   524621349
end_id                              517446365                   524632412
missing_ids                              6564                       11062
pre_gap_time_str   2021-02-20 23:56:32.855000  2021-02-22 23:54:24.405000
post_gap_time_str  2021-02-21 00:00:00.002000  2021-02-23 00:00:00.011000
time_interval_str      0 days 00:03:27.147000      0 days 00:05:35.606000

===================================================================
Month: 2022-02
=====================================================================
Missing data: 2.26%
First timestamp: 2022-02-01 00:00:00.036000
Last timestamp: 2022-02-28 23:59:59.970000
Number of discontinuities: 1

Discontinuities:
                                            0
start_id                           1928578020
end_id                             1930873349
missing_ids                           2295328
pre_gap_time_str   2022-02-14 07:28:04.552000
post_gap_time_str  2022-02-15 00:00:00.073000
time_interval_str      0 days 16:31:55.521000

===================================================================
Month: 2022-04
=====================================================================
Missing data: 0.02%
First timestamp: 2022-04-01 00:00:00.106000
Last timestamp: 2022-04-30 23:59:59.972000
-> Duplicated trades found.

===================================================================
Month: 2022-09
=====================================================================
Missing data: 0.26%
First timestamp: 2022-09-01 00:00:00.046000
Last timestamp: 2022-09-30 23:59:59.991000
Number of discontinuities: 1

Discontinuities:
                                            0
start_id                           2863401859
end_id                             2863691566
missing_ids                            289706
pre_gap_time_str   2022-09-21 22:42:40.073000
post_gap_time_str         2022-09-22 00:00:00
time_interval_str      0 days 01:17:19.927000

===================================================================
Month: 2022-10
=====================================================================
Missing data: 0.13%
First timestamp: 2022-10-01 00:00:00.077000
Last timestamp: 2022-10-31 23:59:59.953000
Number of discontinuities: 1

Discontinuities:
                                            0
start_id                           2985355209
end_id                             2985456075
missing_ids                            100865
pre_gap_time_str   2022-10-25 22:52:24.171000
post_gap_time_str  2022-10-26 00:00:00.140000
time_interval_str      0 days 01:07:35.969000

===================================================================
Month: 2022-11
=====================================================================
Missing data: 0.06%
First timestamp: 2022-11-01 00:00:00.079000
Last timestamp: 2022-11-30 23:59:59.954000
Number of discontinuities: 1

Discontinuities:
                                            0
start_id                           3010173658
end_id                             3010191331
missing_ids                             17672
pre_gap_time_str   2022-11-02 23:36:02.056000
post_gap_time_str  2022-11-03 00:00:00.120000
time_interval_str      0 days 00:23:58.064000

===================================================================
Month: 2023-02
=====================================================================
Missing data: 0.02%
First timestamp: 2023-02-01 00:00:03.361000
Last timestamp: 2023-02-28 23:59:59.951000
Number of discontinuities: 1

Discontinuities:
                                            0
start_id                           3252878318
end_id                             3252889108
missing_ids                             10789
pre_gap_time_str   2023-02-01 23:52:24.550000
post_gap_time_str  2023-02-02 00:00:00.006000
time_interval_str      0 days 00:07:35.456000

===================================================================
Month: 2023-03
=====================================================================
Missing data: 0.59%
First timestamp: 2023-03-01 00:00:00.053000
Last timestamp: 2023-03-31 23:59:59.974000
Number of discontinuities: 3

Discontinuities:
                                            0                           1                           2
start_id                           3403838556                  3414885362                  3422922336
end_id                             3403892686                  3415203953                  3423506983
missing_ids                             54129                      318590                      584646
pre_gap_time_str   2023-03-13 23:42:42.232000  2023-03-14 22:33:51.892000  2023-03-15 21:01:12.904000
post_gap_time_str  2023-03-14 00:00:00.009000  2023-03-15 00:00:00.042000  2023-03-16 00:00:00.139000
time_interval_str      0 days 00:17:17.777000      0 days 01:26:08.150000      0 days 02:58:47.235000

===================================================================
Month: 2023-11
=====================================================================
Missing data: 0.51%
First timestamp: 2023-11-01 00:00:05.027000
Last timestamp: 2023-11-30 23:59:59.558000
Number of discontinuities: 2

Discontinuities:
                                            0                           1
start_id                           4308908463                  4309400061
end_id                             4309326419                  4309407327
missing_ids                            417955                        7265
pre_gap_time_str   2023-11-21 02:43:24.004000  2023-11-21 10:29:53.003000
post_gap_time_str  2023-11-21 08:51:39.707000  2023-11-21 10:38:48.127000
time_interval_str      0 days 06:08:15.703000      0 days 00:08:55.124000
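
For anyone who wants to reproduce the ID-based check, it can be sketched roughly as follows. This is a minimal sketch and not necessarily the exact script behind the report above: the column names `id` and `time` (milliseconds since the Unix epoch) and the function name `find_id_gaps` are assumptions. The sample uses the two IDs of the first 2021-02 discontinuity with synthetic timestamps that reproduce its 3 min 27.147 s interval.

```python
import pandas as pd


def find_id_gaps(trades, min_gap=pd.Timedelta(minutes=1)):
    """List ID discontinuities whose time span exceeds min_gap.

    Assumed (hypothetical) columns: 'id' = integer trade ID,
    'time' = trade time in milliseconds since the Unix epoch.
    """
    df = trades.sort_values("id").reset_index(drop=True)
    ts = pd.to_datetime(df["time"], unit="ms")
    prev_id = df["id"].shift(1)
    prev_ts = ts.shift(1)
    # A step larger than 1 means at least one ID is absent.
    # The first row compares against NaN, which evaluates to False.
    is_gap = (df["id"] - prev_id) > 1
    gaps = pd.DataFrame({
        "start_id": prev_id[is_gap].astype("int64"),
        "end_id": df.loc[is_gap, "id"],
        "pre_gap_time": prev_ts[is_gap],
        "post_gap_time": ts[is_gap],
    })
    gaps["missing_ids"] = gaps["end_id"] - gaps["start_id"] - 1
    gaps["time_interval"] = gaps["post_gap_time"] - gaps["pre_gap_time"]
    # Keep only discontinuities that also span more than min_gap in time.
    return gaps[gaps["time_interval"] > min_gap].reset_index(drop=True)


# Synthetic sample: one large gap and one tiny gap (4 IDs, 10 ms).
base = 1_600_000_000_000
sample = pd.DataFrame({
    "id": [517439799, 517439800, 517446365, 517446366, 517446370],
    "time": [base, base + 500, base + 207_647, base + 207_700, base + 207_710],
})
gaps = find_id_gaps(sample)
print(gaps[["start_id", "end_id", "missing_ids", "time_interval"]])

# Duplicated trades (as flagged for 2022-04) can be detected separately.
has_duplicates = sample["id"].duplicated().any()
```

With this convention `missing_ids` is `end_id - start_id - 1`, which matches the figures in the report (517446365 - 517439800 - 1 = 6564). The tiny gap in the sample is dropped because it spans less than 1 minute.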

A second analysis, based on timestamp discontinuity alone, gives the following result for gaps greater than 1 minute (each entry is the timestamp of the first trade after the gap and the gap's duration):

{'/trades/2020-01': [],
 '/trades/2020-02': [],
 '/trades/2020-03': [],
 '/trades/2020-04': [(Timestamp('2020-04-21 17:23:02.498000'),
   Timedelta('0 days 00:01:00.127000'))],
 '/trades/2020-05': [],
 '/trades/2020-06': [],
 '/trades/2020-07': [],
 '/trades/2020-08': [],
 '/trades/2020-09': [(Timestamp('2020-09-27 11:15:29.047000'),
   Timedelta('0 days 00:03:06.480000')),
  (Timestamp('2020-09-27 11:19:29.322000'),
   Timedelta('0 days 00:01:04.580000')),
  (Timestamp('2020-09-27 11:23:55.939000'),
   Timedelta('0 days 00:01:16.696000')),
  (Timestamp('2020-09-27 11:27:22.012000'),
   Timedelta('0 days 00:01:01.368000')),
  (Timestamp('2020-09-27 11:29:48.269000'),
   Timedelta('0 days 00:01:07.384000'))],
 '/trades/2020-10': [],
 '/trades/2020-11': [],
 '/trades/2020-12': [],
 '/trades/2021-01': [],
 '/trades/2021-02': [(Timestamp('2021-02-21 00:00:00.002000'),
   Timedelta('0 days 00:03:27.147000')),
  (Timestamp('2021-02-23 00:00:00.011000'),
   Timedelta('0 days 00:05:35.606000'))],
 '/trades/2021-03': [(Timestamp('2021-03-02 02:00:00.012000'),
   Timedelta('0 days 00:59:51.398000'))],
 '/trades/2021-04': [],
 '/trades/2021-05': [],
 '/trades/2021-06': [],
 '/trades/2021-07': [],
 '/trades/2021-08': [],
 '/trades/2021-09': [],
 '/trades/2021-10': [],
 '/trades/2021-11': [],
 '/trades/2021-12': [],
 '/trades/2022-01': [],
 '/trades/2022-02': [(Timestamp('2022-02-15 00:00:00.073000'),
   Timedelta('0 days 16:31:55.521000'))],
 '/trades/2022-03': [],
 '/trades/2022-04': [],
 '/trades/2022-05': [(Timestamp('2022-05-01 22:55:10.707000'),
   Timedelta('0 days 00:29:33.304000')),
  (Timestamp('2022-05-28 17:15:00.492000'),
   Timedelta('0 days 00:35:23.272000'))],
 '/trades/2022-06': [],
 '/trades/2022-07': [],
 '/trades/2022-08': [],
 '/trades/2022-09': [(Timestamp('2022-09-22 00:00:00'),
   Timedelta('0 days 01:17:19.927000'))],
 '/trades/2022-10': [(Timestamp('2022-10-26 00:00:00.140000'),
   Timedelta('0 days 01:07:35.969000'))],
 '/trades/2022-11': [(Timestamp('2022-11-03 00:00:00.120000'),
   Timedelta('0 days 00:23:58.064000'))],
 '/trades/2022-12': [],
 '/trades/2023-01': [],
 '/trades/2023-02': [(Timestamp('2023-02-02 00:00:00.006000'),
   Timedelta('0 days 00:07:35.456000'))],
 '/trades/2023-03': [(Timestamp('2023-03-14 00:00:00.009000'),
   Timedelta('0 days 00:17:17.777000')),
  (Timestamp('2023-03-15 00:00:00.042000'),
   Timedelta('0 days 01:26:08.150000')),
  (Timestamp('2023-03-16 00:00:00.139000'),
   Timedelta('0 days 02:58:47.235000'))],
 '/trades/2023-04': [],
 '/trades/2023-05': [],
 '/trades/2023-06': [],
 '/trades/2023-07': [],
 '/trades/2023-08': [],
 '/trades/2023-09': [(Timestamp('2023-09-12 08:53:39.137000'),
   Timedelta('0 days 00:19:50.856000'))],
 '/trades/2023-10': [],
 '/trades/2023-11': [(Timestamp('2023-11-21 08:51:39.707000'),
   Timedelta('0 days 06:08:15.703000')),
  (Timestamp('2023-11-21 10:38:48.127000'),
   Timedelta('0 days 00:08:55.124000'))],
 '/trades/2023-12': [],
 '/trades/2024-01': [],
 '/trades/2024-02': [],
 '/trades/2024-03': [],
 '/trades/2024-04': [],
 '/trades/2024-05': [],
 '/trades/2024-06': [],
 '/trades/2024-07': [],
 '/trades/2024-08': [],
 '/trades/2024-09': [],
 '/trades/2024-10': [(Timestamp('2024-10-28 16:35:13.575000'),
   Timedelta('0 days 00:14:30.392000')),
  (Timestamp('2024-10-28 16:37:07.597000'),
   Timedelta('0 days 00:01:22.471000'))],
 '/trades/2024-11': [],
 '/trades/2024-12': [],
 '/trades/2025-01': [],
 '/trades/2025-02': [],
 '/trades/2025-03': [],
 '/trades/2025-04': [],
 '/trades/2025-05': []}
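
A timestamp-based check of this kind can be sketched as below. Again this is only a sketch under assumptions: the `time` column name, the millisecond unit and the function name `find_time_gaps` are illustrative and not taken from the original script. It returns `(timestamp after the gap, gap duration)` pairs in the same shape as the listing above.

```python
import pandas as pd


def find_time_gaps(trades, min_gap=pd.Timedelta(minutes=1)):
    """Return (post-gap timestamp, gap length) for gaps longer than min_gap.

    Assumed (hypothetical) column: 'time' in milliseconds since the epoch.
    """
    ts = pd.to_datetime(trades["time"], unit="ms")
    ts = ts.sort_values().reset_index(drop=True)
    delta = ts.diff()  # time since the previous trade; NaT for the first row
    mask = delta > min_gap  # NaT compares as False
    return list(zip(ts[mask], delta[mask]))


# Synthetic sample: trades at 0 s, 30 s, then a 60.127 s pause.
sample_times = pd.DataFrame({"time": [0, 30_000, 90_127, 90_500]})
time_gaps = find_time_gaps(sample_times)
print(time_gaps)
```

Because this check looks only at the time between consecutive trades, it also flags quiet periods in which no IDs are missing.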

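The timestamp-based results are more concerning, as they show discontinuities in months where the ID-based check found none (2020-04, 2020-09, 2021-03, 2022-05, 2023-09 and 2024-10). In those months the trade IDs are continuous, so no trades are missing from the files, yet there are intervals of up to about an hour with no trades at all.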
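
To make the comparison between the two reports concrete, the months flagged only by the timestamp check can be listed with a set difference. The month lists are transcribed from the two outputs above; the variable names are illustrative. 2022-04 is left out of the ID list because it reports duplicated trades rather than a discontinuity.

```python
# Months with at least one ID discontinuity longer than 1 minute.
id_gap_months = {
    "2021-02", "2022-02", "2022-09", "2022-10",
    "2022-11", "2023-02", "2023-03", "2023-11",
}

# Months with at least one timestamp gap longer than 1 minute.
time_gap_months = {
    "2020-04", "2020-09", "2021-02", "2021-03", "2022-02",
    "2022-05", "2022-09", "2022-10", "2022-11", "2023-02",
    "2023-03", "2023-09", "2023-11", "2024-10",
}

# Months where time jumps although no IDs are missing.
only_in_time = sorted(time_gap_months - id_gap_months)
print(only_in_time)
# → ['2020-04', '2020-09', '2021-03', '2022-05', '2023-09', '2024-10']

# Every month with an ID gap also shows up in the timestamp check.
assert id_gap_months <= time_gap_months
```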

P.S.:
I've validated the downloaded data with checksums.
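
A checksum validation of this kind might look like the sketch below. It assumes SHA-256 digests stored in a sidecar file whose first whitespace-separated token is the hex digest; both the algorithm and the file layout are assumptions here, as are the function names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(data_path, checksum_path):
    """Compare a file's SHA-256 with the digest in its sidecar file.

    Assumed sidecar layout: '<hex digest>  <file name>'.
    """
    expected = Path(checksum_path).read_text().split()[0].lower()
    return sha256_of(data_path) == expected
```

A file that fails this comparison should be downloaded again before it is included in any gap analysis.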
