Hello there!
I've analyzed the `BTCUSDT PERP` raw monthly trades data from `2020` to `2025-05-31`. I examined the continuity of the trade `id` and flagged every jump in `id` that coincides with a considerable time gap (larger than 1 minute). Please be aware that there are significant gaps:
Found 9 months with data integrity issues:
===================================================================
Month: 2021-02
=====================================================================
Missing data: 0.04%
First timestamp: 2021-02-01 00:00:00.867000
Last timestamp: 2021-02-28 23:59:59.994000
Number of discontinuities: 2
Discontinuities:
0 1
start_id 517439800 524621349
end_id 517446365 524632412
missing_ids 6564 11062
pre_gap_time_str 2021-02-20 23:56:32.855000 2021-02-22 23:54:24.405000
post_gap_time_str 2021-02-21 00:00:00.002000 2021-02-23 00:00:00.011000
time_interval_str 0 days 00:03:27.147000 0 days 00:05:35.606000
===================================================================
Month: 2022-02
=====================================================================
Missing data: 2.26%
First timestamp: 2022-02-01 00:00:00.036000
Last timestamp: 2022-02-28 23:59:59.970000
Number of discontinuities: 1
Discontinuities:
0
start_id 1928578020
end_id 1930873349
missing_ids 2295328
pre_gap_time_str 2022-02-14 07:28:04.552000
post_gap_time_str 2022-02-15 00:00:00.073000
time_interval_str 0 days 16:31:55.521000
===================================================================
Month: 2022-04
=====================================================================
Missing data: 0.02%
First timestamp: 2022-04-01 00:00:00.106000
Last timestamp: 2022-04-30 23:59:59.972000
-> Duplicated trades found.
===================================================================
Month: 2022-09
=====================================================================
Missing data: 0.26%
First timestamp: 2022-09-01 00:00:00.046000
Last timestamp: 2022-09-30 23:59:59.991000
Number of discontinuities: 1
Discontinuities:
0
start_id 2863401859
end_id 2863691566
missing_ids 289706
pre_gap_time_str 2022-09-21 22:42:40.073000
post_gap_time_str 2022-09-22 00:00:00
time_interval_str 0 days 01:17:19.927000
===================================================================
Month: 2022-10
=====================================================================
Missing data: 0.13%
First timestamp: 2022-10-01 00:00:00.077000
Last timestamp: 2022-10-31 23:59:59.953000
Number of discontinuities: 1
Discontinuities:
0
start_id 2985355209
end_id 2985456075
missing_ids 100865
pre_gap_time_str 2022-10-25 22:52:24.171000
post_gap_time_str 2022-10-26 00:00:00.140000
time_interval_str 0 days 01:07:35.969000
===================================================================
Month: 2022-11
=====================================================================
Missing data: 0.06%
First timestamp: 2022-11-01 00:00:00.079000
Last timestamp: 2022-11-30 23:59:59.954000
Number of discontinuities: 1
Discontinuities:
0
start_id 3010173658
end_id 3010191331
missing_ids 17672
pre_gap_time_str 2022-11-02 23:36:02.056000
post_gap_time_str 2022-11-03 00:00:00.120000
time_interval_str 0 days 00:23:58.064000
===================================================================
Month: 2023-02
=====================================================================
Missing data: 0.02%
First timestamp: 2023-02-01 00:00:03.361000
Last timestamp: 2023-02-28 23:59:59.951000
Number of discontinuities: 1
Discontinuities:
0
start_id 3252878318
end_id 3252889108
missing_ids 10789
pre_gap_time_str 2023-02-01 23:52:24.550000
post_gap_time_str 2023-02-02 00:00:00.006000
time_interval_str 0 days 00:07:35.456000
===================================================================
Month: 2023-03
=====================================================================
Missing data: 0.59%
First timestamp: 2023-03-01 00:00:00.053000
Last timestamp: 2023-03-31 23:59:59.974000
Number of discontinuities: 3
Discontinuities:
0 1 \
start_id 3403838556 3414885362
end_id 3403892686 3415203953
missing_ids 54129 318590
pre_gap_time_str 2023-03-13 23:42:42.232000 2023-03-14 22:33:51.892000
post_gap_time_str 2023-03-14 00:00:00.009000 2023-03-15 00:00:00.042000
time_interval_str 0 days 00:17:17.777000 0 days 01:26:08.150000
2
start_id 3422922336
end_id 3423506983
missing_ids 584646
pre_gap_time_str 2023-03-15 21:01:12.904000
post_gap_time_str 2023-03-16 00:00:00.139000
time_interval_str 0 days 02:58:47.235000
===================================================================
Month: 2023-11
=====================================================================
Missing data: 0.51%
First timestamp: 2023-11-01 00:00:05.027000
Last timestamp: 2023-11-30 23:59:59.558000
Number of discontinuities: 2
Discontinuities:
0 1
start_id 4308908463 4309400061
end_id 4309326419 4309407327
missing_ids 417955 7265
pre_gap_time_str 2023-11-21 02:43:24.004000 2023-11-21 10:29:53.003000
post_gap_time_str 2023-11-21 08:51:39.707000 2023-11-21 10:38:48.127000
time_interval_str 0 days 06:08:15.703000 0 days 00:08:55.124000
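For anyone who wants to reproduce the ID-based check, here is a minimal sketch of how it could be done with pandas. It is not the exact script behind the report above. The function name `find_id_gaps` and the column names `id` and `time` are assumptions for illustration, and `time` is assumed to be already parsed to a datetime dtype. As in the report, `missing_ids` is `end_id - start_id - 1`, and only ID jumps spanning more than `min_gap` are kept.

```python
import pandas as pd


def find_id_gaps(trades: pd.DataFrame,
                 min_gap: pd.Timedelta = pd.Timedelta(minutes=1)) -> pd.DataFrame:
    """Return one row per jump in trade id whose time span exceeds min_gap.

    Assumes `trades` has an integer `id` column and a datetime `time` column
    (hypothetical names; adjust to the actual file layout).
    """
    t = trades.sort_values("id").reset_index(drop=True)
    prev_id = t["id"].shift(1)      # id of the trade just before each row
    prev_time = t["time"].shift(1)  # time of the trade just before each row

    gaps = pd.DataFrame({
        "start_id": prev_id,
        "end_id": t["id"],
        "missing_ids": t["id"] - prev_id - 1,
        "pre_gap_time": prev_time,
        "post_gap_time": t["time"],
        "time_interval": t["time"] - prev_time,
    })
    # Keep only rows where ids were skipped AND the time jump is large.
    # The first row has NaN/NaT neighbours, so it is dropped by the comparison.
    gaps = gaps[(gaps["missing_ids"] > 0) & (gaps["time_interval"] > min_gap)]
    return gaps.astype({"start_id": "int64", "missing_ids": "int64"}).reset_index(drop=True)
```

Calling `find_id_gaps(df).T` on one month of trades should give a layout similar to the "Discontinuities" tables above, with one column per gap.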
A second analysis, based on timestamp discontinuity alone, gives the following result for gaps greater than 1 minute:
{'/trades/2020-01': [],
'/trades/2020-02': [],
'/trades/2020-03': [],
'/trades/2020-04': [(Timestamp('2020-04-21 17:23:02.498000'),
Timedelta('0 days 00:01:00.127000'))],
'/trades/2020-05': [],
'/trades/2020-06': [],
'/trades/2020-07': [],
'/trades/2020-08': [],
'/trades/2020-09': [(Timestamp('2020-09-27 11:15:29.047000'),
Timedelta('0 days 00:03:06.480000')),
(Timestamp('2020-09-27 11:19:29.322000'),
Timedelta('0 days 00:01:04.580000')),
(Timestamp('2020-09-27 11:23:55.939000'),
Timedelta('0 days 00:01:16.696000')),
(Timestamp('2020-09-27 11:27:22.012000'),
Timedelta('0 days 00:01:01.368000')),
(Timestamp('2020-09-27 11:29:48.269000'),
Timedelta('0 days 00:01:07.384000'))],
'/trades/2020-10': [],
'/trades/2020-11': [],
'/trades/2020-12': [],
'/trades/2021-01': [],
'/trades/2021-02': [(Timestamp('2021-02-21 00:00:00.002000'),
Timedelta('0 days 00:03:27.147000')),
(Timestamp('2021-02-23 00:00:00.011000'),
Timedelta('0 days 00:05:35.606000'))],
'/trades/2021-03': [(Timestamp('2021-03-02 02:00:00.012000'),
Timedelta('0 days 00:59:51.398000'))],
'/trades/2021-04': [],
'/trades/2021-05': [],
'/trades/2021-06': [],
'/trades/2021-07': [],
'/trades/2021-08': [],
'/trades/2021-09': [],
'/trades/2021-10': [],
'/trades/2021-11': [],
'/trades/2021-12': [],
'/trades/2022-01': [],
'/trades/2022-02': [(Timestamp('2022-02-15 00:00:00.073000'),
Timedelta('0 days 16:31:55.521000'))],
'/trades/2022-03': [],
'/trades/2022-04': [],
'/trades/2022-05': [(Timestamp('2022-05-01 22:55:10.707000'),
Timedelta('0 days 00:29:33.304000')),
(Timestamp('2022-05-28 17:15:00.492000'),
Timedelta('0 days 00:35:23.272000'))],
'/trades/2022-06': [],
'/trades/2022-07': [],
'/trades/2022-08': [],
'/trades/2022-09': [(Timestamp('2022-09-22 00:00:00'),
Timedelta('0 days 01:17:19.927000'))],
'/trades/2022-10': [(Timestamp('2022-10-26 00:00:00.140000'),
Timedelta('0 days 01:07:35.969000'))],
'/trades/2022-11': [(Timestamp('2022-11-03 00:00:00.120000'),
Timedelta('0 days 00:23:58.064000'))],
'/trades/2022-12': [],
'/trades/2023-01': [],
'/trades/2023-02': [(Timestamp('2023-02-02 00:00:00.006000'),
Timedelta('0 days 00:07:35.456000'))],
'/trades/2023-03': [(Timestamp('2023-03-14 00:00:00.009000'),
Timedelta('0 days 00:17:17.777000')),
(Timestamp('2023-03-15 00:00:00.042000'),
Timedelta('0 days 01:26:08.150000')),
(Timestamp('2023-03-16 00:00:00.139000'),
Timedelta('0 days 02:58:47.235000'))],
'/trades/2023-04': [],
'/trades/2023-05': [],
'/trades/2023-06': [],
'/trades/2023-07': [],
'/trades/2023-08': [],
'/trades/2023-09': [(Timestamp('2023-09-12 08:53:39.137000'),
Timedelta('0 days 00:19:50.856000'))],
'/trades/2023-10': [],
'/trades/2023-11': [(Timestamp('2023-11-21 08:51:39.707000'),
Timedelta('0 days 06:08:15.703000')),
(Timestamp('2023-11-21 10:38:48.127000'),
Timedelta('0 days 00:08:55.124000'))],
'/trades/2023-12': [],
'/trades/2024-01': [],
'/trades/2024-02': [],
'/trades/2024-03': [],
'/trades/2024-04': [],
'/trades/2024-05': [],
'/trades/2024-06': [],
'/trades/2024-07': [],
'/trades/2024-08': [],
'/trades/2024-09': [],
'/trades/2024-10': [(Timestamp('2024-10-28 16:35:13.575000'),
Timedelta('0 days 00:14:30.392000')),
(Timestamp('2024-10-28 16:37:07.597000'),
Timedelta('0 days 00:01:22.471000'))],
'/trades/2024-11': [],
'/trades/2024-12': [],
'/trades/2025-01': [],
'/trades/2025-02': [],
'/trades/2025-03': [],
'/trades/2025-04': [],
'/trades/2025-05': []}
The latter results are more concerning, as they show more discontinuities than the ID-based approach. This means that in those cases there is a gap in trade timestamps even though no trade IDs are missing.
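The timestamp-based check can be sketched in a similar way. This is an illustration under assumptions rather than the exact code used: `find_time_gaps` and `time_only_gaps` are hypothetical helper names, and the input is assumed to be a pandas Series of parsed trade timestamps. Each result is a `(post-gap timestamp, gap length)` pair, matching the tuples in the dictionary above. The second helper keeps only the gaps that have no matching ID discontinuity.

```python
import pandas as pd


def find_time_gaps(times: pd.Series,
                   min_gap: pd.Timedelta = pd.Timedelta(minutes=1)) -> list:
    """Return (timestamp after the gap, gap length) for every gap > min_gap."""
    ordered = times.sort_values().reset_index(drop=True)
    delta = ordered.diff()   # time elapsed since the previous trade
    mask = delta > min_gap   # NaT on the first row compares as False
    return list(zip(ordered[mask], delta[mask]))


def time_only_gaps(time_gaps: list, id_gap_post_times) -> list:
    """Keep time gaps whose post-gap timestamp is not also an ID gap.

    `id_gap_post_times` is any iterable of post-gap timestamps taken from
    an ID-based check (an assumption about how the two results are joined).
    """
    known = set(id_gap_post_times)
    return [gap for gap in time_gaps if gap[0] not in known]
```

For example, the `2020-04`, `2020-09`, `2021-03`, `2022-05`, `2023-09` and `2024-10` entries above appear only in the timestamp-based result, so they would be returned by `time_only_gaps`.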
P.S.: I've validated the downloaded data with checksums.