A list of open-access functional datasets from different fields of application. We collect only data that can be used for cluster analysis. The main objective is to facilitate comparison with existing clustering methods for functional data and the evaluation of new ones. A recent comprehensive review of clustering methods for functional data is available here. Our team is actively developing functional data clustering methods tailored to various data types and application domains. The software tools we have developed can be accessed here.
For datasets whose linked data need further processing, a copy is provided in the Data folder. (This is an ongoing project, and progress is slow because of the contributor's other commitments.)
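Many of the datasets listed here carry class labels, so one common way to evaluate a clustering method on them is to compare the estimated clusters with those labels. The sketch below computes the adjusted Rand index using only the Python standard library. The function name `adjusted_rand_index` and the toy label vectors are illustrative assumptions, not part of this project or of any listed dataset.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same curves."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0
    # Count pairs of curves that fall together in each contingency cell,
    # in each true class, and in each predicted cluster.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every curve in one group.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy labels, made up for illustration only.
truth = [0, 0, 0, 1, 1, 1]
perfect = [1, 1, 1, 0, 0, 0]  # same partition, different cluster names
mixed = [0, 0, 1, 1, 2, 2]    # partially wrong partition

ari_perfect = adjusted_rand_index(truth, perfect)
ari_mixed = adjusted_rand_index(truth, mixed)
```

The index is invariant to how clusters are named, which matters because a clustering method has no access to the original class names.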
Name | Available at | Field | Task | Size | Length | Missing Value |
---|---|---|---|---|---|---|
ARC_Mobile | Publisher | Health | Clustering | 125 | 30/40 | Yes |
ArrowHead | UEA & UCR Time Series Classification Repository | Computer Vision | Classification | 211 | 251 | No |
BirdChicken | UEA & UCR Time Series Classification Repository | Computer Vision | Classification | 40 | 512 | No |
BTH_PM25 | Publisher | Environment | Clustering | 73 | 48 | Yes |
China_PM25 | Publisher | Environment | Clustering | 338 | 731 | Yes |
DiatomSizeReduction | UEA & UCR Time Series Classification Repository | Bioinformatics | Classification | 322 | 345 | No |
ECG200 | UEA & UCR Time Series Classification Repository | ECG | Classification | 200 | 96 | No |
FaceFour | UEA & UCR Time Series Classification Repository | Computer Vision | Classification | 112 | 350 | No |
Flour | R (cfda) | Food | Classification | 115 | 241 | No |
Fungi | UEA & UCR ... | Bioinformatics | Classification | 204 | 201 | No |
GunPoint | UEA & UCR Time Series Classification Repository | Motion | Classification | 200 | 150 | No |
Meat | UEA & UCR Time Series Classification Repository | Food | Classification | 120 | 448 | No |
Plane | UEA & UCR ... | Shape | Classification | 210 | 144 | No |
Phoneme | e-Book (ElemStatLearn) | Speech | Classification | 4K+ | 256 | No |
Strawberry | UEA & UCR Time Series Classification Repository | Food | Classification | 983 | 235 | No |
Symbols | UEA & UCR Time Series Classification Repository | Computer Vision | Classification | 1K+ | 398 | No |
Tecator | CMU StatLib | Food | Classification | 240 | 100 | No |
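Several datasets in the table above are distributed as plain-text files with one curve per row. The following is a minimal sketch of reading such a file, assuming a tab-separated layout with the class label in the first column followed by the observed values. That layout is an assumption, so check the actual download first. The helper name `parse_label_first_rows` and the three-row sample are hypothetical.

```python
import io


def parse_label_first_rows(handle, sep="\t"):
    """Read curves from rows of the form: label, value_1, ..., value_T.

    Hypothetical helper; the row layout is an assumption about the file.
    """
    labels, curves = [], []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(sep)
        labels.append(fields[0])
        curves.append([float(v) for v in fields[1:]])
    return labels, curves


# Made-up three-curve sample standing in for a downloaded file.
sample = "1\t0.10\t0.25\t0.40\n2\t-0.30\t-0.10\t0.05\n1\t0.12\t0.20\t0.38\n"
labels, curves = parse_label_first_rows(io.StringIO(sample))
sizes = {len(curve) for curve in curves}  # distinct curve lengths found
```

For a clustering benchmark, the labels would be set aside during fitting and used only afterwards to score the result.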
Name | Available at | Field | Task | Size | Length | Dimension |
---|---|---|---|---|---|---|
BasicMotions | UEA & UCR Time Series Classification Repository | Motion | Classification | 80 | 100 | 6 |
Blink | UEA & UCR ... | EEG | Classification | 950 | 510 | 4 |
ECG_Arrhythmia | Publisher | ECG | Classification | 10K+ | 5000 | 12 |
EEG_Full | UCI Machine Learning Repository | EEG | Classification | 122 | 256 | 64 |
Epilepsy | UEA & UCR ... | Motion | Classification | 275 | 207 | 3 |
ERing | UEA & UCR ... | Gesture | Classification | 300 | 65 | 4 |
EyesOpenShut | UEA & UCR ... | EEG | Classification | 98 | 128 | 14 |
Japanese_Vowels | UCI Machine Learning Repository | Speech | Classification | 640 | 29 | 12 |
UWaveGestureLibrary | UEA & UCR ... | Gesture | Classification | 4K+ | 315 | 3 |

Name | Available at | Field | Task | Size | Length | Dimension |
---|---|---|---|---|---|---|
Hypersphere | Data/Manifold | Simulated | Clustering | ... | ... | ... |
Hyperbolic | Data/Manifold | Simulated | Clustering | ... | ... | ... |
Lorenz | Data/Manifold | Simulated | Clustering | ... | ... | ... |
Pendulum | Data/Manifold | Simulated | Clustering | ... | ... | ... |
Swiss_Roll | Data/Manifold | Simulated | Clustering | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... |

We list below a few popular repositories where you can find more functional datasets for cluster analysis.