Contents
This page has additional resources for the paper entitled "CID: An Efficient Complexity-Invariant Distance for Time Series", accepted by the Data Mining and Knowledge Discovery journal. A previous version of this paper was published at the SIAM Conference on Data Mining (SDM 2011). The resources available here include tables with detailed experimental results, the source code necessary to replicate our experiments, all data used in the experimental evaluation, and some supplemental material not included in the paper.
All material available on this website is password protected. Please contact me to obtain the archive password.
Detailed Experimental Results
We provide detailed numerical results (omitted from the paper for brevity) in spreadsheet format (MS Excel). These results include accuracy and Rand index values (per data set) for the comparisons of CID with the Euclidean and DTW distances. We also include detailed numerical results comparing nine methods for estimating the complexity of time series, as well as runtime results comparing the CID, Euclidean and DTW distances.
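For readers who want a quick reminder of what is being compared, the following is a minimal sketch of CID as it is usually stated: the Euclidean distance multiplied by a complexity correction factor, where the complexity of a series is estimated as the root of the summed squared differences between consecutive points. It is an illustration only, not the MATLAB code distributed on this page. The function names and the handling of constant (zero-complexity) series are assumptions made for this example.

```python
import math


def complexity_estimate(series):
    """Complexity estimate: sqrt of the sum of squared consecutive differences."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def euclidean_distance(q, c):
    """Plain Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def cid_distance(q, c):
    """Euclidean distance scaled by the ratio of the larger to the smaller complexity."""
    ed = euclidean_distance(q, c)
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if low == 0.0:
        # Assumption for this sketch: two constant series get a correction
        # factor of 1; one constant series against a non-constant one is
        # treated as infinitely far apart.
        return ed if high == 0.0 else math.inf
    return ed * (high / low)


if __name__ == "__main__":
    q = [0.0, 1.0, 0.0, 1.0, 0.0]
    c = [0.0, 0.5, 0.0, 0.5, 0.0]
    print(euclidean_distance(q, c), cid_distance(q, c))
```

Because the correction factor is always at least 1, CID is never smaller than the Euclidean distance, and it grows as the two series differ more in complexity.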
Source Code
We provide the source code to replicate all figures with CID evaluation in the paper. We have designed all experiments to be not only reproducible, but easily reproducible. To that end, we prepared a slide presentation file (MS PowerPoint) in which each slide contains the source code to reproduce a different figure in our paper. Simply copy and paste the code into a text editor, save it with the name provided on the slide, and run it in MATLAB. Please note that some scripts need access to some of the data provided in the next section.
Data
Here is all the data we used in the paper:
- All 43 data sets used in experimental comparisons of CID. File size: 85.5MB. This spreadsheet file (MS Excel) has a brief description of each data set;
- The geometric figures used to illustrate CID;
- The leaves data also used to illustrate CID;
- The lower bounding data to show that CID can be effectively indexed with embedded techniques.
Supplemental Material
We also prepared a slide presentation file (MS PowerPoint) with an explanation of our results.
In Section 2.2, we discuss that some problems may require multiple invariances. This code and data are necessary to reproduce those experiments. The code requires a GPU, and we thank Abdullah Al Mueen for helping us. The interested reader may consult the paper "Accelerating Dynamic Time Warping Subsequence Search with GPUs and FPGAs" by Doruk Sart, Abdullah Mueen, Walid Najjar, Vit Niennattrakul, and Eamonn Keogh for further details on the application of GPUs to speed up DTW calculation.