Hacker News | eamonnkeogh's comments

The Matrix Profile, is there anything it can't do? ;-)


When we wrote "Clustering of time-series subsequences is meaningless", it took six attempts to get it published.

One reviewer wrote (word for word): "you will get in bad trouble if you publish this."


Dated now, but [a]

[a] Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. David Sankoff and Joseph Kruskal, with an introduction by John Nerbonne.


Fragments of it I found look promising. Thank you!


Thank you for your kind words ;-)


The MP is so efficient that you can test ALL window lengths at once! This is called MADRID [a].

[a] Matrix Profile XXX: MADRID: A Hyper-Anytime Algorithm to Find Time Series Anomalies of all Lengths. Yue Lu, Thirumalai Vinjamoor Akhil Srinivas, Takaaki Nakamura, Makoto Imamura, and Eamonn Keogh. ICDM 2023.
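To make "test all window lengths" concrete, here is a brute-force sketch in Python. It is emphatically not MADRID: it just loops over a handful of window lengths and finds the top discord (the subsequence farthest from its nearest non-overlapping neighbor) for each one. The function name `top_discord`, the synthetic sine data, and the division by sqrt(m) to make scores roughly comparable across lengths are all assumptions made for illustration.

```python
import numpy as np

def top_discord(ts, m):
    """Brute force: (length-normalized score, start index) of the top discord of length m."""
    ts = np.asarray(ts, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(ts, m)
    mu = windows.mean(axis=1, keepdims=True)
    sd = windows.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0                      # guard against constant windows
    z = (windows - mu) / sd                # z-normalize every subsequence
    best_score, best_loc = -np.inf, -1
    for i in range(len(z)):
        d = np.linalg.norm(z - z[i], axis=1)
        d[max(0, i - m + 1): i + m] = np.inf   # ignore overlapping (trivial) matches
        nearest = d.min()
        if np.isfinite(nearest) and nearest > best_score:
            best_score, best_loc = nearest, i
    # Assumption: dividing by sqrt(m) is one simple way to compare across lengths.
    return best_score / np.sqrt(m), best_loc

# Synthetic example: a clean sine wave with a bump planted at samples 200..209.
t = np.arange(400)
ts = np.sin(2 * np.pi * t / 25)
ts[200:210] += 1.5

results = {m: top_discord(ts, m) for m in range(16, 49, 8)}
for m, (score, loc) in results.items():
    print(f"m={m:2d}  discord starts at {loc}  normalized score={score:.3f}")
```

On this toy data every window length should point at a subsequence that overlaps the planted bump. The loop costs quadratic time per window length, which is exactly the expense that a purpose-built algorithm is meant to avoid.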


Thanks for your kind words (I am one of the authors). For those interested, there is an expanded review/critique of datasets here [a].

There is also a video here [b].

[a] https://www.dropbox.com/scl/fi/cwduv5idkwx9ci328nfpy/Problem...

[b] https://www.youtube.com/watch?v=Vg1p3DouX8w&


I am currently reading your article with great interest, as I am about to embark on implementing time series anomaly detection algorithm(s) for predictive maintenance applications. I would like to avoid unnecessarily complex algorithms that are unlikely to provide real, practical benefits.


Please let me know if I can help.

A paper published today finds an anomaly in an "InternalBleeding" dataset, after setting eighteen parameters [a].

Could we find the anomaly with a completely parameter-free algorithm? As the figure below shows, the answer is YES, if you use MADRID [b].

One line of code: >> MADRID(UCRAnomalyInternalBleeding)

So yes, simple is better.

[a] Learning Rate, Dropout Rate, Dim Feedforward, Batch Size, Encoder Layers, Decoder Layers, Activation Func, Time Warping, Time Masking, Gaussian Noise, Linear Embedding, Phase Type, Self Conditioning, Layer Norm, Pos. Enc. Type, FFN Layers, Window Size.

[b] https://www.dropbox.com/scl/fi/hd9gt0xs8v8mrsx3upwd3/ICDM23_...


It is possible that two occurrences of the same motif can overlap. It is also possible that two different motifs can overlap. Let's see both cases, in string analogs. We will start with the second case, using an example from John Cleese…

“…itself…and hence the very meaning of life itselfish bastard, I'll kick him… selfish…”

Here there is a motif “itself” and there is a motif “selfish”. Note that one occurrence of each motif appears overlapping in “itselfish”.

Now for the first case:

“…soihsehihrhewCOMICOMICireoqiwwherhqwe…”

Here we have a motif “COMIC”, but its two occurrences share a letter, the central ‘C’. We can allow occurrences of a motif to share more letters, but they cannot share ALL letters; that would be a trivial match.
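The two string cases can be checked mechanically. A minimal sketch in Python, where the helper names `occurrences` and `shared_letters` are made up for illustration and the strings are trimmed versions of the examples above:

```python
import re

def occurrences(text, motif):
    """Start index of every occurrence of motif, overlapping ones included."""
    # A zero-width lookahead lets successive matches overlap each other.
    return [m.start() for m in re.finditer("(?=" + re.escape(motif) + ")", text)]

def shared_letters(start_a, len_a, start_b, len_b):
    """Number of positions that two occurrences have in common."""
    return max(0, min(start_a + len_a, start_b + len_b) - max(start_a, start_b))

# First case: two occurrences of the SAME motif overlap.
first_case = "soihsehihrhewCOMICOMICireoqiwwherhqwe"
comic = occurrences(first_case, "COMIC")
comic_shared = shared_letters(comic[0], 5, comic[1], 5)

# Second case: occurrences of two DIFFERENT motifs overlap.
second_case = "the very meaning of life itselfish bastard"
itself = occurrences(second_case, "itself")
selfish = occurrences(second_case, "selfish")
mixed_shared = shared_letters(itself[0], 6, selfish[0], 7)

print("COMIC occurrences:", comic, "shared letters:", comic_shared)
print("itself/selfish shared letters:", mixed_shared)
```

The two “COMIC” occurrences share exactly one position (the central ‘C’), while “itself” and “selfish” share the four letters of “self”.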

The matrix profile has a simple parameter (the exclusion zone) that lets you control how much overlap you want to allow.
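To show what the exclusion zone does, here is a deliberately naive matrix profile in Python. It is a sketch, not how any real library computes it: the function name `naive_matrix_profile`, the default of m // 2 for the zone, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def naive_matrix_profile(ts, m, exclusion_zone=None):
    """Brute-force matrix profile: distance and index of each subsequence's
    nearest neighbor, skipping neighbors inside the exclusion zone."""
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    if exclusion_zone is None:
        exclusion_zone = m // 2            # assumption: one common choice of default
    subs = np.empty((n, m))
    for i in range(n):
        w = ts[i:i + m]
        s = w.std()
        subs[i] = (w - w.mean()) / s if s > 0 else 0.0   # z-normalize
    profile = np.full(n, np.inf)
    index = np.full(n, -1)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        lo, hi = max(0, i - exclusion_zone), min(n, i + exclusion_zone + 1)
        d[lo:hi] = np.inf                  # forbid trivial matches near i
        j = int(np.argmin(d))
        profile[i], index[i] = d[j], j
    return profile, index

# Synthetic example: noise with the same pattern planted at 30 and at 120.
rng = np.random.default_rng(0)
ts = rng.normal(size=200)
pattern = 5 * np.sin(np.linspace(0, 2 * np.pi, 20))
ts[30:50] = pattern
ts[120:140] = pattern

profile, index = naive_matrix_profile(ts, m=20, exclusion_zone=10)
print("nearest neighbor of subsequence 30 is at", index[30])
```

A larger exclusion zone forces each subsequence's nearest neighbor to lie farther away, so fewer shared samples are tolerated; a smaller zone permits more overlap between matched subsequences.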


Assuming this to be the reply to my question:

I was probably a bit imprecise, but what I want to know is whether there is a way to apply this to data that are possibly superimposed and overlapping, meaning that you only see the sum of the events. For example, if one wants to analyze a changing electric or magnetic field.

Nevertheless, the points you mentioned are something I did not think about at first. Interesting once again.

