A dataset for STUMPY #662
NimaSarajpoor
started this conversation in
General
Replies: 1 comment 2 replies
-
@NimaSarajpoor Thanks for sharing, this is interesting. We may be able to add this as a "Bonus Section" to our Shapelet Discovery tutorial. The key question is whether this performance is significant, and I don't know. Additionally, the dataset may be hard for people to follow (it took me a minute to understand that it is essentially many, many sets of time series, one set per person).
2 replies
-
I have been working on a Kaggle competition lately, American Express - Default Prediction. I started just a few days before the deadline, so I didn't get a chance to improve my work and submit a prediction, but I think it might be a good dataset for STUMPY and I want to share it with the community. If you think it is not appropriate or not worth pursuing, please feel free to close this.
The data is tabular but effectively 3D. Each row corresponds to one person (one observation) and each column to one feature, but each feature is itself a time series (the "depth" of the table). The depth is the same across all features for a given person, but it can differ from one person to another.
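To make the layout concrete, here is a minimal sketch of one way to hold such ragged 3D data in memory: a list with one 2D array per person, each of shape `(n_features, depth)`. The feature count, depths, labels, and random values below are made up for illustration and are not taken from the competition data.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 3       # hypothetical number of features (columns)
depths = [13, 8, 5]  # hypothetical depth per person; it varies across people

# One 2D array per person: every feature of a person shares that person's depth.
people = [rng.normal(size=(n_features, depth)) for depth in depths]
labels = np.array([1, 0, 1])  # one (made-up) label per person

for person, label in zip(people, labels):
    print(person.shape, int(label))
```

A list of arrays is used instead of a single 3D array because the depth differs from person to person, so the data cannot be stacked without padding or truncation.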
I used a combination of shapelet discovery and multi-dimensional matching to predict the label. I did not train a model on the distances between each pattern and the shapelets. Instead, I just used KNN with a voting policy. Long story short, the model performs well on observations whose true label is 1. For observations whose true label is 0, however, performance was only about 50%.
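For anyone who wants to pick this up, here is a rough sketch of the KNN-with-voting idea on toy data. It is not my actual pipeline: it skips shapelet discovery entirely and swaps in a naive z-normalized sliding-window distance written in plain NumPy (in practice something like `stumpy.mass` could compute the distance profile instead). The helper names, the window length `m`, the value of `k`, and the synthetic `make_person` generator are all assumptions for illustration.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1D array; constant windows map to all zeros."""
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x, dtype=float)

def min_znorm_dist(query, series):
    """Smallest z-normalized Euclidean distance between query and any window of series."""
    m = len(query)
    q = znorm(query)
    return min(
        np.linalg.norm(q - znorm(series[i:i + m]))
        for i in range(len(series) - m + 1)
    )

def person_distance(a, b, m):
    """Average, over features, of the best match of a's last m points inside b."""
    return float(np.mean([min_znorm_dist(a[f, -m:], b[f]) for f in range(a.shape[0])]))

def knn_vote(query, train_X, train_y, m=4, k=3):
    """Predict a 0/1 label by majority vote among the k nearest training people."""
    dists = np.array([person_distance(query, X, m) for X in train_X])
    nearest = np.argsort(dists)[:k]
    votes = np.bincount(train_y[nearest], minlength=2)
    return int(np.argmax(votes))

def make_person(label, depth):
    """Hypothetical synthetic person with 2 features; the shapes depend on the label."""
    t = np.arange(depth, dtype=float)
    if label == 1:
        return np.vstack([np.sin(t), np.cos(t)])
    return np.vstack([t ** 2, np.sqrt(t)])

# Toy training set with a different depth for each person.
train_X = [make_person(1, d) for d in (10, 12, 11)] + [make_person(0, d) for d in (8, 9, 13)]
train_y = np.array([1, 1, 1, 0, 0, 0])

print(knn_vote(make_person(1, 9), train_X, train_y))
print(knn_vote(make_person(0, 7), train_X, train_y))
```

Using only the last `m` points of each feature as the query is just one possible choice, made here to keep the example short. It also assumes every person has a depth of at least `m`.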
I do not have enough time to continue this work. However, it might be a good problem for those who are interested in using STUMPY and exploring its power in different domains.