DynamicHMC fails on LBA #45
Hi Chris, Looking at the results produced by LBA, I noticed that v[1..3] == [A, k, tau]:
This is already evident from the posterior samples;
Like Tamas, I don't really understand this model; I'm trying to figure it out using LBA. Is that a good description of the model?
Good catch. It looks like something in my conversion to a Chain object broke. I'll look into that shortly. The most comprehensive source on the model can be found here. Equations 1, 2, and 3 comprise the model. Basically, the model boils down to the distribution of the minimum of several random variables. Each observation consists of a variable index and a minimum processing time. For example, equation 3 represents the density for the ith random variable having the minimum value t (first term) and all j != i being larger than t (second term). Figure 3 has a fairly intuitive illustration. The substantive interpretation is that each random variable is the finishing time of an evidence accumulator for a decision option. So, for example, if a person is choosing which of two objects on a computer screen is brighter, the objectively brighter option will have a higher drift rate (evidence accumulation rate) and will tend to hit the evidence threshold first. If you tell people to favor accuracy over speed, the evidence threshold parameter usually increases. So that is the gist of the model. I'm not sure if that's helpful. I can do my best to answer specific questions.
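To make that structure concrete, here is a minimal Julia sketch of the race density in equation 3, built from the single-accumulator pdf/cdf forms in the paper. The function names and the `b = A + k` threshold parameterization are for illustration only; this is not the code from the repo:

```julia
using Distributions  # Normal, pdf, cdf

# Defective density of one accumulator hitting threshold b at time t
# (drift rate v, start point ~ Uniform(0, A), drift-rate sd s).
function acc_pdf(t, v, A, b, s)
    n = Normal()
    z1 = (b - A - v * t) / (s * t)
    z2 = (b - v * t) / (s * t)
    (-v * cdf(n, z1) + s * pdf(n, z1) + v * cdf(n, z2) - s * pdf(n, z2)) / A
end

# Corresponding distribution function of that accumulator's finishing time.
function acc_cdf(t, v, A, b, s)
    n = Normal()
    z1 = (b - A - v * t) / (s * t)
    z2 = (b - v * t) / (s * t)
    1 + ((b - A - v * t) * cdf(n, z1) - (b - v * t) * cdf(n, z2)) / A +
        s * t * (pdf(n, z1) - pdf(n, z2)) / A
end

# Eq. 3: accumulator c finishes first at time rt (tau = non-decision time):
# density of c at t, times the probability every other accumulator is slower.
function race_pdf(c, rt, v, A, k, tau, s = 1.0)
    t = rt - tau
    t <= 0 && return 0.0
    b = A + k                         # threshold sits k above the start-point range
    dens = acc_pdf(t, v[c], A, b, s)
    for j in eachindex(v)
        j == c && continue
        dens *= 1 - acc_cdf(t, v[j], A, b, s)
    end
    dens
end
```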
Thanks, I'll read that description of the model. I don't think it is your (or my) conversion to a Chain object, as the posterior shows that in each iteration (sample) the v's are identical to the other 3 parameters.
Yup, you are right. Sorry, I misunderstood. This seems to be working correctly if you replace the problem function with this:
This suggests that the values are wrong during sampling. I wonder whether there is a bug in the transformation step:
I think I have specified it correctly.
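For reference, a transformation of the shape discussed here would look something like this in TransformVariables.jl (a sketch with two drift rates; the exact code in the repo may differ):

```julia
using TransformVariables

# v unconstrained (drift rates may be negative); A, k, tau constrained positive.
t = as((v = as(Array, asℝ, 2), A = asℝ₊, k = asℝ₊, tau = asℝ₊))
```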
Hi Chris, After the latest update to TransformVariables.jl I get the results below for a simplified LBA model (2 choices, as in figure 3 of the reference):
v[1] and v[2] are no longer identical to A and k, and the results look better as well. For comparison, generating the same data, I use lba1.jl from the LBA example in StanSample.jl:
The next step is to figure out a comparison between the functions in the Stan model and those in LBA_functions.jl (as in the LBA benchmark in DynamicHMCModels.jl), and to understand what exactly the error means:
Thanks, Rob. Do you think the problem is with LBA_functions.jl or with DynamicHMC? My guess is that the problem is with DynamicHMC, because it works with Turing.
Hmm, I'm really not sure yet, but I was encouraged by the fact that the estimates are not identical, but reasonably close. Maybe the above error comes from some aspect of the definition of the distribution that is not handled properly and that is irrelevant (or not used) in Turing. My next step is to suppress the initial call to Optim for the starting values, to see if that gets us past the above error.
Good point. Another difference that might be worth considering is that DynamicHMC uses different parameter values for NUTS (and this may also affect the initial values). Turing is now identical to Stan, and is very numerically stable. It also looks like both packages use different variable transformations. |
Next tiny step: if I suppress the Optim part as shown below
I get
The next step is to see what the density looks like for these values.
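One way to inspect the density at such a point (a sketch, assuming the problem struct `p` is callable on a parameter NamedTuple, as in the DynamicHMCModels examples, and `t` is the TransformVariables transformation):

```julia
using TransformVariables

# Map the sampler's flat vector q back to named parameters ...
θ = TransformVariables.transform(t, q)  # e.g. (v = [...], A = ..., k = ..., tau = ...)
# ... and evaluate the log density there.
p(θ)
```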
What parameter has the value of -.24? That might be a problem because only the v parameters can be negative.
Not sure, I'm assuming A?
I've certainly seen other parameters taking on negative values:
Yeah. That is problematic. I suspect that part of DynamicHMC that enforces the allowable ranges of parameters is the source of the problem. For reference, here is the part of the model that specifies the ranges:
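Roughly along these lines (illustrative hyperparameters, not the exact values from the model):

```julia
using Distributions

# Illustrative constraint structure: only the drift rates v may be negative.
priors = (
    v   = Normal(0, 3),                          # drift rates, unconstrained
    A   = Truncated(Normal(0.8, 0.4), 0, Inf),   # max start point, A > 0
    k   = Truncated(Normal(0.2, 0.3), 0, Inf),   # threshold gap, b = A + k
    tau = Truncated(Normal(0.4, 0.1), 0, Inf),   # non-decision time, tau > 0
)
```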
I'm not sure that the problem is due to numerical instability of the likelihood function. However, I want to make reference to a more numerically stable version of the likelihood that Tamas started. I will find some time to finish this and add it, as it should be more stable with larger sample sizes.
I found some translation errors in the new code for the likelihood function (file here). As far as I can tell, it is not possible to convert all of the terms in the likelihood function into logarithms with the existing capabilities. The main problem is that many of the terms involve negative values and subtraction, which blocks a term-by-term conversion to logarithms. Nonetheless, I was able to use logarithms at the level of functions, which did not solve the immediate problem. An undesired side effect of this translation was a 50% reduction in the speed of the likelihood evaluation. I suspect this might be due in part to the fact that logarithms are more computationally intensive than elementary arithmetic operations. I'll take a look at the values that produce the errors to see if I can get an idea about the cause.
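The obstacle with subtraction is that log(a - b) cannot be assembled from log(a) and log(b) by a simple identity; the closest stable trick (a sketch) only applies when a > b:

```julia
# "logdiffexp": log(a - b) from la = log(a), lb = log(b), valid only when a > b.
# log1p keeps precision when exp(lb - la) is close to 0.
logdiffexp(la, lb) = la + log1p(-exp(lb - la))
```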
Hi Chris, I tried your updates, though without a lot of success. Most runs fail in the Optim phase, unable to determine an acceptable starting point. If I disable that and find a random seed that works, it shows huge numbers. I'm not sure I completely understand your point about the logarithms. I wonder if we can create a logpdf version in Stan and whether that would provide more insight into what is happening. I must say, after my last debacle (the one model in SR that didn't produce similar answers in DynamicHMC and Stan), where I also found in the end that I had made a modeling mistake, I think the problem is very likely in the modeling.
Sorry Rob, you can disregard my previous post (I misread the parentheses and misdiagnosed the error in Tamas's code). If you recall, Tamas suspected that the errors were due to numerical under/overflow. He recommended rewriting the likelihood function in terms of logarithms because they are more robust to numerical problems. He made an attempt to rewrite the likelihood function. It turned out to be close, but it had an error. So I made my own version with fewer logs. As you pointed out, the problem still persisted. However, after reviewing Tamas's code again, I realized I had misread something and was able to identify the error:
I incorporated this into my version of the logpdf from before. However, the problem persists with this version. Sporadic numerical errors occur in Stan with the LBA as well. I suspect there are parts of the parameter space that have irregular properties, and this might be unavoidable. It might be worth letting DynamicHMC run after an error; if it is like Stan, the errors will be infrequent and won't affect the estimates. I'll see if it's still possible to let it run after errors.
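One way to let sampling continue past sporadic failures (a sketch of the general trick, not an existing DynamicHMC option) is to trap the error inside the log-density callable and return -Inf, so the sampler simply rejects that point:

```julia
# Wrap the log density so numerical failures become rejections instead of aborts.
function safe_logdensity(p, θ)
    try
        return p(θ)
    catch err
        err isa DomainError && return -Inf
        rethrow(err)
    end
end
```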
Rob- I encounter a domain error with this version of the LBA, which disables the initial optimization step.
My running assumption is that this vector represents a set of sampled parameters, which should not be negative given the constraints: Is my interpretation correct?
Hi Chris, That's what I assumed, and I agree with your remark on the sporadic errors. During and since my return from my trip I have had to prioritize the work on my plate, given that I also need to spend more time on a FEM package for directional drilling for geothermal purposes. Unfortunately, as a consequence the LBA problem (and Jags) ended up pretty far down my list. Clearly the LBA problem is important and related to where I want to go with StatisticalRethinking v1, but I think we need Tamas' help to figure this one out, now that you have tried many things and feel pretty OK with the current formulation.
Thanks for your input. I will contact Tamas to see if this is indeed a bug and whether he has any recommendations for the remaining issues.
In the past I tried to get stresstest to work but clearly didn't do it right; Tamas' example shows it is a great tool! The back-transformed values look decent to me, although tau is above minrt:
(using your Random.seed value)
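For the record, the pattern looks roughly like this (API details from memory, so treat the exact signature as an assumption):

```julia
using LogDensityProblems, TransformVariables

# Collect inputs where the transformed log density errors out or misbehaves ...
failing = LogDensityProblems.stresstest(P; N = 1000)
# ... and map them back to named parameter values for inspection.
[TransformVariables.transform(t, q) for q in failing]
```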
With your suggested restriction on tau, stresstest ends up with 9 problematic samples and indeed points to line 79 in the test script (which makes sense). It seems to me we can't simply filter out the parameter sample rows that, given the data, won't be acceptable here; I guess this can only be done deeper inside the sampling process. Is this indeed how Stan does it?
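If the restriction belongs in the model itself, one option (a sketch, assuming `minrt` holds the minimum observed RT) is to encode the upper bound on tau directly in the transformation, so the sampler never proposes tau ≥ minrt:

```julia
using TransformVariables

# Bound tau to (0, minrt) at the transformation level instead of filtering samples.
t = as((v = as(Array, asℝ, 2), A = asℝ₊, k = asℝ₊, tau = as(Real, 0, minrt)))
```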
After correcting the prior distributions for the LBA, we are now encountering the non-finite density problem again:
Changing the following:
back to:
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, nsamples; report=ReportSilent());
fixes the problem, but causes DynamicHMC to slow to a crawl.