Using # in the alpha list for a Dirichlet distribution?
Hi all,
I recently started using TreeAge to create a Markov model for a microsimulation, and I've run into a problem creating an alpha list for a Dirichlet distribution that I'm not sure how to address efficiently.
As a simplified example, I have a node with 4 outcomes (A,B,C,D). pB is 0.5 and pC is 0.2. pD varies depending on the stage and references a table (tD) of probabilities. pA would then be 1-pB-pC-pD or #.
When I created the alpha list for a Dirichlet distribution I tried: List (#; 0.5; 0.2; tD[_stage])
but it didn't seem to like that. I tried "#", but I guess that treats it like a string and it doesn't like that either.
Is there a way to use # in this situation or would I have to do something like "1-0.5-0.2-tD[_stage]" for the first value in the alpha list?
Thanks
Comments
I believe that # can only be calculated when it's used in a probability expression, which is potentially why the issue arises. It's not a number, or cannot be calculated as a number, in your array.
My solution would be something like you have suggested which ensures those values in the List all sum to 1:
List (1-0.5-0.2-tD[_stage]; 0.5; 0.2; tD[_stage])
Hi
The Dirichlet is expecting a list of counts (how many people in each category). It's an extension of the beta distribution which has two categories: r individuals in one category and N-r individuals in the other.
You might consider setting up a table indexed by _stage with the counts in each category as four data columns. At the root, define alpha1 = tblCounts[_stage;1], ..., alpha4 = tblCounts[_stage;4].
The alpha list for the Dirichlet distribution, say d_Category, is then alpha1; ...; alpha4.
Then set up a chance node with 4 branches with probability expressions dist("d_Category";x) where x = 1 on the first branch to x = 4 on the last branch.
This will always produce coherent probabilities at the chance node (i.e. will always sum to 1).
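To illustrate why count-based alphas always give coherent branch probabilities, here is a minimal Python sketch. It is not TreeAge code. The counts are made-up illustrative values, and `sample_dirichlet` is a hypothetical helper that draws a Dirichlet sample by normalizing independent gamma draws (a standard construction).

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

# Hypothetical counts for outcomes A, B, C, D (illustrative values, not from the thread).
counts = [20, 50, 20, 10]

rng = random.Random(42)
probs = sample_dirichlet(counts, rng)

# One sampled probability per branch; together they sum to 1 (up to floating-point rounding).
for label, p in zip("ABCD", probs):
    print(f"p{label} = {p:.3f}")
print("sum =", round(sum(probs), 10))
```

Each element of `probs` plays the role of one branch probability at the chance node, so no complement (#) is needed.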
David is right. The alpha list for Dirichlet distributions must be numeric values. Placing any formula that references _stage in the alpha list will not change the distribution by cycle because distributions are by convention evaluated and sampled once at the beginning of the analysis (_stage would be 0).
If you wanted a distribution to change by cycle, you actually need distinct distributions for each cycle, then reference the appropriate distribution by cycle.
Note that a variable array can store a list of distributions. It could be referenced with _stage+1 because _stage starts at 0 and array lists start at 1.
Andrew
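The "one distribution per cycle, sampled once up front" idea could be sketched in Python roughly as follows. This is an illustration under assumptions, not TreeAge's implementation: the count table is invented, and `sample_dirichlet` and `branch_probs` are hypothetical helpers.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

# Hypothetical count table: one row per stage (cycle), four columns for outcomes A to D.
counts_by_stage = [
    [30, 50, 20, 5],
    [25, 50, 20, 10],
    [20, 50, 20, 15],
]

rng = random.Random(1)

# Sample every per-stage distribution once, before the model runs.
sampled = [sample_dirichlet(row, rng) for row in counts_by_stage]

def branch_probs(stage):
    """Return the pre-sampled probabilities for a 0-based stage."""
    # Python lists are 0-based, so the stage indexes directly;
    # a 1-based array would need stage + 1 instead.
    return sampled[stage]

print(branch_probs(0))
print(branch_probs(2))
```

Because the samples are drawn before the model runs, looking up the same stage twice returns the same probabilities.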
Could the default sampling rate for the distribution be set to per cycle, Andrew? Or could distforce() be used at the beginning of each cycle? Either way, then set up a table as I suggested? Good point about the need to use _stage + 1 to make calls to the table. I make that clear to students in my course.
Thanks everyone for your patience and explaining what I need to do; I've clearly been operating under numerous misconceptions.
I'll take a stab at implementing this and see how it goes.
Thanks again!
David, while it is possible to change the sampling rate to per cycle, it will be counterproductive if the plan is to use that distribution for PSA. With PSA, a single sample should be used for the entire model calculation. It should not change by patient or by patient cycle. Therefore, I recommend against using that sampling rate.
If instead that distribution was trying to sample differently for each patient and cycle as part of a Microsimulation, then the sampling rate of per cycle would be appropriate.
Andrew
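The difference between the two sampling rates can be sketched in Python as below. This is only an illustration with made-up counts and hypothetical function names, not a description of how TreeAge implements sampling rates.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

counts = [20, 50, 20, 10]  # hypothetical counts for outcomes A, B, C, D
n_cycles = 3

def run_psa_iteration(rng):
    """PSA-style: sample once, then reuse the same probabilities in every cycle."""
    probs = sample_dirichlet(counts, rng)
    return [probs for _ in range(n_cycles)]

def run_per_cycle(rng):
    """Per-cycle sampling: draw a fresh probability vector in every cycle."""
    return [sample_dirichlet(counts, rng) for _ in range(n_cycles)]

psa = run_psa_iteration(random.Random(7))
per_cycle = run_per_cycle(random.Random(7))

print("PSA cycles identical:", psa[0] == psa[1] == psa[2])
print("Per-cycle draws differ:", per_cycle[0] != per_cycle[1])
```

In the PSA-style run the parameter uncertainty is expressed once per iteration, whereas the per-cycle run adds new variation in every cycle.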