# Matrix algebra within TreeAge?

Hi,

I am new to TreeAge and in the beginning stages of developing a microsimulation model. My plan is to generate transition probabilities (TPs) between 5 states as a function of individual-level characteristics using a multinomial logit model estimated on survey data. The two waves of the survey were conducted three years apart, so the TPs represent 3-year transitions. We intend to use a cycle length of one year, so we need to convert the 3-year TPs to 1-year TPs by taking the cube root of the 3-year TP matrix. Because the TPs are functions of individual-level characteristics, some of which vary over time, we will need to do this matrix algebra for each individual at each cycle. Is it possible to do such a thing within TreeAge, and would doing so slow the model down to the point where it would be impractical to simulate many individuals over many cycles?

Thanks in advance for any insight you can provide.

• Official comment

There is no matrix exponentiation capability within TreeAge Pro.  You could use Matlab or Python with SciPy to take the cube root of the matrix and then enter the derived cohort proportions as the 1-year cycle transition probabilities in your TreeAge Pro model.
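To make the cube-root step concrete, here is a minimal sketch in plain Python (no SciPy needed). It is not TreeAge code, and the 3-state matrix is made up for illustration. It uses a binomial series for the principal root, which is a technique I have swapped in for simplicity; it only converges when the spectral radius of I - P is below 1 (roughly, when people mostly stay in their current state). If SciPy is available, `scipy.linalg.fractional_matrix_power(P, 1/3)` is the library route.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_root(p, root=3, terms=200):
    """Principal root of P via the binomial series for (I - A)^(1/root), A = I - P.

    Assumption: the spectral radius of A is below 1, otherwise the series diverges.
    """
    n = len(p)
    t = 1.0 / root
    a = [[(1.0 if i == j else 0.0) - p[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]  # holds A^k, starting at A^0 = I
    coeff = 1.0                         # series coefficient of A^k
    for k in range(1, terms + 1):
        power = matmul(power, a)
        coeff *= (k - 1 - t) / k
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * power[i][j]
    return result


# Hypothetical 3-year transition matrix (rows sum to 1; state 3 is absorbing).
P3 = [[0.7, 0.2, 0.1],
      [0.1, 0.7, 0.2],
      [0.0, 0.0, 1.0]]

P1 = matrix_root(P3, root=3)  # candidate 1-year matrix
for row in P1:
    print([round(x, 4) for x in row])
```

Multiplying `P1` by itself three times should recover `P3`. Be aware that a matrix root is not guaranteed to be a valid transition matrix: for some inputs it can contain negative entries, so check the result before using it.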

We are considering integration with Python/SciPy in the future.  Currently TreeAge Pro uses a simpler Python for Java which does not support these more complex functions.

With all that said, finding matrix roots is only applicable under the assumption that all of the transitions between states are exponential (constant rates), which may or may not be appropriate for the phenomenon you are planning to model, since you indicate the probabilities vary with time.

A much more likely scenario is that some or all of the transitions are not exponential and should be modeled with some parametric distribution (Weibull, generalized gamma, etc.) or even with an empirical Kaplan-Meier curve.  In that case, applying the matrix root finding technique will result in bias.  The correct approach is to model these transitions individually and then establish a "short enough" cycle length for your model so that the simulated results match the observed state of the cohort at the 2 points in time. (I guess more detailed state-to-state survival analysis is not available?)

Fitting an appropriate distribution and finding the correct parameters to match the observed cohort state is really a different problem: searching the parameter space for a minimal error.  This would also be computationally expensive, and there is no way to do it directly within TreeAge Pro.  External code in Java or VB in Excel could be written to set the parameters of the TreeAge Pro model, run the analysis, collect the results, bring them back into Java or VB to calculate the error, and then select the next set of parameter values, ideally using some well-known optimization algorithm (e.g. simulated annealing, gradient descent, etc.).  In any event, this is computationally intensive and prone to getting stuck in a "local minimum".
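The calibration loop described above (set parameters, run the model, compute the error, pick the next parameters) can be sketched roughly as follows. This is only an illustration in Python: `run_model` is a hypothetical stand-in for a TreeAge Pro run (it is a tiny 3-state cohort model, not a TreeAge API call), the observed proportions are made up, and a simple coordinate pattern search is used in place of simulated annealing or gradient descent.

```python
import math


def run_model(rate_12, rate_13, years=3):
    """Stand-in for an external model run: returns cohort proportions after `years`.

    Assumes constant rates out of state 1 and no onward transitions from states 2 and 3.
    """
    total = rate_12 + rate_13
    leave = 1.0 - math.exp(-total)       # 1-year probability of leaving state 1
    p12 = leave * rate_12 / total
    p13 = leave * rate_13 / total
    cohort = [1.0, 0.0, 0.0]
    for _ in range(years):               # one-year cycles
        s1 = cohort[0]
        cohort = [s1 * (1.0 - leave), cohort[1] + s1 * p12, cohort[2] + s1 * p13]
    return cohort


def calibrate(target, start=(0.5, 0.5), step=0.25, tol=1e-9):
    """Coordinate pattern search minimizing squared error against `target`."""
    def error(params):
        return sum((m - o) ** 2 for m, o in zip(run_model(*params), target))

    best = list(start)
    best_err = error(best)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                if trial[i] <= 0:        # rates must stay positive
                    continue
                err = error(trial)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2                    # refine the search once no move helps
    return best, best_err


# Hypothetical cohort proportions observed at year 3.
TARGET = [0.40, 0.25, 0.35]
fitted, fit_error = calibrate(TARGET)
print(fitted, fit_error)
```

With a real model each call to `run_model` would be a full analysis run, which is where the computational cost comes from. A realistic error surface may also have local minima that a simple search like this one will not escape.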

I am working on a white paper that covers some of the relevant topics in matrix algebra, survival analysis, and Markov and DES models. Feel free to reach out to me at support@treeage.com for more details.

• Here's another possible approach.  Use your data to estimate parameters of the multinomial logistic regression using SAS or Stata or similar software.  In TreeAge, use tracker variables to calculate the individual's characteristics that change over time.  At each cycle, calculate the individual's probability of transitioning to each state using the estimated parameters and the individual's invariant variables (e.g. sex) and time-dependent variables using trackers (e.g. age).  If you want to do a PSA, you can sample the distributions of the parameter estimates before calculating the individual's (or sample's) probabilities. Once you have the 3-year TPs, you can back-calculate the 1-year TPs using roots or assuming exponential survival.
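As a rough sketch of the per-cycle calculation, here is how the multinomial logit probabilities could be computed for one individual. This is plain Python rather than TreeAge syntax, and the state names, covariates, and coefficient values are all hypothetical placeholders for the estimates you would get from SAS or Stata.

```python
import math


def transition_probs(coefs, x):
    """Multinomial logit probabilities for one individual.

    coefs maps each non-reference destination state to [intercept, b1, b2, ...];
    x holds the covariate values in the same order as b1, b2, ...
    """
    scores = {state: b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
              for state, b in coefs.items()}
    denom = 1.0 + sum(math.exp(s) for s in scores.values())
    probs = {state: math.exp(s) / denom for state, s in scores.items()}
    probs["reference"] = 1.0 / denom     # reference state has linear predictor 0
    return probs


# Hypothetical coefficients for transitions out of one origin state,
# ordered as [intercept, age, female]; 4 destinations plus the reference = 5 states.
COEFS = {
    "state2": [-2.0, 0.02, 0.10],
    "state3": [-3.0, 0.03, -0.20],
    "state4": [-4.0, 0.04, 0.00],
    "state5": [-5.0, 0.05, 0.30],
}

# Age would come from a tracker that is updated each cycle; sex is fixed.
probs_age_60 = transition_probs(COEFS, [60, 1])
probs_age_61 = transition_probs(COEFS, [61, 1])
print(probs_age_60)
```

Each origin state needs its own set of coefficients, and these are still 3-year probabilities until they are rescaled.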

I hope this makes sense.  My apologies for misunderstanding your issue if it doesn't.

• Thank you both for your responses.

Melissa, is there a way to back-calculate the 1-year TPs from a 3-year TP matrix without taking roots?

Best,

Bill

• I'm not sure this is what you're asking, but you COULD assume exponential survival (i.e., survival from transitioning, so TP = 1 - survival).  Solve for L (lambda) with your survival at 3 years and then substitute 1 for t:

Survival at time t = e ^ (-Lt)
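Here is that calculation as a small Python sketch. The 3-year probability of 0.271 is a made-up number, and the function name is just for illustration.

```python
import math


def rescale_probability(tp, from_years, to_years):
    """Convert a probability over one interval to another, assuming a constant rate."""
    survival = 1.0 - tp                          # survival at from_years
    rate = -math.log(survival) / from_years      # solve S(t) = e^(-L*t) for L
    return 1.0 - math.exp(-rate * to_years)      # TP over the new interval


# Hypothetical 3-year TP of 0.271, i.e. 3-year survival of 0.729.
tp_1y = rescale_probability(0.271, from_years=3, to_years=1)
print(round(tp_1y, 4))
```

Under these assumptions the 1-year TP comes out at about 0.1, since 0.9 cubed is 0.729.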

If this isn't what you're asking, let me know.

Melissa

• Indeed, as Melissa points out, you can calculate the rate (lambda) from the survival data.  However, in models with 3 or more states, the cohort proportions observed at time x (e.g. 3 years) are likely to reflect competing-risk transitions (e.g. from state 1 to state 2 and from state 1 to state 3, etc.).  There is no easy way to disentangle the competing risks (rates) without detailed longitudinal study data for each individual transition rate.
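A small Python sketch of why competing risks complicate things. It assumes constant rates out of state 1 and no onward transitions from states 2 and 3 within the interval; the rate values are hypothetical.

```python
import math


def competing_risk_probs(rates, t):
    """Probabilities of each exit from a state over time t under constant competing rates."""
    total = sum(rates.values())
    leave = 1.0 - math.exp(-total * t)           # probability of leaving at all
    probs = {dest: leave * r / total for dest, r in rates.items()}
    probs["stay"] = 1.0 - leave
    return probs


# Hypothetical annual rates out of state 1.
RATES = {"state2": 0.1, "state3": 0.2}
probs_3y = competing_risk_probs(RATES, t=3)

# Treating the 1 -> 2 transition in isolation overstates it,
# because it ignores people who left for state 3 first.
naive_12 = 1.0 - math.exp(-RATES["state2"] * 3)
print(probs_3y, naive_12)
```

The observed proportion in state 2 depends on both rates, so a single observed proportion cannot be converted to a rate on its own.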

Also make sure you understand the difference between transition probabilities derived from cohort proportions observed at a single point in the study (e.g. 3 years) and transition probabilities derived from a survival (hazard) function for each transition (I call the latter "pairwise transition probabilities").  Both can be used within Markov models as actual transition probabilities, but they rely on different assumptions, and the cycle-length adjustment has to be done differently for each.

1.  Cohort state proportions used as TPs assume exponential distributions (although the individual lambdas are not necessarily known).  Matrix root finding is the correct cycle-length adjustment method.

2. Pairwise survival (hazard) functions are not limited to the exponential (any parametric distribution or tabular survival data points can be used).  The cycle-length adjustment is done as Melissa pointed out for the exponential case.  For the more general case, see the hazard function formula in Briggs et al., "Decision Modelling for Health Economic Evaluation", pages 50-55.
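For the type 2 case, a commonly used form of the cycle-specific transition probability is tp(t) = 1 - exp(H(t - u) - H(t)), where H is the cumulative hazard and u is the cycle length. The Python sketch below uses a Weibull cumulative hazard H(t) = lambda * t^gamma with hypothetical parameter values; I am not claiming it reproduces the exact notation of the book.

```python
import math


def weibull_cum_hazard(t, lam, gamma):
    """Weibull cumulative hazard H(t) = lam * t^gamma."""
    return lam * t ** gamma


def cycle_tp(t, u, lam, gamma):
    """Probability of the event during the cycle of length u that ends at time t."""
    return 1.0 - math.exp(weibull_cum_hazard(t - u, lam, gamma)
                          - weibull_cum_hazard(t, lam, gamma))


# Hypothetical Weibull parameters; gamma > 1 means the hazard rises over time.
LAM, GAMMA = 0.05, 1.5
yearly_tps = [cycle_tp(t, 1, LAM, GAMMA) for t in (1, 2, 3)]
print([round(tp, 4) for tp in yearly_tps])
```

With gamma = 1 this reduces to the exponential case, 1 - e^(-L*u), and the TP no longer depends on t. The product of (1 - tp) over the cycles equals the survival function at the end of the last cycle, which is a handy check.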

So if you only have cohort proportions, you can try to estimate the individual rates, build a model of type 2 above, and use the survival (hazard) function adjustment for a shorter cycle length. The cycle will have to be shorter than 1 year so that the competing risks can occur at a reasonable frequency; you do not want, for example, a stroke to be able to happen only once per full 1-year cycle. You will then need to calibrate the model to make sure that your estimated rates reproduce the observed cohort proportions at the observed time of year 3.

• Thank you both!