rule-check • delaydiscount

library(delaydiscount)

The jb_rule_check function can be used to apply the rule check for the purpose of filtering out subjects who fail.

remedi_rc_result <- jb_rule_check(remedi)
head(remedi_rc_result)
#>    group     subj   C1    C2
#> 1    EFT 10458654 TRUE  TRUE
#> 8    EFT 10698816 TRUE  TRUE
#> 15   EFT 11624253 TRUE  TRUE
#> 22   EFT 11646612 TRUE  TRUE
#> 29   EFT 12088534 TRUE  TRUE
#> 36   EFT 12691633 TRUE FALSE

The resulting data frame has an observation for each subject-within-group, with variables corresponding to whether or not each rule was passed. A value of TRUE indicates that the rule check was passed.

C1 fails if there are two or more increases in the indifference point of 20% of the total amount between consecutive time points. C2 fails if the difference in the indifference point between the first and last time points is not more than 10% of the total amount.

In order to make use of the rule check results, after the data frame is prepared, the rule check data can be merged with the original dataframe, and then observations corresponding to subjects who failed the rule check can be removed from the data.

remedi_with_rc <- merge(remedi, remedi_rc_result)
remedi_rc_pass <- dplyr::filter(remedi_with_rc, 
                                C1 == TRUE, C2 == TRUE)

After that, the remainder of the workflow proceeds as it would otherwise, starting from running the prepare_data_frame function on the dataset with rule check failures filtered out. The prepare_data_frame function performs helper calculations for our other functions.

prep_remedi <- prepare_data_frame(remedi_rc_pass)

One of the things that the prepare_data_frame function does is transform the delay and indifference variables to linearize the model.

Note that the hyperbolic model for the discounting curve is $D(t) = \frac{1}{1+kt}$ .

Then we will have $\log(\frac{1}{D(t)} - 1) = \log(k) + \log(t)$ .

We assume that the transformed observed indifference point $\tilde D(t_{ijc})$ will follow the model

$\log(\frac{1}{\tilde D(t_{ijc})} - 1) = \log(k_{ic}) + \log(t_{ijc}) + \epsilon_{ijc},$

where $\epsilon_{ijc} \sim N(0, \sigma^2)$ , $i$ is an index for subject, $j$ an index for time point, and $c$ is an index for group, and

$\log(k_{ic}) \sim N(\theta_c, \frac{g\sigma^2}{T})$ , where $\theta_c$ is the population mean of the $\log(k_{ic})$ for subjects in group $c$ , and $T$ is the number of time points observed for each subject.

Using the prepared data frame, you can get estimates of the $log(k)$ values for each subject using the get_subj_est_ln_k function.

ln_k_ests <- get_subj_est_ln_k(prep_remedi)
head(ln_k_ests)
#> # A tibble: 6 × 3
#> # Groups:   group [1]
#>   group     subj  ln_k
#>   <chr>    <int> <dbl>
#> 1 EFT   10458654 -7.73
#> 2 EFT   10698816 -6.39
#> 3 EFT   11624253 -6.69
#> 4 EFT   11646612 -6.58
#> 5 EFT   12088534 -8.41
#> 6 EFT   15110483 -7.29

Our model treats the $\log(k)$ values for individual subjects as the result of a combination of a fixed group effect and a random subject effect. To get the fit of the linearized hyperbolic model, use the dd_hyperbolic_model function.

model_fit <- dd_hyperbolic_model(prep_remedi)

The object produced by this method is a list containing several components.

The group mean $\log(k)$ estimates (i.e. the $\theta_c$ ) can be gotten from the ln_k_mean component.

model_fit$ln_k_mean
#>     group ln_k_mean   std_err
#> EFT   EFT -6.563498 0.1545544
#> HIT   HIT -5.905766 0.1336723
#> NCC   NCC -5.982325 0.1194661

These are estimates of the mean of the distribution from which the $\log(k_{ic})$ values for subjects in group $c$ have been realized.

The variance component estimates can be obtained from the var component.

model_fit$var
#> sigma_sq        g 
#> 1.595641 8.955180

sigma_sq is the estimate of the variance of the transformed indifference value conditional on the subject effect.

g is related to the variance of the subject random effects: specifically, the variance of the subject random effect is equal to $\frac{g\sigma^2}{T}$ , where $T$ is the number of time points observed for each subject.

These mean and variance parameters are estimated through maximum likelihood estimation.

An overall F-test for the equality of all group mean parameters can be found in the model_test component.

model_fit$model_test
#>     F_stat     p_value df1 df2
#> 1 5.989811 0.002748074   2 378

Pairwise F-tests for equality of group mean parameters can be obtained from the pairwise_f_tests component. The data frame contains an F-test for each pair.

model_fit$pairwise_f_tests
#>   group_1 group_2    F_stat     p_value df1 df2
#> 1     EFT     HIT 10.279025 0.001460043   1 378
#> 2     EFT     NCC  8.781681 0.003235316   1 378
#> 3     HIT     NCC  0.180928 0.670819017   1 378

Note that, as with other R functions such as lm and glm, these p-values are not adjusted for multiplicity.