Differentially accessible or expressed windows

Finding windows that correspond to differentially expressed or accessible windows is possible with two related functions in atacr - estimate_fdr() which implements bootstrap t-tests, via the boot package and estimate_bayes_factor() which implements a Bayes factor ANOVA using the BayesFactor package. A tidy dataframe of results is returned in each case.

Bootstrap t-tests

For simple comparison of two treatments with bootstrap t tests, provide treatment ‘a’ and ‘b’ names and the number of bootstrap iterations (default is 10, which is fast for testing code, but useless analytically). You can set the threshold for marking as significant with fdr_level.

 result <- estimate_fdr(normalized_counts,
              treatment_a =  "treatment",
              treatment_b = "control",
              iterations = 100000,
              fdr_level = 0.01)

##                  window         t p_value       fdr mean_count_a
## 1    synth_chrom:1-50:- -1.938336     0.1 0.2130435    5632.6667
## 2  synth_chrom:51-100:+ -1.218827     0.2 0.3062500    9758.6667
## 3 synth_chrom:101-150:-  4.107091     0.0 0.0000000     205.3333
## 4 synth_chrom:151-200:- -1.510404     0.2 0.3062500   15202.6667
## 5 synth_chrom:251-300:-  3.308530     0.0 0.0000000   39171.0000
## 6 synth_chrom:301-350:-  1.400435     0.3 0.3868421      62.0000
##   mean_count_b        sd_a         sd_b   log2_fc is_sig
## 1  15382.66667  6377.90407  5935.290164 -1.449416  FALSE
## 2  20613.66667  4435.08403 14774.507279 -1.078845  FALSE
## 3     83.66667    50.46121     9.291573  1.295243   TRUE
## 4  36699.33333 19444.72315 15152.091814 -1.271429  FALSE
## 5  15567.33333 12216.06745  1859.435488  1.331264   TRUE
## 6     12.33333    59.80803    14.011900  2.329705  FALSE

The output has columns as follows:

window - the name of the window with data on this row
t - the value of the t statistic for the first (non-bootstrap) iteration
p_value - the computed p value for the window
fdr - the false detection rate at this window
mean_count_a - the mean count for treatment ‘a’
mean_count_b - the mean count for treatment ‘b’
sd_a - standard deviation for treatment ‘a’
sd_b - standard deviation for treatment ‘b’
log2_fc - log 2 of the ratio of the mean counts
is_sig - flag showing whether window was significant according to the level set in the function with parameter fdr_level

To analyse all treatments against a common comparison at once you can use the wrapper function estimate_fdr_multiclass() which requires the name of the common comparison treatment

multi_result <-  estimate_fdr_multiclass(normalized_counts,
              common_control = "control",
              iterations = 100000,
              fdr_level = 0.01)

head(multi_result)

##                  window         t p_value       fdr mean_count_a
## 1    synth_chrom:1-50:- -1.938336     0.0 0.0000000    5632.6667
## 2  synth_chrom:51-100:+ -1.218827     0.0 0.0000000    9758.6667
## 3 synth_chrom:101-150:-  4.107091     0.0 0.0000000     205.3333
## 4 synth_chrom:151-200:- -1.510404     0.1 0.1689655   15202.6667
## 5 synth_chrom:251-300:-  3.308530     0.0 0.0000000   39171.0000
## 6 synth_chrom:301-350:-  1.400435     0.1 0.1689655      62.0000
##   mean_count_b        sd_a         sd_b   log2_fc is_sig         a       b
## 1  15382.66667  6377.90407  5935.290164 -1.449416   TRUE treatment control
## 2  20613.66667  4435.08403 14774.507279 -1.078845   TRUE treatment control
## 3     83.66667    50.46121     9.291573  1.295243   TRUE treatment control
## 4  36699.33333 19444.72315 15152.091814 -1.271429  FALSE treatment control
## 5  15567.33333 12216.06745  1859.435488  1.331264   TRUE treatment control
## 6     12.33333    59.80803    14.011900  2.329705  FALSE treatment control

The results here has two extra columns:

a - the name of the treatment
b - the name of the common control

Bayes Factor Analysis

A similar pair of functions is available for Bayes factor analysis. estimate_bayes_factor() for the two-way comparison. The factor argument sets the Bayes factor at which to mark the window as having different counts.

result_bf <-  estimate_bayes_factor(normalized_counts,
                           treatment_a =  "treatment",
                           treatment_b = "control",
                                factor = 2.0)

head(result_bf)

##                  window bayes_factor is_sig mean_count_a mean_count_b
## 1    synth_chrom:1-50:-   0.20483396  FALSE    5632.6667  15382.66667
## 2  synth_chrom:51-100:+  -0.20139051  FALSE    9758.6667  20613.66667
## 3 synth_chrom:101-150:-   1.39100629  FALSE     205.3333     83.66667
## 4 synth_chrom:151-200:-  -0.04237361  FALSE   15202.6667  36699.33333
## 5 synth_chrom:251-300:-   0.98251046  FALSE   39171.0000  15567.33333
## 6 synth_chrom:301-350:-  -0.10371176  FALSE      62.0000     12.33333
##          sd_a         sd_b   log2_fc
## 1  6377.90407  5935.290164 -1.449416
## 2  4435.08403 14774.507279 -1.078845
## 3    50.46121     9.291573  1.295243
## 4 19444.72315 15152.091814 -1.271429
## 5 12216.06745  1859.435488  1.331264
## 6    59.80803    14.011900  2.329705

Again, a estimate_bayes_factor_multiclass() function works for all comparisons to a common control.

The results data frame is similar to that from the Bootstrap t methods, with a factor column in place of the t and fdr columns.

Differentially accessible or expressed windows

Dan MacLean

2018-03-21

Bootstrap t-tests

Bayes Factor Analysis