-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
189 lines (162 loc) · 5.85 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
set.seed(1999)
knitr::opts_chunk$set(
collapse = TRUE,
comment = '#>',
fig.path = 'man/figures/README-',
out.width = '60%',
dpi = 300,
fig.align = 'center',
warning = FALSE
)
knitr::knit_hooks$set(imgcenter = function(before, options, envir){
if (before) {
htmltools::HTML('<p align="center">')
} else {
htmltools::HTML('</p>')
}
})
library(badger)
library(copre)
library(ggplot2)
```
# CopRe Tools for Nonparametric Martingale Posterior Sampling
<!-- badges: start -->
```{r badges, echo = FALSE, results='asis'}
cat(
badge_cran_release('copre', 'green'),
badge_devel('blakemoya/copre', 'blue')
)
```
[![R-CMD-check](https://github.com/blakemoya/copre/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/blakemoya/copre/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
A set of tools for Bayesian nonparametric density estimation using Martingale
posterior distributions including the **Cop**ula **Re**sampling (CopRe)
algorithm. Also included are a Gibbs sampler for the marginal Gibbs-type mixture
model and an extension to include full uncertainty quantification via a
predictive **Seq**uence **Re**sampling (SeqRe) algorithm. The CopRe and SeqRe
samplers generate random nonparametric distributions as output, leading to
complete nonparametric inference on posterior summaries. Routines for
calculating arbitrary functionals from the sampled distributions are included as
well as an important algorithm for finding the number and location of modes,
which can then be used to estimate the clusters in the data.
## Installation
You can install the latest stable development version of CopRe from
[GitHub](https://github.com/blakemoya/copre) with:
``` r
# install.packages("devtools")
devtools::install_github("blakemoya/copre")
```
## Usage
### CopRe
The basic usage of CopRe for density estimation is to supply a data vector, a
number of forward simulations per sample, and a number of samples to draw:
```{r, eval = FALSE}
library(copre)
data <- sample(c(rnorm(100, mean = -2), rnorm(100, mean = 2)))
res_cop <- copre(data, 250, 100)
plot(res_cop) +
geom_function(
fun = function(x) (dnorm(x, mean = -2) + dnorm(x, mean = 2)) / 2
)
```
```{r example_copre, echo = FALSE, imgcenter=TRUE}
data <- sample(c(rnorm(100, mean = -2), rnorm(100, mean = 2)))
res_cop <- copre(data, 250, 100)
plot(res_cop) +
geom_function(
fun = function(x) (dnorm(x, mean = -2) + dnorm(x, mean = 2)) / 2
)
```
Currently only a Gaussian kernel copula is supported but more options are to be
added in future versions.
### SeqRe
Using SeqRe first involves specifying a model via a base measure `G` and an
exchangeable sequence measure `Sq`. Here we can set up a normal location scale
mixture with components assigned according to a Dirichlet process.
We can then fit the marginal mixture model with the `gibbsmix` MCMC routine and
then extend the results to their full nonparametric potential via `seqre`.
```{r, eval = FALSE}
b_norm <- G_normls()
s_dp <- Sq_dirichlet()
res_seq <- gibbsmix(data, 100, b_norm, s_dp) |> seqre()
plot(res_seq) +
geom_function(
fun = function(x) (dnorm(x, mean = -2) + dnorm(x, mean = 2)) / 2
)
```
```{r example_seqre, echo = FALSE, imgcenter = TRUE}
b_norm <- G_normls()
s_dp <- Sq_dirichlet()
res_seq <- gibbsmix(data, 100, b_norm, s_dp) |> seqre()
plot(res_seq) +
geom_function(
fun = function(x) (dnorm(x, mean = -2) + dnorm(x, mean = 2)) / 2
)
```
### Utilities
The moments of the estimated distributions can be obtained by calling
`moment`, and arbitrary functionals of interest can be obtained similarly with
`functionals`.
```{r, eval = FALSE}
moms <- data.frame(Mean = moment(res_cop, 1),
Variance = moment(res_cop, 2))
with(moms,
qplot(Mean, Variance) +
theme_bw()
)
```
```{r moment, echo = FALSE, imgcenter=TRUE}
moms <- data.frame(Mean = moment(res_cop, 1),
Variance = moment(res_cop, 2))
with(moms,
qplot(Mean, Variance) +
theme_bw()
)
```
The function `modes` can be used to isolate local maxima and minima in the
density estimates. Here we can see the detected modes in black and the antimodes
in red. Experiment with using sample antimodes as a distribution over cluster
boundaries!
```{r eval = FALSE, imgcenter = TRUE}
res_cop.dens <- grideval(res_cop)
ns <- modes(res_cop.dens, idx = TRUE)
us <- antimodes(res_cop.dens, idx = TRUE)
p <- plot(res_cop.dens)
for (i in 1:length(res_cop)) {
p <- p +
geom_point(data = data.frame(x = ns[[i]]$value,
y = res_cop.dens[i, ns[[i]]$idx]),
aes(x = x, y = y), shape = 24, size = 0.5, alpha = 0.5) +
geom_point(data = data.frame(x = us[[i]]$value,
y = res_cop.dens[i, us[[i]]$idx]),
aes(x = x, y = y), shape = 25, size = 0.5, alpha = 0.5,
color = 'red')
}
print(p)
```
```{r modes, echo = FALSE, imgcenter = TRUE}
res_cop.dens <- grideval(res_cop)
ns <- modes(res_cop.dens, idx = TRUE)
us <- antimodes(res_cop.dens, idx = TRUE)
p <- plot(res_cop.dens)
for (i in 1:length(res_cop)) {
p <- p +
geom_point(data = data.frame(x = ns[[i]]$value,
y = res_cop.dens[i, ns[[i]]$idx]),
aes(x = x, y = y), shape = 24, size = 0.5, alpha = 0.5) +
geom_point(data = data.frame(x = us[[i]]$value,
y = res_cop.dens[i, us[[i]]$idx]),
aes(x = x, y = y), shape = 25, size = 0.5, alpha = 0.5,
color = 'red')
}
print(p)
```
## Feedback
If you have encounter any bugs or other problems while using CopRe, let me know
using the Issues tab. For new feature requests, contact me via email at
[blakemoya@utexas.edu](mailto:blakemoya@utexas.edu?subject=[GitHub:%20CopRe]%20Feature%20Request).