Title: | Structural Bayesian Information Criterion for Graphical Models |
---|---|
Description: | An implementation of the novel structural Bayesian information criterion by Zhou, 2020 (under review). In this method, the prior structure is modeled and incorporated into the Bayesian information criterion framework. The package also provides an implementation of a two-step algorithm that generates the candidate model pool. |
Authors: | Quang Nguyen [cre, aut] |
Maintainer: | Quang Nguyen <[email protected]> |
License: | GPL-3 |
Version: | 1.0.0 |
Built: | 2025-02-03 05:25:10 UTC |
Source: | https://github.com/cran/SBICgraph |
This is the enrichment step in the two-step algorithm to construct the model pool (internal use only)
addition(data, lambda, P)
data |
An n by p dataframe representing the observations |
lambda |
Vector of tuning parameters |
P |
Prior adjacency matrix |
A list of model objects
Jie Zhou
Compares two adjacency matrices in terms of false discovery rate and positive selection rate. Used for model validation
comparison(real, estimate)
real |
The real matrix |
estimate |
The estimated matrix |
A list of the following evaluation metrics
PSR |
Positive Selection Rate |
FDR |
False Discovery Rate |
Jie Zhou
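As an illustration of the two metrics, the following base-R sketch computes them for a pair of small adjacency matrices. It assumes the common definitions (PSR as the fraction of true edges that are recovered, FDR as the fraction of estimated edges that are absent from the true graph). The helper `psr_fdr` is hypothetical and not part of the package, and `comparison()` itself may differ in detail.

```r
# Hypothetical helper (not part of SBICgraph): PSR and FDR for two
# symmetric 0/1 adjacency matrices, assuming the usual definitions.
psr_fdr <- function(real, estimate) {
  ut <- upper.tri(real)                 # count each undirected edge once
  r <- real[ut] != 0                    # edges in the true graph
  e <- estimate[ut] != 0                # edges in the estimated graph
  tp <- sum(r & e)                      # edges present in both graphs
  list(
    PSR = if (sum(r) > 0) tp / sum(r) else NA_real_,
    FDR = if (sum(e) > 0) sum(e & !r) / sum(e) else NA_real_
  )
}

# True graph with edges 1-2 and 2-3
real <- matrix(0, 4, 4)
real[1, 2] <- real[2, 1] <- 1
real[2, 3] <- real[3, 2] <- 1

# Estimated graph with edges 1-2 and 3-4
estimate <- matrix(0, 4, 4)
estimate[1, 2] <- estimate[2, 1] <- 1
estimate[3, 4] <- estimate[4, 3] <- 1

res <- psr_fdr(real, estimate)
```

In this toy case one of the two true edges is recovered and one of the two estimated edges is spurious, so both metrics equal 0.5.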
This is the pruning step in the two-step algorithm to construct the model pool (internal use only)
deletion(data, lambda, P)
data |
An n by p dataframe representing the observations |
lambda |
Vector of tuning parameters |
P |
Prior adjacency matrix |
A list of model objects
Jie Zhou
This function finds the maximum likelihood estimate of the precision matrix of a multivariate normal distribution, given an adjacency matrix.
mle(data, priori)
data |
An n by p dataframe representing the observations |
priori |
A |
The methods are based on the relationship between precision matrix of the multivariate normal distribution and regression coefficients.
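The relationship referred to here is the standard one for a multivariate normal vector with precision matrix Omega: regressing variable j on the remaining variables gives coefficients -Omega[-j, j] / Omega[j, j] and residual variance 1 / Omega[j, j]. The sketch below checks this at the population level with a small, made-up precision matrix. It is an illustration of the identity only, not the package's estimation code.

```r
# Population-level check of the link between a precision matrix and
# regression coefficients (illustrative values, not package code).
Omega <- matrix(c( 2.0, -0.5,  0.0,
                  -0.5,  2.0, -0.7,
                   0.0, -0.7,  2.0), nrow = 3, byrow = TRUE)
Sigma <- solve(Omega)                   # covariance matrix

j <- 2                                  # variable treated as the response
# Coefficients of the regression of X_j on the remaining variables
beta <- solve(Sigma[-j, -j], Sigma[-j, j])
# Residual variance of that regression
resid_var <- Sigma[j, j] - sum(Sigma[j, -j] * beta)

# Both quantities are determined by column j of the precision matrix
beta_from_omega <- -Omega[-j, j] / Omega[j, j]
var_from_omega  <- 1 / Omega[j, j]

all.equal(beta, beta_from_omega)
all.equal(resid_var, var_from_omega)
```

A zero entry in the adjacency matrix corresponds to a zero off-diagonal entry of Omega, and therefore to a regression coefficient fixed at zero.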
Returns a p by p matrix estimate of the precision matrix
Jie Zhou
set.seed(1)
d=simulate(n=100,p=200, m1=100, m2=30)
data=d$data
priori=d$realnetwork
precision=mle(data=data,priori=priori)
For a given prior graph, the two-step algorithm, including edge enrichment and pruning, is used to construct the model pool
modelset(data, lambda, P)
data |
An n by p dataframe representing the observations |
lambda |
Tuning parameter vector |
P |
Prior adjacency matrix |
A list including all the candidate models in the model pool.
Each model is represented by a p by p adjacency matrix
Jie Zhou
set.seed(1)
d=simulate(n=100, p=100, m1 = 100, m2 = 30)
data=d$data
P=d$priornetwork
lambda=exp(seq(-5,5,length=100))
candidates=modelset(data=data,lambda=lambda, P=P)
This function estimates the novel structural Bayesian information criterion given the data and a given graph structure
sbic(data, theta, prob, P)
data |
An n by p dataframe representing the observations |
theta |
The |
prob |
The expected error rate |
P |
The prior adjacency matrix |
The value of sbic with given temperature parameter and prior adjacency matrix
Jie Zhou
set.seed(1)
d=simulate(n=100, p=100, m1 = 100, m2 = 30)
data=d$data
P=d$priornetwork
theta=d$realnetwork
prob=0.15
index=sbic(data=data, theta=theta, prob=prob, P=P)
Select the model based on the SBIC criterion and the two-step algorithm
sggm(data, lambda, M, prob)
data |
An n by p dataframe representing the observations |
lambda |
A vector of tuning parameters used to build the model pool |
M |
The prior adjacency matrix |
prob |
The mean error rate |
A list of objects containing:
networkhat |
The final selected adjacency matrix |
candidates |
The model pool |
Jie Zhou
set.seed(1)
m1 = 100
m2 = 30
p = 100
n = 100
d=simulate(n=n,p=p, m1 = m1, m2 = m2) # simulate fake data
lambda=exp(seq(-5,5,length=100)) # tuning parameter
data=d$data # data from the simulation
M=d$priornetwork # prior network from simulation
# calculating the error rate
r1=m2/m1
r2=m2/(p*(p-1)/2-m1)
r=(r1+r2)/2
# apply sggm
result=sggm(data=data, lambda=lambda, M=M, prob=r)
# compare the final network and the true network
result$networkhat
d$realnetwork
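The error-rate arithmetic in the example can be worked through on its own, without calling the package. The sketch below only repeats the formulas from the example with the same values of `p`, `m1` and `m2`; the variable `n_pairs` is introduced here for readability.

```r
p  <- 100   # number of variables
m1 <- 100   # edges in the true graph
m2 <- 30    # entries that differ between the true and prior graphs

n_pairs <- p * (p - 1) / 2       # 4950 possible edges
r1 <- m2 / m1                    # m2 relative to the true edges: 0.3
r2 <- m2 / (n_pairs - m1)        # m2 relative to the non-edges: 30 / 4850
r  <- (r1 + r2) / 2              # value passed to sggm() as prob
round(r, 4)                      # 0.1531
```

The result is close to the value `prob=0.15` used in the `sbic` example.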
According to a given edge density, the adjacency matrix P of a graph is generated first. Based on P, multivariate normal data are then simulated with mean zero and a specified precision matrix
simulate(n, p, m1, m2)
n |
Sample size |
p |
The number of vertices in graph or the number of variables |
m1 |
The number of edges in the true graph |
m2 |
The number of adjacency matrix entries whose state (0 or 1) differs between the true and prior graphs |
A list including the simulated data, real adjacency matrix and a prior adjacency matrix
data |
simulated data |
realnetwork |
real adjacency matrix |
priornetwork |
prior adjacency matrix |
Jie Zhou
set.seed(1)
d=simulate(n=100,p=200, m1=100, m2=30)
d$data
d$realnetwork
d$priornetwork
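For readers who want to see the general idea behind this kind of simulation, the base-R sketch below draws zero-mean multivariate normal data whose precision matrix has the zero pattern of a random graph. It is a minimal sketch under stated assumptions, not the package's `simulate()`: the function name `sketch_simulate`, the `strength` argument, and the diagonally dominant construction of the precision matrix are all choices made here for illustration, and the prior network (controlled by `m2` in the package) is omitted.

```r
# Hypothetical sketch (not the package's simulate()): draw n observations
# from N(0, solve(Omega)), where Omega's zero pattern follows a random graph.
sketch_simulate <- function(n, p, m1, strength = 0.3) {
  A <- matrix(0, p, p)
  ut <- which(upper.tri(A))            # indices of possible edges
  A[sample(ut, m1)] <- 1               # choose m1 edges at random
  A <- A + t(A)                        # symmetric adjacency matrix
  Omega <- strength * A
  # Diagonal dominance keeps the precision matrix positive definite
  diag(Omega) <- rowSums(abs(Omega)) + 1
  R <- chol(solve(Omega))              # Sigma = t(R) %*% R
  Z <- matrix(rnorm(n * p), n, p)      # independent standard normals
  list(data = Z %*% R, realnetwork = A, precision = Omega)
}

set.seed(1)
s <- sketch_simulate(n = 50, p = 10, m1 = 12)
dim(s$data)
sum(s$realnetwork) / 2                 # number of edges in the graph
```

Multiplying standard normal rows by the Cholesky factor `R` gives rows with covariance `t(R) %*% R`, which is the inverse of the chosen precision matrix.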