BIOST 2025: Biostatistics Seminar Notices
Seminar Notices Spring Term 2009
Seminar Notices Fall Term 2008
Seminar Speakers
Spring Term 2009
Brian S. Caffo, February 5, 2009
Huixia Judy Wang, February 12, 2009
Andriy Bandos, February 26, 2009
Xing Yuan, March 5, 2009
Sunghee Oh, March 5, 2009
Minjae Lee, March 5, 2009
Kwonho Jeong, March 5, 2009
Ya-Hsiu Chuang, March 5, 2009
Sachiko Miyahara, March 5, 2009
Enrique Schisterman, March 26, 2009
Jiashun Jin, April 2, 2009
John Klein, April 9, 2009
SEMINAR
DATE: Thursday, February 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: BRIAN S. CAFFO,
Associate Professor,
Department of Biostatistics,
Johns Hopkins Bloomberg School of Public Health
TOPIC: A Case Study in Pharmacologic Colon Imaging Using Principal
Curves in Single Photon Emission Computed Tomography
In this talk we consider functional imaging of the colon to assess the
kinetics of a microbicide lubricant. The overarching goal is to
understand the penetration of the lubricant after anal coitus. Such
information is crucial for understanding the potential impact of the
microbicide on viral transmission. The experiment was conducted by
simulating coitus in a subject after injecting a radiolabeled
lubricant. After coital simulation, the subject was imaged via single
photon emission computed tomography (SPECT), a non-invasive, in vivo
functional imaging technique. We use a highly modified version of the
principal curve algorithm to construct a three-dimensional curve
through the colon images. The algorithm is developed on several
difficult two-dimensional images of familiar curves. The final curve
fit to the colon data is compared to experimental sigmoidoscope
collection.
SEMINAR
DATE: Thursday, February 12, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Huixia Judy Wang,
Assistant Professor,
Department of Statistics,
North Carolina State University
TOPIC: Censored Quantile Regression
Censored quantile regression offers a valuable supplement to the Cox
proportional hazards model for survival analysis. Existing work in the
literature often requires stringent assumptions, such as unconditional
independence of the survival time and the censoring variable or global
linearity at all quantile levels. Moreover, some of the work uses recursive
algorithms, which makes it challenging to derive asymptotic normality. To
overcome these drawbacks, we propose a novel locally weighted censored
quantile regression approach. The new approach adopts the
redistribution-of-mass idea and employs a local reweighting scheme. Its
validity only requires conditional independence of the survival time and
the censoring variable given the covariates, and linearity at the
particular quantile level of interest. Our method leads to a simple
algorithm that can be conveniently implemented with R software. Applying
recent theory of M-estimation with infinite dimensional parameters, we
rigorously establish the consistency and asymptotic normality of the
proposed estimator. The proposed method is studied via simulations and the
analysis of an acute myocardial infarction dataset.
SEMINAR
DATE: Thursday, February 26, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Andriy Bandos,
Assistant Professor,
Department of Biostatistics,
University of Pittsburgh
TOPIC: COMPARISON OF DIAGNOSTIC TESTS WITH BINARY DATA: SOME SHORTCUTS
Development of more accurate and effective diagnostic systems is an important problem in many fields. Although systems for detection of presence/absence of the abnormality of interest often provide results on a multi-category scale, the simplest and often more clinically relevant scale is binary. When a new diagnostic system is being developed, the initial phase of the assessment often aims to substantially increase the sensitivity of a binary test even at the acknowledged cost of decreasing specificity below the conventional level. This is followed by a tuning phase in which the goal is to remove as many false positive findings as possible with minimal reduction in sensitivity.
If the characteristics of a new binary diagnostic test are in the acceptable range, one can assess the improvements over the conventional diagnostic test with the help of expected utilities. Alternatively, if the ROC curve of a new diagnostic system is available, one can evaluate whether, by tuning the new system, it is possible to achieve an objective improvement over the conventional diagnostic test. However, both the expected utility and ROC curve approaches are associated with additional expenses and possible loss of reliability of the inferences. The expected utility approach often requires specification of a typically subjective and difficult-to-deduce utility function. The ROC technique requires conducting an ROC study, which in some cases may cast doubt on the validity and usefulness of the resulting ROC curves.
We discuss the often overlooked scenarios that enable objective comparison of two binary diagnostic tests, one of which has higher sensitivity but lower specificity, without the need to specify a utility function or conduct an ROC study. The presented approach is conveniently formulated in terms of likelihood ratios, and for ROC inferences it exploits the assumption of concavity of the ROC curve, which is often justified, particularly in the case when human observer interpretation is an integral part of the diagnostic result.
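As a minimal illustration of the likelihood-ratio formulation mentioned in this abstract, the positive and negative likelihood ratios of a binary test follow directly from its sensitivity and specificity. This is a sketch of the standard definitions only, not the speaker's comparison procedure; the function name and the two operating points are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical operating points (illustrative numbers, not from the talk):
# a conventional test, and a new test with higher sensitivity
# but lower specificity.
conventional = likelihood_ratios(sensitivity=0.70, specificity=0.90)
new_test = likelihood_ratios(sensitivity=0.85, specificity=0.80)
```

Comparing the two pairs shows the trade-off the abstract describes: the hypothetical new test has a smaller LR+ but also a smaller LR-, so neither test dominates on both ratios.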
SEMINAR
DATE: Thursday, March 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: ENAR Student Presentations, Xing Yuan, Department of Biostatistics, University of Pittsburgh
TOPIC: A Meta-analytic Framework for Combining Incomparable Cox Proportional
Hazard Models Caused by Omitting Important Covariates
In Cox proportional hazard models with censored survival data, estimates of
treatment effects with some important covariates omitted will be biased
toward zero (Gail et al., 1984). This is especially problematic in
meta-analyses to combine estimates of parameters from studies where
different covariate adjustments are made. Presently, few constructive
solutions have been provided to address this issue. We propose a
meta-analytic framework for combining incomparable Cox models under both
aggregated patient data (APD) and individual patient data (IPD) structures.
For APD, two meta-regression models with indicators of different covariates
in Cox models are proposed to adjust the heterogeneity of treatment effects
across studies. Both parametric and nonparametric estimators for the pooled
treatment effect and the heterogeneity variance are presented and compared.
For IPD, we propose a fully augmented weighted estimator based on frailty
models accommodating covariate(s) omission from different studies, and
results are compared with estimates from the multiple imputation method. We
illustrate the advantages of our proposed analytic procedures over existing
methodologies by simulation studies.
SEMINAR
DATE: Thursday, March 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: ENAR Student Presentations, Sunghee Oh, Department of Biostatistics, University of Pittsburgh
TOPIC: Effects of missing value imputation on down-stream analyses in the
microarray data
Among high-throughput technologies, DNA microarray experiments
provide an enormous quantity of genes and arrays with biological information
related to disease. Despite advances and the popular usage of microarrays,
microarray experiments frequently produce multiple missing values due to
various experimental flaws. Thus, gene expression data contain some missing
entries, and a large number of genes may be affected. Many downstream
algorithms for gene expression analysis require a complete matrix as input.
For now, there exists no uniformly superior imputation method, and performance
depends on the structure and nature of the data set. In addition, imputation
methods have mostly been compared in terms of variants of RMSE (root mean
squared error), which compare true expression values to imputed values. The
drawback of RMSE-based evaluation is that the measure does not reflect the
true biological effect in downstream analyses. In this study, we
investigate how the missing value imputation process affects the biological
results of differentially expressed gene discovery, clustering and
classification. Quantitative measures reflecting the true biological
effects in each downstream analysis will be used to evaluate imputation
methods and will be compared to RMSE-based evaluation.
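For readers unfamiliar with the RMSE-based evaluation that this abstract criticizes, the measure can be sketched as follows: entries are masked as missing, imputed, and the imputed values are compared with the known true values at the masked positions. This is an illustrative sketch under that assumption; the function name and the toy matrices are hypothetical, and the speaker's own RMSE variants are not specified in the abstract.

```python
import math


def imputation_rmse(true_matrix, imputed_matrix, missing_mask):
    """Root mean squared error over the entries flagged as missing.

    All three arguments are lists of rows with identical shape;
    missing_mask holds True where a value was masked and then imputed.
    """
    squared_errors = []
    for true_row, imputed_row, mask_row in zip(true_matrix, imputed_matrix, missing_mask):
        for true_value, imputed_value, is_missing in zip(true_row, imputed_row, mask_row):
            if is_missing:
                squared_errors.append((true_value - imputed_value) ** 2)
    if not squared_errors:
        raise ValueError("missing_mask selects no entries")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy expression matrix (genes x arrays) with two masked entries.
true_expr = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
imputed_expr = [[1.0, 5.0, 3.0], [4.0, 5.0, 2.0]]
mask = [[False, True, False], [False, False, True]]
rmse = imputation_rmse(true_expr, imputed_expr, mask)
```

The measure summarizes only numerical closeness at the masked positions, which is why it cannot by itself indicate whether downstream gene lists, clusters, or classifiers change.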
SEMINAR
DATE: Thursday, March 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: ENAR Student Presentations, Minjae Lee, Department of Biostatistics, University of Pittsburgh
TOPIC: A multiple imputation approach for left-censored biomarkers with limits of
detection
We often encounter left-censored biomarker measurements subject to
limits of detection (LOD). Ignoring the censored observations or replacing
them with a naive imputation method leads to biased estimates of the
parameters in the regression analysis. Maximum likelihood methods have been
developed when the distribution of the biomarkers is assumed normal. However,
the computation can be very intensive or even prohibitive as the number of
censored biomarkers increases. Motivated by a sepsis study, where a panel
of biomarkers was measured to investigate the association between
sepsis and biomarkers such as cytokines and coagulation markers, we
propose a multiple imputation (MI) approach based on Tobit regression and
Gibbs sampling. We conduct a simulation study to evaluate the performance of
our MI approach and use a sepsis dataset for demonstration.
SEMINAR
DATE: Thursday, March 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: ENAR Student Presentations, Kwonho Jeong, Department of Biostatistics, University of Pittsburgh
TOPIC: Estimation and comparison of the predictiveness curve for repeated measures
design
In the Genetic and Inflammatory Markers of Sepsis (GenIMS) study (a large
multicenter cohort study), a number of pro-inflammatory and
anti-inflammatory continuous biomarkers associated with severe sepsis and
death have been measured longitudinally. In this work, we propose to
extend the theory of the predictiveness curve (PC), developed
by Huang, Pepe and Feng (Biometrics, 2007), to longitudinally measured
continuous biomarker data. We fitted the PC using longitudinally measured
GenIMS biomarker data to compare their effectiveness in predicting the risk
of death. The PC provides a common scale (zero to one) across various
markers for comparing the usefulness of a given marker relative to other
potential markers. Using this graphical tool, we have compared the population
distribution of risk of death for a number of competing biomarkers
associated with the disease. An extensive simulation study has been
undertaken to establish the properties of the proposed methods under
differing scenarios.
SEMINAR
DATE: Thursday, March 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: ENAR Student Presentations, Ya-Hsiu Chuang, Department of Biostatistics, University of Pittsburgh
TOPIC: Bayesian Model Averaging Approach in Health Effects Studies
Determining the lagged effects of ambient air levels of a pollutant on
cardiac distress is important in health effects studies. Standard model
selection procedures, in which a set of predictor variables is selected,
ignore the associated uncertainties and may lead to overestimation of effects.
The Bayesian model averaging approach takes account of model uncertainty by
combining information from all possible models. Zellner’s g-prior,
which contains a hyperparameter g, can account for model uncertainty and has
potential usefulness in this endeavor. We present results from a
sensitivity analysis for Bayesian model averaging with different calibrations
of the hyperparameter g, viz., the Akaike Information Criterion prior, the
Bayesian Information Criterion prior, and the Local Empirical Bayes estimate.
Data from the Allegheny County Air Pollution Study and simulated data sets
are used.
SEMINAR
DATE: Thursday, March 5, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: ENAR Student Presentations, Sachiko Miyahara, Department of Biostatistics, University of Pittsburgh
TOPIC: Weighted Kaplan-Meier estimator for two-stage treatment regimes
In two-stage randomization designs, patients are randomized to one of the
initial treatments, and at the end of the first stage, they are randomized
to one of the second-stage treatments depending on the outcome of the
initial treatment. Statistical inference for survival data from these
trials uses methods such as marginal mean models and weighted risk set
estimates. In this article, we propose a weighted Kaplan-Meier (WKM)
estimator based on the method of inverse-probability weighting and compare
its properties to those of the standard Kaplan-Meier (SKM) estimator, the
marginal mean model based (MM) estimator and the weighted risk set (WRS)
estimator. A simulation study reveals that the WKM estimator is
asymptotically unbiased and provides coverage rates similar to those of the
MM and WRS estimators. The SKM estimator, however, is biased when the second
randomization rates are not the same for the responders and non-responders to
initial treatment. The methods described are demonstrated by application to a
leukemia dataset.
SEMINAR
DATE: Thursday, April 2, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Jiashun Jin,
Associate Professor,
Department of Statistics,
Carnegie Mellon University
TOPIC: Higher Criticism Thresholding: Optimal Feature Selection when Useful Features
are Rare and Weak
SEMINAR
DATE: Thursday, April 9, 2009
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: John P. Klein,
Director and Professor,
Department of Population Health,
Division of Biostatistics,
Medical College of Wisconsin
TOPIC: Direct Regression Models for Survival Parameters Based on Pseudo-Values
We investigated the use of pseudo-values from a jackknife statistic constructed from a simple summary statistic as a way of developing direct
regression models of survival parameters. These pseudo-values, based on the
difference between the complete sample and leave-one-out estimator, are
used in a generalized estimating equation to obtain estimates of model
parameters. The approach can be applied to direct regression modeling of
the survival function over time, the cumulative incidence function for
competing risk data, the restricted mean survival time, the mean quality of
life, and the probabilities in a multistate model.
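To make the pseudo-value construction concrete, the sketch below computes jackknife pseudo-values for the survival probability S(t0), taking the Kaplan-Meier estimator as the summary statistic. It uses the standard jackknife form n * theta_hat - (n - 1) * theta_hat_without_i, which is an assumption about the exact definition used in the talk; the function names and the toy data are hypothetical, and the generalized estimating equation step is not shown.

```python
def kaplan_meier(times, events, t0):
    """Kaplan-Meier estimate of S(t0) from right-censored data.

    times  -- observed follow-up times
    events -- 1 (or True) if the event was observed, 0 if censored
    """
    times = list(times)
    events = list(events)
    n_at_risk = len(times)
    survival = 1.0
    for t in sorted(set(times)):
        if t > t0:
            break
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
        # Everyone observed at time t (event or censored) leaves the risk set.
        n_at_risk -= leaving
    return survival


def pseudo_values(times, events, t0):
    """Jackknife pseudo-values for S(t0), one per subject."""
    times = list(times)
    events = list(events)
    n = len(times)
    full_estimate = kaplan_meier(times, events, t0)
    values = []
    for i in range(n):
        # Leave-one-out estimate with subject i removed.
        loo_estimate = kaplan_meier(times[:i] + times[i + 1:],
                                    events[:i] + events[i + 1:], t0)
        values.append(n * full_estimate - (n - 1) * loo_estimate)
    return values


# Toy data without censoring: each pseudo-value reduces to the
# indicator that the subject survived beyond t0.
toy_pseudo = pseudo_values([1, 2, 3, 4], [1, 1, 1, 1], t0=2.5)
```

With censoring, the pseudo-values are no longer 0/1 indicators, but they are defined for every subject, which is what allows them to serve as responses in a regression model.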
Seminar Speakers
Fall Term 2008
Sounak Chakraborty, September 18, 2008
Tianxi Cai, October 3, 2008
Jae Won Lee, October 23, 2008
Robert Lyles, November 06, 2008
Mai Zhou, November 13, 2008
Eleanor Feingold, November 20, 2008
SEMINAR
DATE: Thursday, September 18, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Sounak Chakraborty, Assistant Professor, Department of Biostatistics,
University of Missouri-Columbia
TOPIC: Gene Expression-Based Glioma Classification Using Hierarchical Bayesian Kernel Machine Models
In modern clinical neuro-oncology, the diagnosis and classification of malignant gliomas remain problematic and effective therapies are still elusive. As patient prognosis and therapeutic decisions rely on accurate pathological grading or classification of tumor cells, extensive investigation is under way to accurately identify the types of glioma cancer. Unfortunately, many malignant gliomas are diagnostically challenging; these non-classic lesions are difficult to classify by histological features, thereby resulting in considerable interobserver variability and limited diagnosis reproducibility. In recent years, there has been a move towards the use of cDNA microarrays for tumor classification. These high-throughput assays provide relative mRNA expression measurements simultaneously for thousands of genes. A key statistical task is to perform classification via different expression patterns. Gene expression profiles may offer more information than classical morphology and may provide a better alternative to the classical tumor diagnosis schemes. The classification becomes more difficult when there are more than two cancer types, as with glioma.
This paper considers several Bayesian classification methods for the analysis of glioma cancer with microarray data based on reproducing kernel Hilbert space under the multiclass setup. We consider the multinomial logit likelihood as well as the likelihood related to the multiclass Support Vector Machine (SVM) model. It is shown that our proposed Bayesian classification models with multiple shrinkage parameters can produce a much more accurate classification scheme for glioma cancer than several existing classical methods. We have also proposed a Bayesian variable selection scheme for selecting the differentially expressed genes integrated with our model. This integrated approach improves classifier design by yielding simultaneous gene selection.
SEMINAR
DATE: Friday, October 3, 2008
TIME: 3:00p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Tianxi Cai,
Associate Professor,
Department of Biostatistics,
Harvard University
TOPIC: Evaluating Risk Stratification Rules and the Role of New Biomarkers in Risk Stratification
Accurate risk prediction is an important step in developing optimal
strategies for disease prevention and treatment. Based on the predicted
risks, patients can be stratified to different risk categories where each
category corresponds to a particular clinical intervention. Incorrect or
sub-optimal interventions are likely to result in unnecessary financial
and medical consequences. It is thus essential to account for the costs
associated with the clinical interventions when developing and evaluating
risk stratification rules for clinical use. In this article, we propose to
quantify the value of a risk stratification rule based on the total
expected cost attributed to incorrect assignment of risk groups due to the
rule. For any given set of cost parameters, we develop an optimal
stratification rule that minimizes the total expected cost over the entire
population of interest. Statistical inference procedures are developed for
evaluating and comparing risk stratification rules and examined through
simulation studies. When new biomarkers become available, it is crucial to
evaluate the incremental value of the new biomarkers in risk
stratification. We propose robust procedures for evaluating such
incremental values within various sub-populations. The proposed procedures
are illustrated with an example from the Cardiovascular Health Study.
SEMINAR
DATE: Thursday, October 23, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Jae Won Lee, PhD, Professor,
Department of Statistics,
College of Political Science & Economics,
Korea University, and
Visiting Professor,
Department of Biostatistics,
University of Pittsburgh
TOPIC: ISSAC: An Integrated Statistical System for Analyzing DNA Chip data
The remarkably increasing rate at which genomes are being sequenced has opened a new area of genome research, functional genomics, which is concerned with assigning biological function to DNA sequences. Novel biotechnologies such as cDNA microarrays and oligonucleotide chips are increasingly used to exploit DNA sequence data and yield information on the gene expression levels for entire genomes.
Many statisticians have developed statistical methods for designing microarray experiments and analyzing microarray gene expression data. However, most commercial software packages do not fully cover these statistical methods and instead emphasize visualization tools and user-friendly operating tools. It is still not easy for biologists to use the appropriate statistical software and interpret the results. Thus, we developed an integrated statistical software system for analyzing DNA chip data, ISSAC. It is as user-friendly as the commercial packages and implements many recently proposed statistical methods.
SEMINAR
DATE: Thursday, November 06, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Robert Lyles,
Associate Professor,
Director of Graduate Studies,
Department of Biostatistics and Bioinformatics,
Rollins School of Public Health,
Emory University
TOPIC: Associating Health Outcomes With Latent Subject-Specific Characteristics Based on Left- or Interval-Censored Longitudinal Exposure Data
Random effects in models for longitudinal data often provide meaningful subject-specific measures of exposure that might be associated with health-related outcomes. This motivates two-stage or, ideally, unified models to tie together the outcome and exposure information. In the two-stage approach, interest lies in the properties of predictors of random effects and their relative performances as covariates at the second stage. While more challenging, a unified modeling approach arguably provides an inherent and efficient adjustment for covariate measurement error. Either approach can face complications, however, when the exposure data are subject to detection limits, are coarse due to rounding, or are otherwise interval-censored. We consider an application in environmental and reproductive health that motivates likelihood-based approaches to handling censored exposure data and linking them to outcomes. We assess the use of empirical Bayes and empirical constrained Bayes predictions at the second stage, and compare the resulting estimated parameters of interest from the health effects model with those obtained under a joint modeling approach.
KEY WORDS: Detection limits, Environmental epidemiology, Measurement error, Prediction, Random effects, Reproductive health
SEMINAR
DATE: Thursday, November 13, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Mai Zhou,
Professor,
Department of Statistics,
University of Kentucky
TOPIC: Empirical Likelihood and Survival Analysis
Since the pioneering work of Thomas and Grunkemeier (1975) and Owen (1988),
empirical likelihood has been developed into a powerful nonparametric
inference approach and has become popular in the statistical literature.
There are many applications of empirical likelihood in survival
analysis. In this talk, we will first present a quick introduction
to the idea of empirical likelihood and then an overview of recent
developments of empirical likelihood methods for survival data.
In particular, we discuss empirical likelihood results for a general
mean functional of the distribution function, a general functional
of the hazard, the Cox proportional hazards model, and semiparametric
accelerated failure time (AFT) models.
Examples will be given illustrating the use of the R package emplik to
carry out empirical likelihood ratio tests with censored survival data.
SEMINAR
DATE: Thursday, November 20, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Eleanor Feingold,
Associate Professor,
Department of Human Genetics and Biostatistics,
University of Pittsburgh
TOPIC: Statistical Issues in Genetic Association Studies
New genomic technologies have made it possible (if not quite affordable) to conduct genetic association studies on a very large scale – up to a million genetic markers spanning the genome. But despite the growing number of genome-wide association studies, small-scale candidate gene studies still play a critical role in genetic epidemiology. Candidate gene studies are sometimes preferred due to cost, and sometimes because there is particularly strong prior information about candidate genes. In addition, replication studies for genome-wide associations are essentially candidate gene studies. Thus methods for improving the power and performance of candidate gene studies are critical. In this talk I will discuss the relationship between genome-wide association studies and candidate gene studies, and I will survey several areas in which I think there are important statistical problems that have not gotten enough attention. I will give some illustrative research results and discuss some open questions.