University of Pittsburgh
BIOST 2025: Biostatistics Seminar Notices


Seminar Notices Spring Term 2008
Seminar Notices Fall Term 2007


Seminar Speakers
Spring Term 2008

Daniel Nagin, January 17, 2008
Taeyoung Park, January 31, 2008
Zhezhen Jin, February 21, 2008
Fiona Callaghan, February 28, 2008
Chunrong Cheng, February 28, 2008
Sarah Haile, February 28, 2008
Jia Li, February 28, 2008
Tao Song, February 28, 2008
Rick Blakesley, February 28, 2008
Peter F. Thall, March 6, 2008
Hongwei Zhao, March 27, 2008
Mitchell H. Gail, April 3, 2008


SEMINAR

DATE: Thursday, January 17, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Daniel Nagin, Teresa and H. John Heinz III Professor of Public Policy and Statistics, Carnegie Mellon University
TOPIC: THE RELATIONSHIP BETWEEN FIRST IMPRISONMENT AND CRIMINAL CAREER DEVELOPMENT: A MATCHED SAMPLES COMPARISON

Using data from the Netherlands-based Criminal Career and Life-course Study, we examine the effect of first-time imprisonment between ages 18 and 38 on conviction rates in the three years immediately following the year of imprisonment. Unadjusted comparisons of those imprisoned and those not imprisoned will be biased because imprisonment is not meted out randomly. Selection processes will tend to make the imprisoned group disproportionately crime prone compared to the not imprisoned group. In this study we combine group-based trajectory modeling with risk set matching to balance a variety of measurable indicators of criminal propensity. We find that first-time imprisonment is associated with an increase in criminal activity in the three years following release. The effect of imprisonment is similar across offence types.



SEMINAR

DATE: Thursday, January 31, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Taeyoung Park, Assistant Professor, Department of Statistics, University of Pittsburgh
TOPIC: A Melodious Harmony of Dissonance: Efficiency of Incompatibility in Partially Collapsed Gibbs Samplers

Ever increasing computational power along with ever more sophisticated statistical computing techniques is making it possible to fit ever more complex statistical models. Among the popular, computationally intensive methods, the Gibbs sampler (Geman and Geman 1984) has been spotlighted because of its simplicity and power to effectively generate samples from a high-dimensional probability distribution. Despite its simple implementation and description, however, the Gibbs sampler is criticized for its sometimes slow convergence, especially when it is used to fit highly structured complex models. Here, we present partially collapsed Gibbs sampling strategies that improve the convergence by capitalizing on a set of functionally incompatible conditional distributions. Such incompatibility is generally avoided in the construction of a Gibbs sampler because the resulting convergence properties are not well understood. We, however, introduce three basic tools (marginalization, permutation, and trimming) which allow us to transform a Gibbs sampler into a partially collapsed Gibbs sampler with known stationary distribution and faster convergence.
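
Illustrative sketch (not part of the speaker's abstract): for readers unfamiliar with the ordinary Gibbs sampler that the talk takes as its starting point, the Python code below draws from a standard bivariate normal distribution by alternately sampling each coordinate from its full conditional. The target distribution, the function names, and the burn-in length are assumptions chosen for illustration. This is the standard sampler, not the partially collapsed variant described in the talk.

```python
import random


def gibbs_bivariate_normal(rho, n_draws, burn_in=500, seed=0):
    """Ordinary two-step Gibbs sampler for a standard bivariate normal
    with correlation rho (an illustrative target, chosen for simplicity)."""
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5  # sd of x | y and of y | x
    x, y = 0.0, 0.0
    draws = []
    for i in range(burn_in + n_draws):
        x = rng.gauss(rho * y, cond_sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_sd)  # y | x ~ N(rho * x, 1 - rho^2)
        if i >= burn_in:
            draws.append((x, y))
    return draws


def sample_correlation(draws):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(draws)
    mx = sum(x for x, _ in draws) / n
    my = sum(y for _, y in draws) / n
    sxy = sum((x - mx) * (y - my) for x, y in draws)
    sxx = sum((x - mx) ** 2 for x, _ in draws)
    syy = sum((y - my) ** 2 for _, y in draws)
    return sxy / (sxx * syy) ** 0.5
```

As rho approaches 1 the conditional standard deviation shrinks and successive draws move only in small steps, which is the kind of slow convergence the abstract refers to.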

We illustrate our partially collapsed Gibbs sampling strategies by fitting joint change-point (or joint segmentation) models for Poisson time-series data from different signals in astrophysics. The change-point models assume that observed data for each signal are generated from an inhomogeneous Poisson process with constant intensity within unknown time blocks. Because the number of time blocks is unknown and depends on change points, the standard Gibbs sampler constructed to fit the models is not computationally feasible. A typical strategy to avoid the infeasible steps in the Gibbs sampler is to marginalize over Poisson intensities depending on the unknown time blocks. Such marginalization, however, results in a set of incompatible conditional distributions, so a partially collapsed Gibbs sampler must be designed.



SEMINAR

DATE: Thursday, February 21, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Zhezhen Jin, Associate Professor, Department of Biostatistics, Mailman School of Public Health, Columbia University
TOPIC: Regression analysis of Censored Data

In this talk, I will present estimation and inference for right-censored data based on a semiparametric linear regression model, the accelerated failure time (AFT) model, which has the form of the ordinary linear regression model with a completely unspecified distribution for the random errors. Since the existing estimating functions for the regression parameters are nonregular, i.e., non-smooth and non-monotone, it is challenging to obtain point estimates and their variance estimates. I will review recently developed estimation methods and present a user-friendly general S-Plus/R program package implementing these methods along with real examples. Unsolved issues and problems will also be discussed.



SEMINAR

DATE: Thursday, February 28, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Fiona Callaghan, University of Pittsburgh
TOPIC: Classification Trees for Survival Data with Competing Risks

Classification trees are the most popular tool for categorizing individuals into groups and subgroups based on particular outcomes of interest. To date, trees have not been developed to deal with survival data involving competing risks. In this study, we propose two classification trees to analyze data with competing risks: a tree that maximizes between-node heterogeneity and a tree that maximizes within-node homogeneity. After we describe the methods used in growing and pruning the trees, we demonstrate and compare their performance with simulations in a variety of competing risk model configurations. We also illustrate their use by analyzing survival data concerning patients who had end-stage liver disease and were on the waiting list to receive a liver transplant.



SEMINAR

DATE: Thursday, February 28, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Chunrong Cheng, University of Pittsburgh
TOPIC: Carrying prediction models across microarray data sets generated by different labs and different platforms

The reproducibility of microarray experiments has greatly improved in the past decade, and their application in biomedical research is increasingly prevalent. Multiple published studies investigating the same disease often use different array platforms and are carried out in different labs. Similarly high disease prediction accuracies are often reported in these studies; however, applying a prediction model established in one study to another usually yields poor performance. We investigated the application of gene-wise normalization following the commonly practiced global sample-wise normalization. The proposed gene-wise normalization often dramatically increases the prediction accuracies in cross-dataset prediction. We further propose a bootstrapping method and an alternative analytical method to adjust for differential sample ratios of disease groups that may affect the performance of gene-wise normalization. Simulation results and application to three lung cancer data sets show significant and robust improvement from our method. A simple calibration scheme is developed to apply our method to future clinical trials. The number of calibration samples needed is estimated from existing studies and suggested for application to future studies.



SEMINAR

DATE: Thursday, February 28, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Sarah Haile, University of Pittsburgh
TOPIC: Parametric Inference for the Cumulative Incidence Function

The cumulative incidence function is of great importance when analyzing data where competing risks are present. We present a new distribution for parametric inference on competing risks. As the cumulative incidence function is used to model a subset of events, it is logical to model them using a distribution which is improper. The 4-parameter Gompertz distribution proposed is very flexible and permits several different hazard shapes, including unimodal, and can be extended to include covariates. The model is applied to data from National Surgical Adjuvant Breast and Bowel Project breast cancer trial B-14.



SEMINAR

DATE: Thursday, February 28, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Jia Li, University of Pittsburgh
TOPIC: Meta-analysis for identifying signature genes in the integration of multiple genomic studies

With the availability of a large number of expression profiles, the need for meta-analyses to integrate different types of microarray data is obvious. For detection of differentially expressed genes, most of the current efforts are focused on comparing and evaluating gene lists obtained from each individual dataset. The statistical framework is often not rigorously formulated, and genuine information integration is rarely performed. In this paper, we tackle two often asked biological questions: "Which genes are significant in one or more data sets?" and "Which genes are significant in all data sets?". We illustrate two statistical hypothesis settings and propose a best weighted statistic and a maximum p-value statistic for the two questions, respectively. Permutation analysis is then applied to control the false discovery rate. The proposed test statistic is shown to be admissible. We further show the advantage of our proposed test procedures over existing methods by power comparison, a simulation study, and real data analyses of a multiple-tissue energy metabolism mouse model data set and prostate cancer data sets.



SEMINAR

DATE: Thursday, February 28, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Tao Song, University of Pittsburgh
TOPIC: A GENERALIZED NONPARAMETRIC APPROACH OF COMPARING FROC SYSTEMS

ROC curve analysis is a widely used method of comparing diagnostic imaging systems. One formulation of the area under the ROC curve is based on the probability of selecting the abnormal subject from a random pair of normal-abnormal subjects. In a Free Response ROC (FROC) process, which requires searching and marking the locations of all suspected abnormalities with a level of suspicion (rating), normal subjects may have multiple false positives and abnormal subjects may have multiple true positives and false positives. We consider a general approach that uses as a summary index the area under an ROC curve derived from an FROC process. The method entails specifying a function that is used to select the abnormal subject from the normal-abnormal pair. A previously proposed index based on the highest rating on a subject can be viewed as a special case of this method. We consider various discriminating functions including average score and stochastic dominance. Simulation studies are conducted to compare the statistical power of these methods to distinguish between two FROC processes.
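
Illustrative sketch (not part of the speaker's abstract): the pairwise formulation of the area under the ROC curve can be written in a few lines of Python. Each subject's list of ratings is reduced to a single score, and the index is the fraction of normal-abnormal pairs in which the abnormal subject scores higher, with ties counted as one half. The function names, the tie rule, and the convention that a subject with no marks receives the lowest possible score are assumptions made here for illustration; the stochastic dominance function mentioned in the abstract is not shown.

```python
def mean(values):
    """Average rating of a subject's marks."""
    return sum(values) / len(values)


def subject_score(ratings, reducer):
    """Collapse one subject's ratings into a single score.
    Assumption: a subject with no marks gets the lowest possible score."""
    return reducer(ratings) if ratings else float("-inf")


def pairwise_auc(normal_subjects, abnormal_subjects, reducer=max):
    """Fraction of normal-abnormal pairs in which the abnormal subject
    has the higher score; ties count as one half."""
    wins = 0.0
    for abnormal in abnormal_subjects:
        a = subject_score(abnormal, reducer)
        for normal in normal_subjects:
            n = subject_score(normal, reducer)
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(normal_subjects) * len(abnormal_subjects))


# Hypothetical ratings: each inner list holds the marks on one subject.
normal = [[1, 2], [], [3]]
abnormal = [[4], [2, 5], [1]]
auc_highest = pairwise_auc(normal, abnormal, reducer=max)
auc_average = pairwise_auc(normal, abnormal, reducer=mean)
```

Passing `max` as the reducer corresponds to the highest-rating index, and passing `mean` corresponds to the average-score function.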



SEMINAR

DATE: Thursday, February 28, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Rick Blakesley, University of Pittsburgh
TOPIC: Considering P-Value Dependence in a Stepwise Multiplicity Adjustment Method

Multiple hypothesis testing with correlated outcomes has proven challenging. Nonparametric multiplicity adjustment methods that incorporate resampling have demonstrated type I error protection and good power, but implementation remains an obstacle. Parametric methods derived from the Bonferroni method have demonstrated power, but conservatively control type I error with increasing correlation coefficients between outcomes. In contrast, methods derived from the Sidak method incorporate correlation coefficients, though with unstable type I error protection. We propose a parametric method that combines and refines elements of existing methods to control type I error while considering correlation. These elements include the Sidak functional form and the Hochberg stepwise component. We also use a refined adjustment component, similar to the Dubey/Armitage-Parmar and R2 Adjustment methods, which incorporates a measure of dependence between the p-values under the null hypothesis. We conducted a simulation study to estimate the type I error (familywise error) and power rates of the proposed method and ten existing methods across many combinations of simulation trial parameters, with the chosen rejection threshold α = 0.05. The proposed method demonstrated type I error rates within [0.047, 0.057] across the conditions explored, with power rates similar to the Hommel and step-down minP methods and exceeding all other methods with conservative type I error performance. While not proven to control type I error in a theoretical context, the proposed parametric method has corroborated, through simulation, the desired properties of a multiplicity adjustment method.
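
Illustrative sketch (not part of the speaker's abstract): the Python code below shows the textbook forms of three building blocks named above, namely the Bonferroni per-test threshold, the Sidak per-test threshold (which assumes independent tests), and the Hochberg step-up procedure. It does not implement the proposed method or its dependence adjustment; the function names and example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, m):
    """Per-test threshold alpha / m for m tests."""
    return alpha / m


def sidak_threshold(alpha, m):
    """Per-test threshold 1 - (1 - alpha)^(1/m), exact for independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up: find the largest rank k with
    p_(k) <= alpha / (m - k + 1) and reject the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    n_reject = 0
    for k in range(m, 0, -1):  # start from the largest p-value
        if p_values[order[k - 1]] <= alpha / (m - k + 1):
            n_reject = k
            break
    reject = [False] * m
    for i in order[:n_reject]:
        reject[i] = True
    return reject


# Hypothetical p-values for three outcomes.
example = [0.06, 0.02, 0.001]
decisions = hochberg_reject(example, alpha=0.05)
```

With these example values, a single-step Bonferroni rule at 0.05 / 3 would reject only the smallest p-value, while the step-up rule also rejects the second smallest.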



SEMINAR

DATE: Thursday, March 6, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Peter F. Thall, PhD, Dept. of Biostatistics, Division of Quantitative Sciences, University of Texas, M.D. Anderson Cancer Center
TOPIC: Patient-Specific Dose-Finding Based On Bivariate Outcomes and Covariates

A Bayesian method for covariate-adjusted (“individualized”) dose-finding based on a bivariate (efficacy, toxicity) outcome is presented. The method extends Thall and Cook (Biometrics 60:684-693, 2004). Implementation requires an informative prior on covariate effects, obtained from historical data or by elicitation. In the underlying probability model, dose and covariate main effects and dose-covariate interactions are included in the linear components of the marginal efficacy and toxicity outcome probabilities. For each of a representative set of covariate vectors, limits on the probabilities of efficacy and toxicity specified by the physician are used to construct bounding functions that are used to determine the acceptability of each dose for each possible covariate vector. The physician also must specify equally desirable target (efficacy, toxicity) probability pairs for a reference patient’s covariates to characterize trade-offs between the two outcomes. Each patient's dose is chosen to optimize the efficacy-toxicity trade-off for his/her specific covariates. Because the selected doses are covariate-specific and the method is sequentially outcome-adaptive, different patients may receive different doses at the same interim point in the trial, and some initially eligible patients may have no acceptable dose. The method is illustrated by application to a phase I/II trial in acute leukemia.



SEMINAR

DATE: Thursday, March 27, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Hongwei Zhao, Sc.D., Associate Professor of Biostatistics, University of Rochester Medical Center, Department of Biostatistics and Computational Biology
TOPIC: Regression Analysis of Mean Quality-Adjusted Lifetime with Censored Data


In clinical trials of chronic diseases such as AIDS, cancer or cardiovascular diseases, it has been realized that it is not enough to consider only the overall survival time; the quality of life is also very important. The quality-adjusted lifetime (QAL) is a measure that combines both the quantity and quality of a patient's lifetime and thus has received more and more attention. Due to the induced informative censoring problem, the techniques that are commonly used for analyzing survival time are no longer valid. We will propose a new method for studying the regression problem for the mean QAL when the data are subject to right censoring. We allow a very general form for the mean model as a function of covariates. The applicability of our method is demonstrated by both simulation experiments and a data example from a breast cancer clinical trial.



SEMINAR

DATE: Thursday, April 3, 2008
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Mitchell H. Gail, M.D., Ph.D., Senior Investigator, Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute
TOPIC: Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies

Some case-control genome-wide association studies (CCGWASs) select promising single nucleotide polymorphisms (SNPs) by ranking corresponding p-values, rather than by applying the same p-value threshold to each SNP. For such a study, we define the detection probability (DP) for a specific disease-associated SNP as the probability that the SNP will be “T-selected”, namely have one of the top T largest chi-square values (or smallest p-values) for trend tests of association. The corresponding proportion positive (PP) is the fraction of selected SNPs that are true disease-associated SNPs. We study DP and PP analytically and via simulations, both for fixed and for random effects models of genetic risk. DP increases with genetic effect size and case-control sample size, and decreases with the number of non-disease-associated SNPs, mainly through the ratio of T to N, the total number of SNPs. We show that DP increases very slowly with T, and the increment in DP per unit increase in T declines rapidly with T. DP is also diminished if the number of true disease SNPs exceeds T. For a genetic odds ratio per minor disease allele of 1.2 or less, even a CCGWAS with 1000 cases and 1000 controls requires T to be impractically large to achieve an acceptable DP, leading to PP values so low as to make the study futile and misleading. Extensions of these methods show that multi-stage designs have appreciably lower DP than a one-stage design with the same number of cases and controls if the proportion of cases and controls in the first stage of the multistage design is less than 25%.
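
Illustrative sketch (not part of the speaker's abstract): the detection probability can be approximated by Monte Carlo under strong simplifying assumptions. The Python code below assumes N independent SNPs, exactly one disease-associated SNP, and 1-df chi-square statistics generated as squared normal deviates, with the disease SNP's deviate shifted by a hypothetical noncentrality value `delta`. None of these settings come from the talk, and the function name is invented for illustration.

```python
import random


def detection_probability(n_snps, top_t, delta, n_reps=1000, seed=0):
    """Estimate the chance that the single disease SNP has one of the
    top_t largest chi-square statistics among n_snps independent SNPs."""
    rng = random.Random(seed)
    detected = 0
    for _rep in range(n_reps):
        # Disease SNP: squared normal deviate with mean delta (assumed effect).
        disease_stat = (rng.gauss(0.0, 1.0) + delta) ** 2
        # Count null SNPs whose statistic beats the disease SNP.
        n_larger = 0
        for _snp in range(n_snps - 1):
            if rng.gauss(0.0, 1.0) ** 2 > disease_stat:
                n_larger += 1
        if n_larger < top_t:  # disease SNP ranks within the top T
            detected += 1
    return detected / n_reps


dp_strong = detection_probability(n_snps=200, top_t=5, delta=10.0)
dp_null = detection_probability(n_snps=200, top_t=5, delta=0.0)
```

When `delta` is zero the disease SNP is exchangeable with the others, so the estimate should be close to T/N, which matches the abstract's point that DP depends on the ratio of T to N.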



Seminar Speakers
Fall Term 2007

Jing Wu, September 20, 2007
Heejung Bang, September 27, 2007
Lu Tian, October 11, 2007
Jason Connor, October 18, 2007
Dulal K. Bhaumik, November 1, 2007
André Rogatko, December 13, 2007


SEMINAR

DATE: Thursday, September 20, 2007
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Jing Wu, Assistant Professor of Statistics, Department of Statistics, Purdue University
TOPIC: COMPUTATION-BASED DISCOVERY OF CIS REGULATORY MODULES BY HIDDEN MARKOV MODEL

A key component in genome sequence analysis is the identification of regions of the genome that contain regulatory information. In higher eukaryotes, this information is organized into modular units called cis-regulatory modules. Each module contains multiple binding sites for a specific combination of several transcription factors. In this article, we propose a hidden Markov model (HMM) to identify transcription factor binding sites (TFBSs) and cis-regulatory modules (CRMs). For a given genomic sequence, we first select potential TFBSs from a large database (e.g., TRANSFAC), then construct an HMM where the TFBSs are only counted when they occur within a specialized CRM state. A novel feature of the proposed method is that it does not assume a small set of TFBSs for a given gene. Instead, the method utilizes information from a large collection of well-characterized TFBSs and is therefore computationally more efficient and robust than de novo methods. Our approach is applied to three data sets with experimentally evaluated TFBSs. The method shows better specificity and sensitivity than other similar computational tools in identifying CRMs and TFBSs. This is joint work with Dr. Jun Xie in the Department of Statistics at Purdue University.



SEMINAR

DATE: Thursday, September 27, 2007
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Heejung Bang, Assistant Professor of Public Health, Weill Cornell Medical College, Cornell University
TOPIC: DE-MYSTIFYING MEDICAL COST ESTIMATORS: WHAT WE FOUND AFTER 10 YEARS

In clinical trials comparing different treatments and observational studies in health economics and outcomes research, medical costs are now frequently collected and analyzed. Since Lin et al. (1997) first identified the problem of applying standard analysis techniques such as the sample mean and the Kaplan-Meier estimator to censored cost data, many new methods have been proposed. In this talk, I will review valid methods for statistical estimation and inference that have been developed over the last 10 years and show what Zhao, Bang, Wang and Pfeifer (2007) recently discovered: analytic relationships among several widely adopted medical cost estimators that are seemingly different.



SEMINAR

DATE: Thursday, October 11, 2007
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Lu Tian, Assistant Professor, Department of Preventive Medicine, Northwestern University
TOPIC: MODEL EVALUATION BASED ON THE SAMPLING DISTRIBUTION OF ESTIMATED ABSOLUTE PREDICTION ERROR

The construction of a reliable, practically useful prediction rule for future responses is heavily dependent on the "adequacy" of the fitted regression model. In this article, we consider the absolute prediction error, the expected value of the absolute difference between the future and predicted responses, as the model evaluation criterion. This prediction error is easier to interpret than the average squared error and is equivalent to the mis-classification error for a binary outcome. We show that the prediction error can be consistently estimated via the re-substitution and cross validation methods even when the fitted model is not correctly specified.

Furthermore, we show that the resulting estimators are asymptotically normal. When the prediction rule is "unsmooth", the variance of the above normal distribution can be estimated well with a perturbation-resampling method. With real examples and an extensive simulation study, we demonstrate that the interval estimates obtained from the above normal approximation for the prediction errors provide much more information about model adequacy than their point estimate counterparts.
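
Illustrative sketch (not part of the speaker's abstract): the two point estimators of absolute prediction error mentioned above, re-substitution and cross validation, can be computed as follows in Python for a simple least-squares line. The choice of model, the interleaved fold assignment, and all function names are assumptions for illustration; the interval estimates and perturbation-resampling variance described in the abstract are not shown.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def mean_abs_error(model, xs, ys):
    """Average absolute difference between observed and predicted responses."""
    intercept, slope = model
    return sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)) / len(xs)


def resubstitution_error(xs, ys):
    """Fit and evaluate on the same data (the apparent error)."""
    return mean_abs_error(fit_line(xs, ys), xs, ys)


def cross_validation_error(xs, ys, n_folds=5):
    """K-fold estimate: each observation is predicted by a model
    fitted without its own fold (folds assigned by index modulo n_folds)."""
    n = len(xs)
    total = 0.0
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        total += sum(abs(ys[i] - (intercept + slope * xs[i])) for i in test)
    return total / n


# Hypothetical data: a straight line with alternating +1 / -1 disturbances.
xs = [float(x) for x in range(10)]
ys = [2.0 * x + 1.0 + (-1.0) ** int(x) for x in xs]
apparent = resubstitution_error(xs, ys)
cross_validated = cross_validation_error(xs, ys)
```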



SEMINAR

DATE: Thursday, October 18, 2007
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Jason Connor, Statistical Scientist, Berry Consultants
TOPIC: ETHICS & EXECUTION OF ADAPTIVE BAYESIAN CLINICAL TRIALS: A UTERINE CANCER CASE STUDY

I describe a brief history of adaptive designs and then provide a case study in uterine cancer. This includes the ethics of adaptive randomization and statistical benefits of the Bayesian paradigm. In the case study I illustrate components of a trial we may select to adapt during the trial and show ways to present the novel designs to clinicians, IRBs, and regulatory agencies. I illustrate how adaptive designs often provide shorter, less expensive trials in which a greater proportion of patients receive the most efficacious treatment.



SEMINAR

DATE: Thursday, November 1, 2007
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: Dulal K. Bhaumik, Professor, Department of Psychiatry, Division of Epidemiology and Biostatistics, University of Illinois at Chicago
TOPIC: SAMPLE SIZE DETERMINATION FOR HIERARCHICAL LONGITUDINAL DESIGNS WITH DIFFERENTIAL ATTRITION RATES

We consider the problem of sample size determination for three-level mixed-effects linear regression models for the analysis of clustered longitudinal data. Three-level designs are used in many areas, but in particular, multi-center randomized longitudinal clinical trials in medical or health-related research. In this case, level 1 represents measurement occasion, level 2 represents subject, and level 3 represents center. The model we consider involves random effects of the time trends at both the subject level and the center level. In the most common case, we have two random effects (constant and a single trend), at both subject and center levels.

The approach presented here is general with respect to sampling proportions, number of groups, and attrition rates over time. We derive sample size requirements (i.e., power characteristics) for a test of treatment-by-time interaction(s) for designs based on either subject-level or cluster-level randomization. The general methodology is illustrated using two characteristic examples.



SEMINAR

DATE: Thursday, December 13, 2007
TIME: 3:30p.m.
PLACE: A-115 Crabtree Hall, Graduate School of Public Health
SPEAKER: André Rogatko, Associate Director for Biostatistics Research and Informatics, Winship Cancer Institute, Professor, Department of Biostatistics, Rollins School of Public Health, Professor, Department of Hematology and Oncology, School of Medicine, Emory University
TOPIC: INDIVIDUALIZED PATIENT DOSING IN CANCER CLINICAL TRIALS

We will discuss EWOC (Escalation with Overdose Control), the first statistical method to directly incorporate formal safety constraints into the design of cancer phase I trials. The method controls the frequency of overdosing by selecting dose levels for use in the trial so that the predicted proportion of patients administered a dose exceeding the MTD (Maximum Tolerated Dose) is equal to a specified upper bound. We will also discuss an extension of EWOC that permits the utilization of information concerning individual patient differences in susceptibility to treatment. This is the first method described to design cancer clinical trials that not only guides dose escalation but also permits personalization of the dose level for each specific patient. The method adjusts doses according to patient-specific characteristics and allows the dose to be escalated as quickly as possible while safeguarding against overdosing. The extension of EWOC to covariate utilization was implemented in five FDA-approved phase I studies that will be discussed. A new paradigm for drug development based on individual dosing will be proposed.

© 2001-2007
Dept. of Biostatistics, University of Pittsburgh

Program Contact:
Registrar, biostat@pitt.edu

Webmaster:
Susan Grasky, BSIS




Department of Biostatistics, 130 Desoto Street, 311 Parran Hall,
Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261
Phone: (412) 624-3022 Fax: (412) 624-2183

Revised on February 22, 2008