Statistics Seminar S15

Date	Speaker	Institution	Title	Abstract
Jan 22	Georgiy Bobashev	RTI International	myEpi. Epidemiology of one	The recent introduction of technological innovations such as web-based and mobile applications provides a novel way to collect and monitor data on risky behaviors within a single individual. For example, a number of applications allow one to monitor smoking, alcohol consumption, and drug use, as well as non-risky behaviors such as exercise, sleep, and food consumption. Traditional epidemiology requires that results be applicable to some pre-defined population. It often becomes challenging, and even unnecessary, to define such a population if the focus is on helping a specific individual. I argue that a single individual can be viewed as an entire population of events that describe behavior and health-related outcomes. I will show how traditional statistical methods used in epidemiology (e.g. survival analysis, time series analysis), which are usually applied to populations of humans, can be applied to a single individual and thus used for self-monitoring and forecasting of epidemic outbreaks within an individual. I will illustrate the similarity between the features of traditional epidemiology (e.g. infectious diseases) and studies of a within-person population of risky behavior events. I will discuss the application of these methods to a number of subject areas.
Jan 29	Qiongxia Song	Mathematical Sciences, UTD	Sparse SVD and Visualization of Variable Interaction Networks	For functional time series with physical dependence, we construct confidence bands for the mean function. Physical dependence is a general dependence framework that slightly relaxes the conditions of m-approximable dependence. We estimate the mean functions of functional time series via a spline smoothing technique. The confidence bands are constructed based on a long-run variance decomposition and a strong approximation, which hold under mild regularity conditions. Simulation experiments provide strong evidence that corroborates the asymptotic theory. Additionally, an application to S&P 500 index data demonstrates a non-constant volatility mean function at a certain significance level. This is joint work with Ming Chen.
Feb 12	Rajesh Nandy	University of North Texas	Estimating the intrinsic dimensionality of multivariate data in the presence of strong correlated noise	Estimating the true dimension of multivariate data in the presence of white noise using an information-theoretic approach is a well-known problem, and there are several successful approaches to it. However, these methods usually fail when the signal strength is weak and the noise is correlated. For correlated noise, if the noise correlation matrix can be estimated from noise-only data, the correlated noise can be whitened and the conventional approaches may still be applied. In practice, noise-only data may not be available. For that case, a novel and robust approach using independent component analysis is presented, which can be shown to be far superior to traditional methods for correlated noise. Results will be provided for a comparative study using simulated data.
Mar 26	Patrick T. Brandt	School of Economic, Political and Policy Sciences, The University of Texas, Dallas	Forecasting Conflicts: Long and Short Term Predictions Based on Different Training Set Considerations	How are conflict forecasts affected by the choice of the training set? Remarkably, despite the many forecasts currently being produced using such data, there has been little discussion of this issue. A forecast validation is conducted using event data for the Levant. Across several forecasting models, the size of the training set is varied, and rolling and cumulative samples are used. The forecast densities produced are then scored using a continuous ranked probability score to determine forecast quality (Brandt, Freeman, Schrodt 2014). The results show that shorter rolling training sets can perform as well as or better than longer training sets that use all of the available data. A Markov-switching Bayesian vector autoregression (MS-BVAR) model outperforms several other forecasting models. The use of these shorter, rolling samples reduces the computational burden of producing forecasts, which is particularly important when using non-linear, computationally intensive methods such as MS-BVARs. It also means that conflict forecasts for other regions may not need long historical datasets that span decades.
Apr 9	Julia Kozlitina	UT Southwestern	A Robust Distribution-Free Test for Genetic Association Studies of Quantitative Traits	In association studies of quantitative traits, the association of each genetic marker with the trait of interest is typically tested using the F-test assuming an additive genetic model. In practice, the true model is rarely known, and specifying an incorrect model can lead to a loss of power. For case-control studies, the maximum of test statistics optimal for additive, dominant, and recessive models has been shown to be robust to model misspecification. The approach was later extended to quantitative traits. However, the existing procedures assume that the trait is normally distributed; when that assumption is violated, they may not maintain correct type-I error rates and can also have reduced power. Here, we introduce a maximum (MAX3) test that is based on ranks and is therefore distribution-free. We examine the behavior of the proposed method using a Monte Carlo simulation with both normal and non-normal data and compare the results to the usual parametric procedures and other nonparametric alternatives. We show that the rank-based maximum test has favorable properties relative to other tests, especially in the case of symmetric distributions with heavy tails. We illustrate the method with data from a real association study of symmetric dimethylarginine (SDMA). This is joint work with Dr. William R. Schucany.
Apr 16	Michael Braun	Southern Methodist University	Transaction Attributes and Customer Valuation	Dynamic customer targeting is a common task for marketers actively managing customer relationships. Such efforts can be guided by insight into the return on investment from marketing interventions, which can be derived as the increase in the present value of a customer’s expected future transactions. Using the popular latent attrition framework, one could estimate this value by manipulating the levels of a set of nonstationary covariates. We propose such a model that incorporates transaction-specific attributes and maintains standard assumptions of unobserved heterogeneity. We demonstrate how firms can approximate an upper bound on the appropriate amount to invest in retaining a customer and demonstrate that this amount depends on customers’ past purchase activity, namely the recency and frequency of past customer purchases. Using data from a B2B service provider as our empirical application, we apply our model to estimate the revenue lost by the service provider when it fails to deliver a customer’s requested level of service. We also show that the lost revenue is larger than the corresponding expected gain from exceeding a customer’s requested level of service. We discuss the implications of our findings for marketers in terms of managing customer relationships.

Natural Sciences and Mathematics

Mathematical Sciences
