*Program Head, Cognitive Science BS and Applied Cognition and Neuroscience MS*

PhD, Brown University

Quantitative Models in Cognition

GR 4.814

972-883-2423

**My research interests may be broadly characterized in terms of the development, extension, and understanding of formal mathematical models of perceptual and cognitive processes.** My specific research interests can be conveniently divided into two areas of work: (1) mathematical analysis and design of artificial neural networks, and (2) mathematical models of human language and human text comprehension.

**Mathematical Analysis and Design of Artificial Neural Networks**

The underlying psychological assumptions of most artificial neural network models of cognitive and neural processes are often obscured by how such models are constructed, presented, discussed, and evaluated. A common thread throughout my research program over the past 15 years has been to "rebuild" the neural network modeling paradigm so that neural network modeling assumptions are interpretable, theoretically well-grounded, empirically identifiable, and testable. My methodology for approaching this problem draws heavily upon classical engineering mathematics such as nonlinear dynamical systems theory, nonlinear optimization theory, and statistical pattern recognition. Examples of my work in this area include my book *Mathematical Methods for Neural Network Analysis and Design* (MIT Press, 1996), my analysis of the BSB neural net model published in the Journal of Mathematical Psychology (Golden, 1993), and a Psychometrika article (Golden, 2003) that describes a new statistical test for comparing competing models which may be misspecified or nonnested.

**Mathematical Models of Human Language and Text Comprehension**

During the past decade, I have focused my attention on developing a new confirmatory constrained categorical time-series data analysis methodology, called KDC (Knowledge Digraph Contribution) analysis, for testing specific hypotheses about knowledge digraphs (i.e., a general class of semantic networks). KDC analysis uses the order in which propositions appear in recall, summarization, question-answering, and other types of free response data to obtain a more revealing picture of the nature of the by-products of human comprehension processes. Golden (1998) provides the best summary of the current version of this statistical methodology. Durbin, Earwood, and Golden (2000) show how a simple probabilistic computational linguistics model based upon hidden Markov models can be trained to automatically and consistently semantically annotate human protocol data in order to support KDC analysis. The mathematical foundations of KDC theory are based largely upon the tools and techniques of asymptotic statistical theory and nonlinear optimization theory which I have exploited and developed in my investigations of artificial neural network models.

Currently, research in this area is being funded by an Information Technology Research (ITR) Award (in the area of Educational Technology) from the National Science Foundation to develop the ARCADE (Automated Reading Comprehension Assessment and Diagnostic Evaluation) system. The long-term goal of the ARCADE system is to develop a nationwide web-based system in which the answers of grade school, middle school, and high school students to essay questions are automatically semantically analyzed and then used to make suggestions to classroom teachers in order to enhance student learning experiences in the classroom. The project involves research in the areas of cognitive psychology, computer science, electrical engineering, educational technology, and computational linguistics.

**Overview**

My specific research interests are divided into two complementary areas. The first is basic research in understanding statistical machine learning algorithms. Many important mathematical models of human and animal behavior, as well as mathematical models in the field of computational neuroscience, have been developed using statistical machine learning algorithms. An improved understanding of statistical machine learning algorithms therefore translates directly into an improved understanding of a large class of mathematical models in the behavioral and brain sciences. The second is the empirical application of statistical machine learning algorithms in the fields of mathematical psychology and computational neuroscience, as well as in other areas including biomedical data analysis, artificial intelligence, computer science, and control theory. These two lines of research are highly synergistic. The theoretical statistical machine learning research component supports new solutions to mathematical modeling applications, and the applications component supports new directions in theoretical statistical machine learning research.

**Theoretical Research in Statistical Machine Learning**

A recent book chapter published in a Festschrift for the late Halbert L. White discusses an entirely new collection of mathematical tools for determining if a given probability model is an adequate representation of the process which generated the data (Golden et al., 2013). Despite best intentions, however, the possibility of model misspecification (i.e., the presence of flaws in a probability model) is always present. This observation has motivated an additional research strand which is concerned with the development of methods for robust estimation and inference in the presence of possible model misspecification and partially observable data (see Golden, 1995, for a review). In addition, many fundamental modeling problems in the brain and behavioral sciences, as well as in machine learning, must deal with state variables that are only partially observable, due to unavoidable limitations of measurement methodologies and/or lack of knowledge of model structure. My recent research is concerned with the development of a unified mathematical theory for supporting robust estimation and inference in the simultaneous presence of both model misspecification and partially observable state variables. Another on-going research thread, which dates back to Golden (2000, 2003), is my long-term interest in model selection criteria for determining which of two competing probability models provides a better representation of the data generating process. For over two decades I have pursued the development of a unified probabilistic framework for interpreting learning machines (Golden, 1988a, 1988b, 1996; Rumelhart et al., 1996). My book *Mathematical Methods for Neural Network Analysis and Design* (1996, MIT Press) provides a useful introduction to my research program in the area of statistical machine learning, which includes work in dynamical systems theory, deterministic and stochastic nonlinear optimization theory, deterministic and stochastic control theory, Markov fields, and statistical pattern recognition. Selected list of theoretical statistical machine learning publications.

**Text Comprehension and Memory**

An on-going component of my research program involves investigating human text comprehension and memory. Currently, I am evaluating a constrained multinomial logistic regression time-series modeling methodology for the purpose of analyzing semantic structural relations in human recall, summarization, and question-answering data (Ismaili and Golden, 2008; Golden, 1998) as well as semi-automated methods for the analysis of human free response data (Golden and Ghiasinejad, 2013; Durbin, Earwood, and Golden, 2000).

Visit Text Comprehension Website

Visit Knowledge Digraph Contribution Analysis Website

**Statistical Machine Learning Applications**

My research program additionally involves the application of statistical machine learning methods to a wide range of mathematical modeling problems. These empirical applications are an important supporting component of the theoretical research program, which is concerned with the analysis and design of statistical machine learning algorithms. Empirical applications include: identification of clinical predictors of deep venous thrombosis and pulmonary embolus after severe injury (Brakenridge et al., 2013), document clustering (Dasgupta, Golden, and Ng, 2012), predicting risk of multiple organ failures after severe injury (Brakenridge et al., 2011), evaluating the effects of duty hour limits on resident physician satisfaction (Kashner et al., 2010), automated detection of software bugs (Wong, Shi, Qi, and Golden, 2008), smart antenna blind adaptive CDMA processing (Paik, Golden, Tolak, and Dowling, 2006), smart antenna multiuser interference suppression (Jani, Dowling, and Golden, 2000), automated analog circuit design (Golden, 2000), predicting quality of life in patients with benign prostate hyperplasia or prostate cancer (Krongrad et al., 1997; Michaels et al., 1998), and automated aircraft landing (Schley et al., 1991). Currently, I am an Invited Consultant to the National Panel on Statistics and Analytics for Veterans Health Administration. Selected list of statistical machine learning applications and list of patents.

Kashner, T.M., Hinson, Holland, G.J., Mickey, D.D., Hoffman, K., Lind, L., Johnson, L.D., Chang, B.K., Golden, R.M., and Henley, S.S. (2007). A data accounting system for clinical investigators. Journal of the American Medical Informatics Association, 14: 394-396.

Golden, R. M. (2003). Discrepancy risk model selection test theory for comparing possibly misspecified or nonnested models. Psychometrika, 68: 229-249.

Golden, R. M. (1998). Knowledge digraph contribution analysis of protocol data. Discourse Processes, 25: 179-210.