Inference Using Shape-Restricted Regression Splines

Regression splines are smooth, flexible, and parsimonious nonparametric function estimators. They are known to be sensitive to the number and placement of knots, but if assumptions such as monotonicity or convexity can be imposed on the regression function, shape-restricted regression splines are much more robust to knot choices. Monotone regression splines were introduced by Ramsay (1988). In this paper a more numerically efficient computational method is developed, and the method is extended to convex constraints.

TR Number: 2006-05
Mary C. Meyer

To request a copy of this report send an email to Richard Worthington and a pdf copy will be sent to you if available.

A simple derivation of the sampling distribution of the sample mean and the sample variance of a normal random sample

First we provide a simple derivation of the density of a chi-square random variable. Then we provide another simple proof of the statistical independence of the sample mean and the sample variance of a random sample from a normal distribution. The technique proving independence readily extends to the multivariate case.
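
The two facts named in this abstract can be checked numerically. The following is a minimal simulation sketch, not the report's derivation: the function name, sample size, replication count, and seed are illustrative assumptions. It checks that the scaled sample variance has the moments of a chi-square variable with n - 1 degrees of freedom and that the sample mean and sample variance are uncorrelated. A near-zero correlation is only consistent with independence; it does not prove it.

```python
import numpy as np

def simulate_mean_and_variance(n=10, reps=20000, mu=0.0, sigma=1.0, seed=0):
    """Draw `reps` normal samples of size n; return sample means and variances."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(mu, sigma, size=(reps, n))
    means = samples.mean(axis=1)
    variances = samples.var(axis=1, ddof=1)  # unbiased sample variance S^2
    return means, variances

# Illustrative settings (assumptions, not taken from the report).
n, sigma = 10, 1.0
means, variances = simulate_mean_and_variance(n=n, sigma=sigma)

# Under normality, (n - 1) * S^2 / sigma^2 is chi-square with n - 1 degrees of
# freedom, so its mean should be near n - 1 and its variance near 2 * (n - 1).
scaled = (n - 1) * variances / sigma**2

# Independence implies zero correlation between the mean and the variance.
corr = np.corrcoef(means, variances)[0, 1]

print("mean of scaled variance:", scaled.mean())
print("variance of scaled variance:", scaled.var())
print("correlation of mean and variance:", corr)
```

Repeating the run with a skewed distribution (for example, exponential draws) in place of the normal draws typically yields a clearly nonzero correlation, which illustrates why normality matters for the independence result.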

TR Number: 2007-02
Gauri Sankar Datta
Key Words: Chi-square distribution; Moment generating function; Wishart distribution.

Software for Implementing the Sequential Elimination of Level Combinations Algorithm

Genetic algorithms (GAs) are a popular technique for searching for an optimum in a large search space. Using the new concepts of a forbidden array and weighted mutation, Mandal, Wu and Johnson (2006) used elements of GAs to introduce a new global optimization technique, called sequential elimination of level combinations (SELC), that efficiently finds optima. A SAS macro and Matlab and R functions are developed to implement the SELC algorithm.

TR Number: 2007-03
Tan Ding, Abhyuday Mandal and Kjell Johnson

Quantile Estimation for Discrete Data via Empirical Likelihood

Quantile estimation for discrete distributions has not been well studied, although discrete data are common in practice. Under the assumption that data are drawn from a discrete distribution, we examine the consistency of the maximum empirical likelihood estimator (MELE) of the pth population quantile µp, with the assistance of a jittering method and results for continuous distributions. The MELE and the sample quantile estimator are closely related, and they may or may not be consistent for µp, depending on whether or not the underlying distribution has a plateau at the level of p.
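
As a rough illustration of the jittering idea mentioned in this abstract, the sketch below adds independent uniform noise to integer-valued data so that ties are broken, then takes an ordinary sample quantile of the jittered values. This is an assumption-laden sketch: the Uniform(0, 1) noise, the function name `jittered_quantile`, and the simulated Poisson data are illustrative choices and may differ from the scheme used in the report. It does not implement the empirical likelihood estimator.

```python
import numpy as np

def jittered_quantile(data, p, seed=0):
    """Sample p-th quantile after adding Uniform[0, 1) noise to each observation.

    The noise distribution is an illustrative assumption, not the report's.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    jittered = data + rng.uniform(0.0, 1.0, size=data.shape)
    return float(np.quantile(jittered, p))

# Simulated count data (hypothetical; stands in for real discrete observations).
rng = np.random.default_rng(1)
counts = rng.poisson(lam=3.0, size=500)

raw_median = float(np.quantile(counts, 0.5))    # quantile of the tied, discrete data
smooth_median = jittered_quantile(counts, 0.5)  # quantile of the jittered data

print("sample median of raw counts:", raw_median)
print("sample median after jittering:", smooth_median)
```

Because the jittered values have no ties, results for continuous distributions can be applied to them, which is the role the abstract assigns to jittering.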

TR Number: 2007-04
Jien Chen and Nicole Lazar
Key Words: Discrete distributions; Jittering; Bootstrap.

G-SELC: Optimization by Sequential Elimination of Level Combinations using Genetic Algorithms and Gaussian Processes

Identifying promising compounds from a vast collection of feasible compounds is an important yet challenging problem in the pharmaceutical industry. An efficient solution to this problem will help reduce expenditure at the early stages of drug discovery. In an attempt to solve this problem, Mandal, Wu and Johnson (2006) proposed the SELC algorithm, which was motivated by the SEL algorithm of Wu, Mao and Ma (1990). However, SELC fails to extract substantial information from the data to guide the search efficiently, because the methodology is not based on any statistical modeling of the data.

TR Number: 2007-05
Abhyuday Mandal, Pritam Ranjan, and C.F. Jeff Wu

Multi-objective Optimal Experimental Designs for Event-Related fMRI Studies

The nature of fMRI studies impels the need for multi-objective designs that simultaneously accomplish statistical goals, circumvent psychological confounds, and fulfill customized requirements. Incorporating knowledge about fMRI designs, we propose an efficient algorithm to search for optimal multi-objective designs. This algorithm significantly outperforms previous search algorithms in terms of achieved efficiency, computation time and convergence rate. Furthermore, our design criterion allows fair, consistent design comparisons.

TR Number: 2007-06
Ming-Hung Kao, Abhyuday Mandal, Nicole Lazar, and John Stufken
Key Words: compound design criterion; counterbalancing; design efficiency; discretization interval; fMRI designs; genetic algorithms; normalization.

Robust estimation of the Hurst parameter and selection of an onset scaling

We consider the problem of estimating the Hurst parameter for long-range dependent processes using wavelets. Wavelet techniques have been shown to effectively exploit the asymptotic linear relationship that forms the basis for constructing an estimator. However, it has been noticed that the commonly adopted standard wavelet estimator is vulnerable to various non-stationary phenomena that increasingly occur in practice, and thus leads to unreliable results.

TR Number: 2007-07
Juhyun Park and Cheolwoo Park
Key Words: Hurst parameter, Long-range dependence, Non-stationarities, Robustness, Wavelet spectrum.

Do Different Parts of the Brain Have the Same Dependence Structure? A Multiscale Analysis of the Temporal and Spatial Characteristics of Resting fMRI Data

In this paper, we conduct an investigation of the null hypothesis distribution for functional magnetic resonance imaging (fMRI) data using multiscale analysis. Most current approaches to the analysis of fMRI data assume temporal independence, or, at best, simple models for temporal (short term or long term) dependence structure. The spatial structure of fMRI data is commonly assumed to be independent or weakly spatially dependent.

TR Number: 2007-08
Cheolwoo Park, Nicole Lazar, Jeongyoun Ahn, and Andrew Sornborger
Key Words: fMRI, Hurst parameter, Long-range dependence, Principal component analysis, SiZer, Wavelet spectrum.

Analysis of Correlations Among Size, Rate, and Duration in Internet Flows

In this paper we examine rigorously the evidence for correlations among data size, transfer rate, and duration in Internet flows. We emphasize various statistical approaches for studying correlations, including computing Pearson's correlation coefficient and using the extremal dependence analysis (EDA) method. We apply these methods to three large data sets of packet traces from a diverse set of networks. Our major results show that correlation between size and duration is much weaker than one might expect.
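
For readers unfamiliar with the first of the two approaches named above, the sketch below computes Pearson's correlation coefficient among log size, log duration, and log rate on purely synthetic data. Everything here is an assumption for illustration: the data are simulated rather than taken from packet traces, rate is assumed to be defined as size divided by duration, and size and duration are generated independently by construction. The numbers therefore say nothing about the report's findings, and the extremal dependence analysis method is not shown.

```python
import numpy as np

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))

# Synthetic flows (hypothetical): log size and log duration drawn independently.
rng = np.random.default_rng(0)
log_size = rng.normal(10.0, 2.0, size=5000)
log_duration = rng.normal(1.0, 1.5, size=5000)

# Assumed definition: rate = size / duration, so the logs subtract.
log_rate = log_size - log_duration

r_size_duration = pearson(log_size, log_duration)
r_size_rate = pearson(log_size, log_rate)
r_duration_rate = pearson(log_duration, log_rate)

print("size vs duration:", r_size_duration)
print("size vs rate:", r_size_rate)
print("duration vs rate:", r_duration_rate)
```

Under the assumed definition, rate is a function of the other two quantities, so it is correlated with both size and duration even when those two are independent. This is one reason the three pairwise correlations need to be interpreted together rather than one at a time.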

TR Number: 2007-09
Cheolwoo Park, J. S. Marron, Felix Hernández-Campos, Kevin Jeffay, and F. Donelson Smith
Key Words: network performance, threshold methods, extremal dependence analysis
