Yin and Cook (2002) recently introduced a new dimension reduction method for regression called Covk. Here we develop the asymptotic distribution of the Covk test statistic under weak assumptions. This serves as an analytic counterpart to the permutation test suggested by Yin and Cook (2002).
In the context of discrete data, a sequential fixed width confidence interval for an unknown parameter is constructed using a minimum Hellinger distance estimator as the center of the interval. It is shown that the sequential procedure is asymptotically consistent and efficient. These results, in addition to being exactly the same as those obtained by Yu (1989) using a maximum likelihood estimator, offer an alternative which has several in-built robustness properties.
In instances where it is necessary to manually count a large number N8 of items, significant time and energy could be saved by weighing the items instead. When the weights are random with unknown mean and variance, one procedure is to take a small sample of n items, weigh them, and add more items until the total weight reaches N8 times the average weight of the n items. This procedure yields a batch of Nn items.
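The weighing procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_by_weighing`, its parameters, and the gamma weight distribution in the demo are hypothetical and do not come from the paper. The sketch assumes the n pilot items stay in the batch, so the batch size is never smaller than n.

```python
import random


def count_by_weighing(weights, target_count, pilot_size):
    """Return the batch size produced by the weighing procedure.

    weights      -- item weights in the order the items are added (hypothetical input)
    target_count -- the number of items we would like the batch to contain
    pilot_size   -- n, the number of items weighed first to estimate the mean weight
    """
    if not 0 < pilot_size <= len(weights):
        raise ValueError("pilot_size must be between 1 and len(weights)")

    # Step 1: weigh the pilot sample and compute its average weight.
    total = float(sum(weights[:pilot_size]))
    pilot_mean = total / pilot_size

    # Step 2: the stopping weight is target_count times the pilot average.
    target_weight = target_count * pilot_mean

    # Step 3: keep adding items until the total weight reaches the target.
    count = pilot_size
    while total < target_weight:
        if count == len(weights):
            raise ValueError("ran out of items before reaching the target weight")
        total += weights[count]
        count += 1
    return count


if __name__ == "__main__":
    # Demo with random weights (assumed gamma distributed, mean 10).
    rng = random.Random(0)
    items = [rng.gammavariate(25.0, 0.4) for _ in range(5000)]
    print(count_by_weighing(items, target_count=1000, pilot_size=20))
```

Because the pilot average is itself random, the resulting batch size varies around the target count; rerunning the demo with different seeds or pilot sizes gives a rough feel for that variability.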
Unbiased tests for the constant versus monotone regression function, as well as the linear versus convex regression function, are known to have null distributions equal to those of mixtures of beta random variables. Both monotone and convex regression estimators exhibit "spiking" at the endpoints of the data range, where the estimator is inconsistent. Consistent estimators for both shape-restricted alternatives are proposed, for which the test statistic using the consistent estimator has again the form of a mixture of beta densities.
An unbiased test for the appropriateness of the simple linear regression model is presented. The null hypothesis is that the underlying regression function is indeed a line, and the alternative is that it is convex. The exact distribution for a likelihood ratio test statistic is that of a mixture of beta random variables, with the mixing distribution calculated from relative volumes of polyhedral convex cones determined by the convex shape restriction.
Motivation: With the advent of microarray chip technology, large data sets are emerging containing the simultaneous expression levels of thousands of genes at various time points during a biological process. Biologists are attempting to group genes based on the temporal pattern of their expression levels. While hierarchical clustering (UPGMA) with correlation "distance" has been the most common choice in microarray studies, there are many more choices of clustering algorithms in the pattern recognition and statistics literature.
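For readers unfamiliar with the baseline method named above, the following is a minimal pure-Python sketch of average-linkage (UPGMA) clustering with correlation "distance", taken here to be one minus the Pearson correlation. The function names and the toy expression profiles are illustrative assumptions, not material from the paper, and the naive implementation is meant for small examples only.

```python
from math import sqrt


def correlation_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / sqrt(sxx * syy)


def upgma(profiles, n_clusters):
    """Merge clusters by average linkage until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    m = len(profiles)
    # Pairwise distances between individual profiles, keyed by (i, j) with i < j.
    d = {(i, j): correlation_distance(profiles[i], profiles[j])
         for i in range(m) for j in range(i + 1, m)}

    def average_distance(a, b):
        # UPGMA: unweighted mean of all member-to-member distances.
        return sum(d[min(i, j), max(i, j)] for i in a for j in b) / (len(a) * len(b))

    clusters = [[i] for i in range(m)]
    while len(clusters) > n_clusters:
        # Find the closest pair of current clusters.
        _, p, q = min((average_distance(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return sorted(clusters)


# Toy temporal profiles (made up): three rising genes and two falling genes.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [0.0, 1.0, 2.0, 3.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 2.5],
]
toy_clusters = upgma(toy_profiles, n_clusters=2)
```

Correlation distance ignores the scale of each profile, so genes whose expression rises together are grouped even when their absolute levels differ, which is why the first two toy profiles are treated as essentially identical.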
Knowledge of the number of causative loci is necessary to estimate the power of mapping studies of complex diseases. In this paper we re-examine theory developed by Risch (1990a) and its implications for estimating the number L of causative loci affecting a complex inherited disease. We first show that methods based on Risch's analysis can produce estimates of L that are inconsistent with the observed population prevalence of the disease.
With recent advances in molecular genetics, it is likely that releases of genetically modified organisms will be used for a variety of purposes. In many cases, such systems would utilize organisms that have been modified on multiple genetic loci. Predicting the effect of such releases will require an understanding of the transient dynamics in the system. However, theoretical understanding of transient dynamics in multilocus systems is limited, particularly for early generations when gametic disequilibrium is still high.
A time series model combining a first-order periodic autoregressive structure with classical Box-Jenkins seasonality is introduced. Periodic stationarity conditions for the model are established and its autocovariance function is derived. The limit distribution of least squares estimates of the model parameters is obtained.
Nonlinear mixed-effects models have become important tools for growth and yield modeling in forestry. To date, applications have concentrated on modeling single growth variables such as tree height or bole volume. Here, we propose multivariate multilevel nonlinear mixed-effects models for describing several plot-level timber quantity characteristics simultaneously. We describe how such models can be used to produce future predictions of timber volume (yield).