In this paper we rigorously examine the evidence for correlations among data size, transfer rate, and duration in Internet flows. We emphasize various statistical approaches for studying correlations, including computing Pearson's correlation coefficient and using the extremal dependence analysis (EDA) method. We apply these methods to three large data sets of packet traces from a diverse set of networks. Our major results show that the correlation between size and duration is much weaker than one might expect.
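As a minimal illustration of the first of these approaches, the sketch below computes Pearson's correlation coefficient for simulated flows. The data are synthetic and are not drawn from the packet traces studied here. The definition rate = size / duration, the use of log-transformed values, the normal distributions, and all function and variable names are assumptions made for illustration only.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic flows (assumption): log size and log duration drawn independently.
rng = random.Random(0)
log_size = [rng.gauss(10.0, 2.0) for _ in range(5000)]
log_duration = [rng.gauss(1.0, 1.5) for _ in range(5000)]

# Assumed definition: rate = size / duration, so log rate = log size - log duration.
log_rate = [s - d for s, d in zip(log_size, log_duration)]

r_size_duration = pearson(log_size, log_duration)
r_size_rate = pearson(log_size, log_rate)
r_duration_rate = pearson(log_duration, log_rate)
```

Because rate is derived from the other two quantities, it is correlated with both by construction even when size and duration are independent, which is one reason the three pairwise correlations have to be interpreted together.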
In this article we propose a data-driven method, generalized adaptive ridge (gar), for automatic yet adaptive regression shrinkage and selection. We show that, in theory, gar can be equivalent to adaptive lasso, adaptive ridge regression, and adaptive elastic net under appropriate conditions. Specifically, if the regression parameters truly enjoy a sparse representation, gar performs like the recently proposed adaptive lasso (Zou, 2006) and is hence able to identify relevant predictors consistently.
We consider a general adaptive L2-regularized optimization problem $\hat{\beta}(\lambda) = \arg\min_{\beta, \Lambda} \ell(y; \beta) + \lambda \beta^{T} \Lambda W \beta$, where $\ell$ is a loss function and $\Lambda$ and $W$ are two diagonal matrices. We show that, with an appropriate choice of $\lambda$ and $\Lambda$, if $\ell$ is differentiable, then the above adaptive L2 penalty term is equivalent to an adaptive L1 penalty, an adaptive L2 penalty, or a combined adaptive L1 and L2 penalty. The method is therefore data-driven: it automatically chooses a penalty among the three penalty terms.
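To give some intuition for how a weighted L2 penalty can reproduce an L1 penalty, the following is one standard variational identity. It is offered as a sketch of the general mechanism, not as the construction used in this work; the auxiliary weights $\eta_j$ are notation introduced here for illustration.

```latex
% For any real \beta_j and any \eta_j > 0, the AM-GM inequality gives
%   \beta_j^2 / \eta_j + \eta_j \ge 2 |\beta_j|,
% with equality at \eta_j = |\beta_j| (for \beta_j \ne 0). Hence
\[
  |\beta_j| \;=\; \min_{\eta_j > 0} \; \frac{1}{2}
  \left( \frac{\beta_j^{2}}{\eta_j} + \eta_j \right),
\]
% and summing over coordinates,
\[
  \sum_{j} |\beta_j| \;=\; \min_{\eta > 0} \; \frac{1}{2}
  \sum_{j} \left( \frac{\beta_j^{2}}{\eta_j} + \eta_j \right).
\]
% The first term on the right is a quadratic form \beta^{T} D \beta with
% diagonal D = \mathrm{diag}(1/\eta_j), that is, a weighted L2 penalty whose
% weights are optimized jointly with \beta.
```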
A model for cell adhesion mediated by dimeric binding was developed in a companion paper. To test the model, we used a micropipette adhesion frequency assay to measure dimeric E-selectin-Ig or monomeric soluble E-selectin (sE-selectin) interacting with dimeric P-selectin glycoprotein ligand 1 (PSGL-1) or monomeric soluble PSGL-1 (sPSGL-1). Higher adhesions were mediated by E-selectin-Ig interacting with PSGL-1 than sPSGL-1, whereas similar adhesions were mediated by sE-selectin interacting with PSGL-1 or sPSGL-1.
Functional magnetic resonance imaging (fMRI) is an important tool for scientists studying brain function. fMRI data are complex in nature: they are massive in size, and their low signal-to-noise ratio makes it desirable to eliminate some noise prior to model fitting so that true brain activity can be identified more reliably. We propose two methods of reducing this noise: generalized indicator functional analysis and a hidden Markov model.
The models used in small-area inference often involve unobservable random effects. While this can significantly improve the adaptivity and flexibility of a model, it also increases the variability of both point and interval estimators. If we could test for the existence of the random effects, and if the test were to show that they were unlikely to be present, then we would arguably not need to incorporate them into the model, and thus could significantly improve the precision of the methodology. In this paper we suggest an approach of this type.
This paper concerns wavelet regression using a block thresholding procedure. Block thresholding methods use information from neighboring wavelet coefficients to increase estimation accuracy. We propose to construct a data-driven block thresholding procedure using the Smoothly Clipped Absolute Deviation (SCAD) penalty. A simulation study demonstrates competitive finite-sample performance of the proposed estimator compared to existing methods. We also show that the proposed estimator achieves optimal convergence rates in Besov spaces.
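The sketch below illustrates the two ingredients named above. `scad_threshold` is the standard scalar SCAD thresholding rule with the customary default a = 3.7. `scad_block_shrink` is a hypothetical way of applying it to a block, by shrinking the block's Euclidean norm; this block rule, the fixed threshold `lam`, and the function names are assumptions for illustration and are not the data-driven procedure proposed here.

```python
import math


def scad_threshold(z, lam, a=3.7):
    """Scalar SCAD thresholding rule (requires a > 2 and lam >= 0)."""
    az = abs(z)
    if az <= 2.0 * lam:
        # Soft thresholding near zero.
        return math.copysign(max(az - lam, 0.0), z)
    if az <= a * lam:
        # Linear interpolation between soft thresholding and the identity.
        return ((a - 1.0) * z - math.copysign(a * lam, z)) / (a - 2.0)
    # Large coefficients are left unshrunk.
    return z


def scad_block_shrink(block, lam, a=3.7):
    """Assumed block rule: rescale the block so its norm is SCAD-thresholded."""
    norm = math.sqrt(sum(c * c for c in block))
    if norm == 0.0:
        return [0.0 for _ in block]
    factor = scad_threshold(norm, lam, a) / norm
    return [factor * c for c in block]
```

Under this assumed rule, a block with small overall energy is set to zero as a whole, while a block with large energy is kept unchanged, which is the sense in which neighboring coefficients share information.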
In high-dimensional regression problems, regularization methods have been a popular choice for addressing variable selection and multicollinearity. In this paper we study bridge regression that adaptively selects the penalty order from data and produces flexible solutions in various settings. We implement bridge regression based on the local linear and quadratic approximations to circumvent the nonconvex optimization problem.
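To make the local quadratic approximation concrete, the sketch below treats the simplest possible case: a single coefficient b with objective (z - b)^2 + lam * |b|^q and a fixed penalty order q in (0, 2]. Each iteration replaces |b|^q by a quadratic in b around the current value, which turns the update into a ridge-type step. The scalar setting, the fixed q, the small constant `eps`, and the function name are assumptions for illustration; the adaptive selection of q studied here is not shown.

```python
def bridge_lqa_1d(z, lam, q, n_iter=200, eps=1e-8):
    """Minimise (z - b)**2 + lam * |b|**q over scalar b by iterated
    local quadratic approximation, for a fixed 0 < q <= 2."""
    b = z  # start from the unpenalised solution
    for _ in range(n_iter):
        # Around the current b, |b|**q is approximated (up to a constant) by
        # (q / 2) * |b_current|**(q - 2) * b**2, giving a ridge weight w.
        # eps guards against division by zero when b shrinks towards 0.
        w = 0.5 * lam * q * (abs(b) + eps) ** (q - 2)
        # Minimiser of (z - b)**2 + w * b**2.
        b = z / (1.0 + w)
    return b
```

For q = 2 the weight does not depend on b and the iteration reduces to ordinary ridge shrinkage, z / (1 + lam). For q = 1 the iterates approach the soft-thresholding solution sign(z) * max(|z| - lam / 2, 0), although coefficients that should be exactly zero are only driven to a value of order `eps`.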
We consider the problem of obtaining locally D-optimal designs for factorial experiments with qualitative factors at two levels each and a binary response. Our focus is primarily on the $2^2$ experiment. In this paper, we derive analytic results for special cases and indicate how to handle the general case. The performance of the uniform design is examined, and we show that this design is highly efficient in general. For the general $2^k$ case we show that the uniform design has a maximin property.
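The D-criterion for a $2^2$ experiment can be evaluated directly, as in the sketch below. It assumes a main-effects model with a logit link and factor levels coded as -1 and +1; the link, the coding, the parameter values, and all names are assumptions for illustration. For an allocation p over the four design points and assumed parameters beta, the criterion is the determinant of the Fisher information matrix X' W X, where W is diagonal with entries p_i * pi_i * (1 - pi_i).

```python
import math

# Design points of the 2^2 factorial with an intercept column (coding assumed).
X = [
    (1.0, 1.0, 1.0),
    (1.0, 1.0, -1.0),
    (1.0, -1.0, 1.0),
    (1.0, -1.0, -1.0),
]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def d_criterion(p, beta):
    """det(X' W X) for allocation p under an assumed main-effects logit model."""
    info = [[0.0] * 3 for _ in range(3)]
    for row, p_i in zip(X, p):
        eta = sum(b * x for b, x in zip(beta, row))
        prob = 1.0 / (1.0 + math.exp(-eta))
        w = p_i * prob * (1.0 - prob)  # design weight times logistic variance
        for r in range(3):
            for s in range(3):
                info[r][s] += w * row[r] * row[s]
    return det3(info)


uniform = (0.25, 0.25, 0.25, 0.25)
value_at_zero = d_criterion(uniform, (0.0, 0.0, 0.0))
```

A locally D-optimal design maximises `d_criterion` over allocations p for the assumed beta, and the D-efficiency of any other design, such as the uniform one, is the ratio of its criterion value to the maximum, raised to the power 1/3 for this three-parameter model.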
We consider an experiment with two qualitative factors at two levels each and a binary response that follows a generalized linear model. In Mandal, Yang and Majumdar (2010) we obtained basic results and characterizations of locally D-optimal designs for special cases. As locally optimal designs depend on the assumed parameter values, a critical issue is the sensitivity of the design to misspecification of these values. In this paper we study the sensitivity theoretically and by simulation, and show that the optimal designs are quite robust.