In this talk, I discuss two current projects that are loosely connected under the umbrella of regression.
The first part of the talk investigates informative missingness in the framework of recommender systems. For example, from 2006 to 2009, Netflix ran a $1M prize competition to improve its algorithm for recommending movies to its viewers. In this setting, we can imagine a potential rating for every object-user pair; for Netflix, the object is a movie. However, the vast majority of these ratings are missing. The goal of a recommender system is to predict the missing ratings in order to recommend objects that the user is likely to rate highly. A typically overlooked point is that the ratings are not missing at random: a relationship between users' ratings and their viewing histories is expected, since users naturally seek out and watch movies that they anticipate enjoying. We model this informative missingness, placing the recommender system in a shared-variable regression framework, and show that taking this additional information into account can improve prediction quality.
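To make the selection-bias intuition concrete, here is a minimal simulation (a hypothetical illustration, not the talk's actual model): each user-movie pair has a latent rating, and the probability that the rating is observed increases with the rating itself, mimicking users seeking out movies they expect to enjoy. All numbers (matrix size, logistic link, its offset) are assumptions for illustration only.

```python
import numpy as np

# Hypothetical illustration of informative missingness in a ratings matrix.
rng = np.random.default_rng(0)
n_users, n_movies = 500, 200

# Latent ("potential") ratings for every user-movie pair.
true_ratings = rng.normal(loc=3.0, scale=1.0, size=(n_users, n_movies))

# Observation probability grows with the latent rating (logistic link):
# pairs the user expects to enjoy are more likely to be rated at all.
obs_prob = 1.0 / (1.0 + np.exp(-(true_ratings - 4.0)))
observed = rng.random((n_users, n_movies)) < obs_prob

mean_all = true_ratings.mean()
mean_observed = true_ratings[observed].mean()

print(f"mean of all ratings:      {mean_all:.2f}")
print(f"mean of observed ratings: {mean_observed:.2f}")
# The observed ratings are biased upward: a method that ignores the
# missingness mechanism would overestimate how much users like a typical
# movie, which is the kind of bias informative-missingness modeling targets.
```

Under this mechanism the observed mean is noticeably higher than the mean over all pairs, so any predictor trained naively on observed entries inherits that bias.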
The second part of the talk concerns a new class of prior distributions for linear regression, particularly in the high-dimensional case, where the choice of prior distribution is notoriously difficult. Instead of placing a prior on the coefficients themselves, we place a prior on the regression R-squared, which is then distributed to the regression coefficients by decomposing it via a Dirichlet distribution. It is more natural to elicit a prior from a scientist through knowledge of R-squared values in previous studies than through knowledge of each individual regression coefficient. We call the new prior R2-D2, in light of its R-squared Dirichlet Decomposition. Beyond its use in prior elicitation, we show that the R2-D2 prior can outperform existing shrinkage priors in the high-dimensional case, both in theory and in practice. In particular, compared to state-of-the-art shrinkage priors, it can simultaneously achieve both higher prior concentration at zero and heavier tails. These two properties combine to provide stronger shrinkage of the irrelevant coefficients along with less bias in estimating the larger signals.
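The construction described above can be sketched as a generative recipe: draw R-squared from a Beta distribution, convert it to a total signal-to-noise ratio, split that total across the coefficients with a Dirichlet draw, and give each coefficient a variance proportional to its share. The sketch below uses a Gaussian simplification for the coefficient draws (the full prior uses a scale mixture), and all hyperparameter values are assumptions for illustration.

```python
import numpy as np

# A minimal sketch of sampling from an R2-D2-style prior construction.
rng = np.random.default_rng(1)
p = 100           # number of predictors (hypothetical)
sigma2 = 1.0      # error variance (assumed known here for simplicity)
a, b = 1.0, 1.0   # Beta hyperparameters on R-squared (assumed values)
a_pi = 0.5        # Dirichlet concentration; small values favor sparsity

r2 = rng.beta(a, b)                    # 1. prior draw of R-squared
w = r2 / (1.0 - r2)                    # 2. total signal-to-noise ratio
phi = rng.dirichlet(np.full(p, a_pi))  # 3. split w among the p coefficients
beta = rng.normal(0.0, np.sqrt(sigma2 * w * phi))  # 4. coefficient draws
                                       #    (Gaussian simplification)

# Consistency check: with standardized predictors, the total prior signal
# variance sigma2 * w implies exactly the R-squared that was drawn.
signal_var = sigma2 * w * phi
implied_r2 = signal_var.sum() / (signal_var.sum() + sigma2)
print(f"drawn R-squared:   {r2:.3f}")
print(f"implied R-squared: {implied_r2:.3f}")
```

Because the Dirichlet weights sum to one, the decomposition preserves the drawn R-squared by construction, which is what makes elicitation in terms of R-squared coherent.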