Statistical Analysis of Noise Multiplied Data Using Multiple Imputation

US Census Bureau

Thursday, September 13, 2012 - 3:30pm

Statistical analysis of data that have been multiplied by randomly drawn noise variables to protect the confidentiality of individual values has recently drawn some attention (Nayak, Sinha, and Zayatz, 2011; Sinha, Nayak, and Zayatz, 2012). If the distribution generating the noise variables has low to moderate variance, then noise-multiplied data have been shown to yield accurate inferences in several typical parametric models under a formal likelihood-based analysis (Klein, Mathew, and Sinha, 2012). However, the likelihood-based analysis is generally complicated because the distribution of the noise-perturbed sample is non-standard and often complex, even when the parent distribution is simple. This complexity places a burden on data users, who must either develop the required statistical methods, implement them if they are already available, or gain access to specialized software that may not yet exist. In this paper we propose an alternative analysis of noise-multiplied data based on multiple imputation. Some advantages of this approach are that (1) the data user can analyze the released data as if they had never been perturbed, and (2) the distribution of the noise variables does not need to be disclosed to the data user.
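
As a rough illustration of the two ideas in the abstract, the sketch below first perturbs a sample by multiplicative noise and then pools estimates from several imputed data sets. It is only a sketch under stated assumptions: the uniform noise on [0.9, 1.1], the function names noise_multiply and combine_rubin, and the use of Rubin's standard combining rules are all illustrative choices, not details taken from the talk, and the imputation step itself is omitted.

```python
import random
import statistics


def noise_multiply(values, low=0.9, high=1.1, seed=0):
    """Release y_i = x_i * r_i, with r_i drawn independently of x_i.

    Assumption: r_i ~ Uniform(low, high); the abstract does not name a
    specific noise distribution.
    """
    rng = random.Random(seed)
    return [x * rng.uniform(low, high) for x in values]


def combine_rubin(estimates, variances):
    """Pool m completed-data analyses with Rubin's combining rules.

    estimates: point estimate from each imputed data set
    variances: estimated variance of each point estimate
    Returns the pooled estimate and its total variance.
    Assumption: the talk may use different combining rules.
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)   # pooled point estimate
    u_bar = statistics.fmean(variances)   # within-imputation variance
    b = statistics.variance(estimates)    # between-imputation variance
    total_var = u_bar + (1 + 1 / m) * b
    return q_bar, total_var


if __name__ == "__main__":
    original = [12.0, 48.5, 30.2, 75.1]    # made-up confidential values
    released = noise_multiply(original)
    print(released)
    # Made-up estimates from m = 3 imputed data sets.
    print(combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

In this setup the data user would only ever call something like combine_rubin on the results of ordinary analyses of each imputed data set, which is why the noise distribution need not be disclosed.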

306 Statistics Building