% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dUtility.R
\name{IL_correl}
\alias{IL_correl}
\alias{print.il_correl}
\alias{IL_variables}
\alias{print.il_variables}
\title{Additional Information-Loss measures}
\usage{
IL_correl(x, xm)

\method{print}{il_correl}(x, digits = 3, ...)

IL_variables(x, xm)

\method{print}{il_variables}(x, digits = 3, ...)
}
\arguments{
\item{x}{an object coercible to a \code{data.frame} representing the original dataset}

\item{xm}{an object coercible to a \code{data.frame} representing the perturbed, modified dataset}

\item{digits}{number of digits used for rounding when displaying results}

\item{...}{additional parameters for print methods; currently ignored}
}
\value{
the corresponding information-loss measure
}
\description{
Measures \code{\link[=IL_correl]{IL_correl()}} and \code{\link[=IL_variables]{IL_variables()}} were proposed by Andrzej Mlodak and are (theoretically) bounded between \code{0} and \code{1}.
}
\details{
\itemize{
\item \code{IL_correl()}: an information-loss measure that can be applied to the numerically scaled variables common to \code{x} and \code{xm}. It is based
on the diagonal entries of the inverse correlation matrices of the original and perturbed data.
\item \code{IL_variables()}: for the variables common to \code{x} and \code{xm}, the individual distance functions depend on the class of the variable;
specifically, these functions differ for numeric variables, ordered factors and character/factor variables. The individual distances
are summed and scaled by \code{n * m}, with \code{n} being the number of records and \code{m} the number of (common) variables.
}

Details can be found in the references below.

The implementation of \code{\link[=IL_correl]{IL_correl()}} differs slightly from the original proposal in Mlodak, A. (2020):
the constant multiplier was changed from \code{1/2} to \code{1 / sqrt(2)} for better efficiency and interpretability
of the measure.
}
\examples{
data("Tarragona", package = "sdcMicro")
res1 <- addNoise(obj = Tarragona, variables = colnames(Tarragona), noise = 100)
IL_correl(x = as.data.frame(res1$x), xm = as.data.frame(res1$xm))

res2 <- addNoise(obj = Tarragona, variables = colnames(Tarragona), noise = 25)
IL_correl(x = as.data.frame(res2$x), xm = as.data.frame(res2$xm))
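
# Illustrative sketch only (not the package's internal code): IL_correl() is
# described as being based on the diagonal entries of the inverse correlation
# matrices. Assuming all columns are numeric and both correlation matrices are
# invertible, these entries can be inspected with base R. The names x_num,
# xm_num, d_orig and d_pert are introduced here for illustration.
x_num <- as.data.frame(res2$x)
xm_num <- as.data.frame(res2$xm)
d_orig <- diag(solve(cor(x_num)))   # original data
d_pert <- diag(solve(cor(xm_num)))  # perturbed data
# side-by-side view of the two diagonals; how they are aggregated into the
# final measure is described in Mlodak (2020)
cbind(original = d_orig, perturbed = d_pert)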

# creating test-inputs
n <- 150
x <- xm <- data.frame(
  v1 = factor(sample(letters[1:5], n, replace = TRUE), levels = letters[1:5]),
  v2 = rnorm(n),
  v3 = runif(n),
  v4 = ordered(sample(LETTERS[1:3], n, replace = TRUE), levels = c("A", "B", "C"))
)
xm$v1[1:5] <- "a"
xm$v2 <- rnorm(n, mean = 5)
xm$v4[1:5] <- "A"
IL_variables(x, xm)
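
# Rough sketch of the "sum and scale by n * m" step only, under assumptions:
# the 0/1 mismatch distance below (d_mismatch) is a hypothetical stand-in and
# not the class-specific distance functions used by IL_variables(), which are
# defined in the references. It is applied to the factor variables only, so the
# result will generally differ from the output of IL_variables().
d_mismatch <- function(a, b) as.numeric(as.character(a) != as.character(b))
common <- intersect(names(x), names(xm))
cat_vars <- common[sapply(x[common], is.factor)]  # v1 and v4 in this example
dists <- sapply(cat_vars, function(v) d_mismatch(x[[v]], xm[[v]]))
# n = number of records, m = number of variables considered
sum(dists) / (nrow(x) * length(cat_vars))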
}
\references{
Mlodak, A. (2020). Information loss resulting from statistical disclosure control of output data,
Wiadomosci Statystyczne. The Polish Statistician, 2020, 65(9), 7-27, DOI: 10.5604/01.3001.0014.4121

Mlodak, A. (2019). Using the Complex Measure in an Assessment of the Information Loss Due to the Microdata Disclosure Control,
Przegląd Statystyczny, 2019, 66(1), 7-26,
DOI: 10.5604/01.3001.0013.8285
}
\author{
Bernhard Meindl \href{mailto:bernhard.meindl@statistik.gv.at}{bernhard.meindl@statistik.gv.at}
}
