Kernel-based conditional independence (KCI) test and independence test

The kernel-based conditional independence (KCI) test and independence test [1] check whether x and y are independent, either conditionally on Z or unconditionally. For unconditional independence tests, Z is set to the empty set.

Usage

from causallearn.utils.cit import CIT
kci_obj = CIT(data, "kci") # construct a CIT instance with data and method name
pValue = kci_obj(X, Y, S)

The above code runs KCI with the default parameters. To specify some of the KCI parameters yourself, pass them as keyword arguments, e.g.,

kci_obj = CIT(data, "kci", kernelZ='Polynomial', approx=False, est_width='median', ...)

See KCI.py for more details on the parameter options of the KCI tests.

Since release v0.1.2.8, the independence tests have been refactored from functions into classes. This brings a speed gain and more flexible parameter specification.

You may need to adjust your code accordingly. Specifically, if you are

  • running a constraint-based algorithm from end to end: you don’t need to change anything. Existing code remains compatible. For example,

from causallearn.search.ConstraintBased.PC import pc
from causallearn.utils.cit import kci
cg = pc(data, 0.05, kci)
  • explicitly calculating the p-value of a test: you need to declare the kci_obj and then call it as above, instead of using kci(data, X, Y, condition_set) as before. Note that causallearn.utils.cit.kci is now the string "kci", not a function.

Please see CIT.py for more details on the implementation of the (conditional) independence tests.

Parameters

data: numpy.ndarray, shape (n_samples, n_features). Data, where n_samples is the number of samples and n_features is the number of features.

method: string, “kci”.

kwargs:

  • Either for specifying parameters of KCI, including:

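    kernelX/Y/Z (Z is the condition set): the kernel to use, corresponding to one of [‘GaussianKernel’, ‘LinearKernel’, ‘PolynomialKernel’] and passed as, e.g., kernelZ='Polynomial' as in the usage example. For ‘PolynomialKernel’, the default degree is 2. Currently, users can change it by setting the ‘degree’ of ‘class PolynomialKernel()’.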

    est_width: how the kernel width is set for Gaussian kernels.

    • ‘empirical’: set kernel width using empirical rules (default).

    • ‘median’: set kernel width using the median trick.

    polyd: polynomial kernel degrees (default=2).

    kwidthx/y/z: kernel width for data x/y/z (standard deviation sigma).

    and more: see KCI.py for details.

  • Or for advanced usages of CIT, e.g., cache_path. See Advanced Usages.
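To make the kernel options above more concrete, here is a minimal, library-independent sketch in plain Python of a Gaussian kernel whose width is chosen by the median trick, together with a polynomial kernel of degree 2. The function names (median_width, gaussian_kernel, polynomial_kernel) and the constant term of the polynomial kernel are illustrative assumptions. The exact formulas and scaling used in KCI.py may differ.

```python
import math
from itertools import combinations
from statistics import median


def median_width(samples):
    """Median trick: use the median pairwise Euclidean distance as the width."""
    return median(math.dist(a, b) for a, b in combinations(samples, 2))


def gaussian_kernel(a, b, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)), sigma being the kernel width."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * sigma ** 2))


def polynomial_kernel(a, b, degree=2, const=1.0):
    """k(a, b) = (<a, b> + const)^degree; const = 1.0 is an assumption of this sketch."""
    inner = sum(u * v for u, v in zip(a, b))
    return (inner + const) ** degree


samples = [(0.0,), (1.0,), (3.0,)]
sigma = median_width(samples)  # pairwise distances are 1, 3 and 2, so the median is 2

print(sigma)
print(gaussian_kernel(samples[0], samples[1], sigma))
print(polynomial_kernel((1.0, 2.0), (3.0, 4.0)))
```

The median trick ties the kernel width to the typical distance between samples, so the Gaussian kernel is neither nearly constant nor nearly zero for most pairs of points.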

Returns

p: the p-value.