common Package

This package provides common functionality that is useful for different types of estimators. It includes test problems, a kernel density estimation class, and other utility functions.

kde Module

class sde_calibration.common.kde.KDE(h, X, Y, kernel_type='gauss')

Bases: object

Class for estimating probability densities using kernel density estimation.

  • h (np.ndarray) – Kernel bandwidth.

  • X (np.ndarray) – Training data (NxD).

  • Y (np.ndarray) – Training labels (Nx1). They are used for regression.

  • kernel_type (str, optional) –

    String describing the kernel function that should be used. Currently implemented options are:

    • gauss (Gaussian kernel)

    • parzen (Parzen window)

      |default| 'gauss'

Return type



ValueError – if the inputs X and Y have a different number of entries.

cross_validation_error(p, h=None)

Computes the leave-one-out cross-validation least-squares metric for local polynomial regression.

  • p (int) – Order of the local polynomial estimator.

  • h (np.ndarray, optional) –

    Array of bandwidths that should be used for cross-validation (mx1). If h is None, the bandwidth stored in the object is used.

    |default| None


Cross-validation error metric for each bandwidth (mx1).

Return type
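As an illustration of the metric described above, the following is a minimal sketch of leave-one-out cross-validation for the special case p = 0 (Nadaraya-Watson regression) in one dimension. It is not the package's implementation: the function names, the Gaussian kernel, and the toy data are assumptions made for this example, and plain Python lists stand in for NumPy arrays.

```python
import math


def gauss_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_cv_error(X, Y, bandwidths):
    """Leave-one-out least-squares error of Nadaraya-Watson regression.

    Returns one error value per candidate bandwidth.
    """
    errors = []
    for h in bandwidths:
        total = 0.0
        for i, (x_i, y_i) in enumerate(zip(X, Y)):
            # Predict y_i from all samples except the i-th one.
            num = den = 0.0
            for j, (x_j, y_j) in enumerate(zip(X, Y)):
                if j == i:
                    continue
                w = gauss_kernel((x_i - x_j) / h)
                num += w * y_j
                den += w
            # Fall back to 0 if all weights underflow (tiny bandwidth).
            prediction = num / den if den > 0.0 else 0.0
            total += (y_i - prediction) ** 2
        errors.append(total / len(X))
    return errors


# Toy data (hypothetical): noiseless samples of a smooth function.
X = [i / 20.0 for i in range(41)]
Y = [math.sin(3.0 * x) for x in X]
bandwidths = [0.05, 0.2, 1.0]
cv = loo_cv_error(X, Y, bandwidths)
best_h = bandwidths[cv.index(min(cv))]
```

The bandwidth with the smallest error is the natural candidate for subsequent estimation; a very large bandwidth oversmooths and is penalized by the metric.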


diffusion_estimation(x, p=0, uncertainty=False, show_progress=False, display=<built-in function print>)

Regression estimator for the diffusion term of a diffusion process. The training data X is interpreted as time series data from the process.

  • x (np.ndarray) – Points for evaluation of the diffusion function (nx1).

  • p (int, optional) –

    Degree for local polynomial estimator. The default means that Nadaraya-Watson regression is performed.

  • uncertainty (bool, optional) – Determines if bias and variance should be estimated. |default| False

  • show_progress (bool, optional) – Determines whether the progress should be displayed. |default| False

  • display (Callable, optional) – Function to be executed in order to display the progress |default| <built-in function print>


Tuple of three entries:

  • Estimation of the diffusion term at the points specified by x (nx1).

  • Estimated bias at the points specified by x (nx1). If the uncertainty flag is not set, None is returned.

  • Estimated variance at the points specified by x (nx1). If the uncertainty flag is not set, None is returned.

Return type

tuple (np.ndarray, np.ndarray, np.ndarray)

drift_estimation(x, p=0, uncertainty=False, show_progress=False, display=<built-in function print>)

Regression estimator for the drift term of a diffusion process. The training data X is interpreted as time series data from the process.

  • x (np.ndarray) – Points for evaluation of the drift function (nx1).

  • p (int, optional) –

    Degree for local polynomial estimator. The default means that Nadaraya-Watson regression is performed.

  • uncertainty (bool, optional) – Determines if bias and variance should be estimated. |default| False

  • show_progress (bool, optional) – Determines whether the progress should be displayed. |default| False

  • display (Callable, optional) – Function to be executed in order to display the progress |default| <built-in function print>


Tuple of three entries:

  • Estimation of the drift term at the points specified by x (nx1).

  • Estimated bias at the points specified by x (nx1). If the uncertainty flag is not set, None is returned.

  • Estimated variance at the points specified by x (nx1). If the uncertainty flag is not set, None is returned.

Return type

tuple (np.ndarray, np.ndarray, np.ndarray)
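One common way to read the drift and diffusion estimators is as kernel regressions on the increments of the observed path. The sketch below shows this idea for p = 0 in one dimension. It is an assumption about the general technique, not the package's code: the names nadaraya_watson and drift_and_diffusion, the explicit time step dt, and the choice to return the squared diffusion are all hypothetical, and bias and variance estimation are not covered.

```python
import math


def nadaraya_watson(x, X, Y, h):
    """Kernel-weighted average of the responses Y around the point x."""
    weights = [math.exp(-0.5 * ((x - x_i) / h) ** 2) for x_i in X]
    return sum(w * y for w, y in zip(weights, Y)) / sum(weights)


def drift_and_diffusion(path, dt, x_eval, h):
    """Estimate drift b(x) and squared diffusion sigma^2(x) from one path."""
    states = path[:-1]
    increments = [b - a for a, b in zip(path[:-1], path[1:])]
    # Regression targets: dX/dt for the drift, dX^2/dt for the diffusion.
    drift_targets = [dx / dt for dx in increments]
    diffusion_targets = [dx * dx / dt for dx in increments]
    drift = [nadaraya_watson(x, states, drift_targets, h) for x in x_eval]
    diffusion_sq = [nadaraya_watson(x, states, diffusion_targets, h)
                    for x in x_eval]
    return drift, diffusion_sq


# Deterministic toy path with constant drift 0.5 and no noise (hypothetical).
dt = 0.01
path = [0.5 * dt * i for i in range(201)]
drift, diffusion_sq = drift_and_diffusion(path, dt, [0.2, 0.5, 0.8], h=0.1)
```

For this noiseless path the drift estimate recovers the constant 0.5 at every evaluation point, and the squared-diffusion estimate is of order dt, so it vanishes as the step size shrinks.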


Returns the current bandwidth stored in an instance.


Bandwidth (scalar).

Return type



Estimates the pdf at the points specified


x (np.ndarray) – Points for evaluation (nxD).


Probability density at the points specified by x (nx1).

Return type
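To make the two kernel options concrete, here is a minimal one-dimensional sketch of a kernel density estimate with a Gaussian kernel and with a Parzen window. The function names are hypothetical, and the Parzen window is assumed to be the uniform kernel on [-1/2, 1/2]; the package may define it differently and also supports D-dimensional data.

```python
import math


def gaussian_kde_pdf(x_eval, samples, h):
    """Density estimate with a Gaussian kernel of bandwidth h."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
            for x in x_eval]


def parzen_kde_pdf(x_eval, samples, h):
    """Density estimate with a uniform window of width h (assumed form)."""
    return [sum(1.0 for s in samples if abs(x - s) / h <= 0.5)
            / (len(samples) * h)
            for x in x_eval]


samples = [0.0, 1.0, 2.0]
step = 0.01
grid = [-5.0 + step * i for i in range(1201)]  # covers [-5, 7]
density = gaussian_kde_pdf(grid, samples, h=0.5)
integral = sum(density) * step  # Riemann sum, should be close to 1
window_density = parzen_kde_pdf([1.0], samples, h=0.5)
```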



Sets a new bandwidth for further computations.


h (float) – New bandwidth (scalar).

Return type


set_data(X, Y=None)

Adjust the training data stored in the object.

  • X (np.ndarray) – New training data (NxD).

  • Y (np.ndarray, optional) – New training labels (Nx1). |default| None

Return type



ValueError – if the inputs X and Y have a different number of entries.

test_problems Module

class sde_calibration.common.test_problems.Problems(problem_type='OU')

Bases: object

This class provides some common test problem types and makes it easy to use benchmark problems. The following models are included:

  • Ornstein-Uhlenbeck (OU) process (1D and 2D)

  • Cox-Ingersoll-Ross (CIR) process

  • Hyperbolic process

  • Modified CIR process

  • Double well process

  • Black-Scholes process

Note that all processes, except the Black-Scholes process, are set up such that an invariant distribution exists.


problem_type (str, optional) –

The process that should be setup. Possible options are:

  • OU

  • OU_2D

  • CIR

  • hyperbolic

  • modified_CIR

  • double_well

  • black_scholes

    |default| 'OU'

Return type



Evaluates the invariant density function of a process given the points of evaluation.


x (np.ndarray) – Array of the points at which the density should be evaluated.


The density function evaluated at the points \(x\).

Return type



NotImplementedError – If the invariant density does not exist or no analytic expression exists.
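For the Ornstein-Uhlenbeck case the invariant density is known in closed form. The sketch below assumes the parametrization dX = (theta1 - theta2 X) dt + theta3 dW; the function name and the parameter names are assumptions for this example and need not match the package. With theta2 > 0 the invariant law is Gaussian with mean theta1/theta2 and variance theta3^2 / (2 theta2).

```python
import math


def ou_invariant_density(x, theta1, theta2, theta3):
    """Invariant density of dX = (theta1 - theta2 X) dt + theta3 dW."""
    if theta2 <= 0.0:
        # Without mean reversion there is no invariant distribution.
        raise NotImplementedError("no invariant density for theta2 <= 0")
    mean = theta1 / theta2
    var = theta3 ** 2 / (2.0 * theta2)
    return [math.exp(-0.5 * (xi - mean) ** 2 / var)
            / math.sqrt(2.0 * math.pi * var)
            for xi in x]


values = ou_invariant_density([-1.0, 0.0, 1.0], theta1=0.0, theta2=1.0, theta3=1.0)
```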


Returns the setup of the process that the instance is initialized with.


The dictionary of parameters describing the process, the drift function, and the diffusion function.

Return type

tuple (dict, Callable, Callable)

See also

The notation of the parameters in the returned dictionary is taken from Simulation and Inference for Stochastic Differential Equations.
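A returned triple of this shape (parameter dictionary, drift function, diffusion function) is enough to simulate a path. The following sketch builds such a triple by hand for an Ornstein-Uhlenbeck process and feeds it to an Euler-Maruyama scheme. The helper names ou_setup and euler_maruyama, the dictionary keys, and the parametrization dX = (theta1 - theta2 X) dt + theta3 dW are assumptions for illustration, not the package's actual output.

```python
import math
import random


def ou_setup(theta1=0.0, theta2=1.0, theta3=0.5):
    """Hypothetical stand-in for a (params, drift, diffusion) setup."""
    params = {"theta1": theta1, "theta2": theta2, "theta3": theta3}

    def drift(x):
        return params["theta1"] - params["theta2"] * x

    def diffusion(x):
        return params["theta3"]

    return params, drift, diffusion


def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng):
    """Simulate one path with the Euler-Maruyama scheme."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment
        path.append(x + drift(x) * dt + diffusion(x) * dw)
    return path


params, drift, diffusion = ou_setup()
path = euler_maruyama(drift, diffusion, x0=1.0, dt=0.01, n_steps=500,
                      rng=random.Random(0))
```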

get_transition_density(y, x, dt)

Evaluates the transition density function of a process given the points of evaluation.

  • y (np.ndarray) – Array of points at which the density should be evaluated.

  • x (np.ndarray) – Array of the points that are conditioned on.

  • dt (float) – Difference in time between the states \(x\) and \(y\).


The transition density function evaluated at the points \(y\) given the points \(x\) and the time step \(\Delta t\) between the states.

Return type



NotImplementedError – If no analytic expression for the transition density exists.
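As a worked instance, the Ornstein-Uhlenbeck process has a Gaussian transition density. The sketch below again assumes the parametrization dX = (theta1 - theta2 X) dt + theta3 dW with theta2 > 0 and uses a hypothetical function name with scalar arguments; it is meant to illustrate the formula, not to reproduce the package's method.

```python
import math


def ou_transition_density(y, x, dt, theta1, theta2, theta3):
    """Density of X(t + dt) = y given X(t) = x for the OU process."""
    decay = math.exp(-theta2 * dt)
    long_run_mean = theta1 / theta2
    # Conditional mean relaxes from x towards the long-run mean.
    mean = long_run_mean + (x - long_run_mean) * decay
    # Conditional variance grows from 0 to theta3^2 / (2 theta2).
    var = theta3 ** 2 * (1.0 - decay ** 2) / (2.0 * theta2)
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


short_step = ou_transition_density(y=1.9, x=2.0, dt=0.1,
                                   theta1=0.0, theta2=1.0, theta3=1.0)
long_step = ou_transition_density(y=0.3, x=2.0, dt=50.0,
                                  theta1=0.0, theta2=1.0, theta3=1.0)
```

For a large time step the dependence on x disappears and the transition density approaches the invariant Gaussian density, which gives a simple consistency check.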

utils Module

class sde_calibration.common.utils.Preprocessor(X, y=None, batch_size=1024, validation_size=0.2, input_scaling=None, output_scaling=None, input_columns=None, output_columns=None)

Bases: object

Class for preprocessing a dataset. This includes scaling the data as well as batching, caching, and prefetching.

  • X (np.ndarray) – Array of predictors.

  • y (np.ndarray, optional) – Array of responses. |default| None

  • batch_size (int, optional) – Batch size that should be used to perform training on minibatches. |default| 1024

  • validation_size (float, optional) –

    Portion of the provided dataset that should be used for validation.

  • input_scaling (str, optional) –

    Gives the type of scaling that should be used for the predictors. Possible options are:

    • None

    • minmax

    • standard

    • robust

    If None is chosen, no scaling of the data is performed.

    |default| None

  • output_scaling (str, optional) –

    Gives the type of scaling that should be used for the response variables. Possible options are:

    • None

    • minmax

    • standard

    • robust

    If None is chosen, no scaling of the data is performed.

    |default| None

  • input_columns (list, optional) –

    If not all of the predictor variables should be transformed, this can be specified by setting the columns via this parameter. If None is chosen, all variables are transformed.

    |default| None

  • output_columns (list, optional) –

    If not all of the response variables should be transformed, this can be specified by setting the columns via this parameter. If None is chosen, all variables are transformed.

    |default| None

Return type


  • ValueError – If the in- or output scaling type is not valid.

  • ValueError – If the passed validation size does not lie in the interval [0, 1).
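The role of validation_size and batch_size can be sketched without any framework. The helpers below are hypothetical: they assume that the validation set is taken from the end of the dataset without shuffling, which the package may handle differently, and they use plain lists instead of arrays or TensorFlow datasets.

```python
def split_train_validation(X, y=None, validation_size=0.2):
    """Split predictors (and optional responses) into train and validation."""
    if not 0.0 <= validation_size < 1.0:
        raise ValueError("validation_size must lie in the interval [0, 1)")
    n_val = int(len(X) * validation_size)
    n_train = len(X) - n_val
    train = (X[:n_train], None if y is None else y[:n_train])
    if n_val == 0:
        # No validation data requested.
        return train, None
    validation = (X[n_train:], None if y is None else y[n_train:])
    return train, validation


def batches(X, batch_size=1024):
    """Yield consecutive minibatches; the last one may be smaller."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size]


X = list(range(10))
y = [2 * v for v in X]
train, validation = split_train_validation(X, y, validation_size=0.2)
train_batches = list(batches(train[0], batch_size=3))
```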


Gives access to the batch size that is used to preprocess the dataset.


Internally stored batch size for preprocessing.

Return type



Gives access to the transformed, but not yet preprocessed datasets.


dataset_type (str, optional) –

Specifies if either the train or the validation dataset should be returned.

|default| 'train'


The transformed predictors and responses of the train and validation dataset, respectively. If the validation size is chosen to be zero, then None will be returned in case of the validation dataset.

Return type

tuple (np.ndarray, np.ndarray)



If the parameter dataset_type is not one of the following options:

  • train

  • validation


Gives access to the train- and validation datasets after they have been preprocessed.


Processed train and validation datasets in TensorFlow format.

Return type

tuple (,

inverse_transform_data(data, transform_type='input')

Performs the inverse transformation of arbitrary data according to the scaling that is initialized by the given dataset.

  • data (np.ndarray) – The data that should be transformed by the stored processor. If no scaling is performed, the data will be returned as it is.

  • transform_type (str, optional) –

    Specifies whether the scaling should be performed according to the predictor (input) scaling or the response (output) scaling.

    |default| 'input'


The inverse-transformed data, i.e. if scaled data is passed in, the scaling is reversed.

Return type




If the parameter transform_type is not one of the following values:

  • input

  • output

transform_data(data, transform_type='input')

Transforms arbitrary data according to the scaling that is initialized by the given dataset.

  • data (np.ndarray) – The data that should be transformed by the stored processor. If no scaling is performed, the data will be returned as it is.

  • transform_type (str, optional) –

    Specifies whether the scaling should be performed according to the predictor (input) scaling or the response (output) scaling.

    |default| 'input'


The scaled data.

Return type




If the parameter transform_type is not one of the following values:

  • input

  • output
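The relation between transform_data and inverse_transform_data is a round trip: scaling followed by the inverse scaling returns the original values. The sketch below shows this for a single column with the three scaling types. The class ColumnScaler is hypothetical, and the concrete definitions (population standard deviation for standard, median and interquartile range for robust) are assumptions modelled on common practice, not taken from the package.

```python
import statistics


class ColumnScaler:
    """Affine scaler for one column: transformed = (value - shift) / scale."""

    def __init__(self, values, scaling=None):
        if scaling not in (None, "minmax", "standard", "robust"):
            raise ValueError(f"Unknown scaling type: {scaling!r}")
        if scaling == "minmax":
            shift, scale = min(values), max(values) - min(values)
        elif scaling == "standard":
            shift, scale = statistics.fmean(values), statistics.pstdev(values)
        elif scaling == "robust":
            q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
            shift, scale = statistics.median(values), q3 - q1
        else:
            shift, scale = 0.0, 1.0  # no scaling: data is returned as it is
        self.shift = shift
        self.scale = scale or 1.0  # guard against constant columns

    def transform(self, data):
        return [(v - self.shift) / self.scale for v in data]

    def inverse_transform(self, data):
        return [v * self.scale + self.shift for v in data]


values = [1.0, 2.0, 3.0, 4.0, 5.0]
scaler = ColumnScaler(values, scaling="minmax")
scaled = scaler.transform(values)
restored = scaler.inverse_transform(scaled)
```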


Makes sure that the given directory exists. If not, the directory is created. If the directory already exists, it is cleaned and all subdirectories are removed.


dir_path (str) – String specifying the path of the directory to be cleaned.

Return type
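The behaviour described above can be sketched with the standard library. The function name ensure_clean_dir is hypothetical, and the sketch assumes that files as well as subdirectories are removed from an existing directory.

```python
import os
import shutil
import tempfile


def ensure_clean_dir(dir_path):
    """Create dir_path if needed and remove everything inside it."""
    os.makedirs(dir_path, exist_ok=True)
    for entry in os.listdir(dir_path):
        full_path = os.path.join(dir_path, entry)
        if os.path.isdir(full_path) and not os.path.islink(full_path):
            shutil.rmtree(full_path)  # remove a subdirectory recursively
        else:
            os.remove(full_path)  # remove a file or symbolic link


# Usage inside a throwaway directory so nothing on disk is affected.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "results")
    ensure_clean_dir(target)  # directory does not exist yet: it is created
    created = os.path.isdir(target)
    os.makedirs(os.path.join(target, "old_run"))
    with open(os.path.join(target, "log.txt"), "w") as fh:
        fh.write("stale")
    ensure_clean_dir(target)  # directory exists: its contents are removed
    remaining = os.listdir(target)
```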


sde_calibration.common.utils.setup_logger(name='logging', fname=None, level=10, log_format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

Sets up a basic logger that prints to the console and, if a file is specified, also logs to that file.

  • name (str, optional) – Name of the logger. |default| 'logging'

  • fname (str, optional) –

    A path to the file that should be used for logging. If None is provided the logger does not print any results to a file.

    |default| None

  • level (int, optional) –

    The logging level, i.e. the depth up to which the logger should notify via an output. Possible options are:

    • logging.NOTSET (0)

    • logging.DEBUG (10)

    • logging.INFO (20)

    • logging.WARNING (30)

    • logging.ERROR (40)

    • logging.CRITICAL (50)

  • log_format (str, optional) –

    Format string specifying the format that should be used for logging. For more information on the format, see the documentation of Python's logging module.

    |default| '%(asctime)s - %(name)s - %(levelname)s - %(message)s'


A Logger object which can be used for further logging.

Return type



ValueError – If the provided log level is not one of the previously given possible options.
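A logger with this behaviour can be assembled from the standard logging module. The sketch below is a guess at the general shape, not the package's source: the name make_logger is hypothetical, and clearing existing handlers on repeated calls is a design choice made here to avoid duplicated output.

```python
import logging
import sys

VALID_LEVELS = (logging.NOTSET, logging.DEBUG, logging.INFO,
                logging.WARNING, logging.ERROR, logging.CRITICAL)


def make_logger(name="logging", fname=None, level=logging.DEBUG,
                log_format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"):
    """Return a logger that writes to the console and optionally to a file."""
    if level not in VALID_LEVELS:
        raise ValueError(f"Unsupported log level: {level}")
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter(log_format)
    console = logging.StreamHandler(sys.stdout)
    console.setFormatter(formatter)
    logger.addHandler(console)
    if fname is not None:
        file_handler = logging.FileHandler(fname)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


logger = make_logger("demo", level=logging.INFO)
logger.info("calibration started")
logger.debug("not shown, because the level is INFO")
```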