common Package
This package provides common functionality that is useful for different types of estimators. It includes test problems, a kernel density estimation class, and other utility functions.
kde Module
- class sde_calibration.common.kde.KDE(h, X, Y, kernel_type='gauss')
Bases:
object
Class for estimating probability densities using kernel density estimation.
- Parameters
h (np.ndarray) – Kernel bandwidth.
X (np.ndarray) – Training data (NxD).
Y (np.ndarray) – Training labels (Nx1). They are used for regression.
kernel_type (str, optional) –
String describing the kernel function that should be used. Currently implemented options are:
gauss (Gaussian kernel)
parzen (Parzen window)
|default|
'gauss'
- Return type
None
- Raises
ValueError – if the inputs X and Y have a different number of entries.
- cross_validation_error(p, h=None)
Computes the leave-one-out cross-validation least-squares metric for local polynomial regression.
- Parameters
p (int) – Order of the local polynomial estimator.
h (np.ndarray, optional) –
Array of bandwidths that should be used for cross-validation (mx1). If h is None, the bandwidth stored in the object is used.
|default|
None
- Returns
Cross-validation error metric for each bandwidth (mx1).
- Return type
np.ndarray
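As an illustration of the idea behind this metric, the following is a minimal sketch of leave-one-out cross-validation for the simplest case p = 0 (Nadaraya-Watson regression) with a Gaussian kernel. It is not the package's implementation; the function name `loo_cv_error` and the one-dimensional inputs are assumptions made for this example.

```python
import numpy as np


def loo_cv_error(X, Y, bandwidths):
    """Leave-one-out least-squares error of a Nadaraya-Watson fit (p = 0).

    Hypothetical helper for illustration only; 1D predictors are assumed.
    """
    X = np.asarray(X, dtype=float).ravel()
    Y = np.asarray(Y, dtype=float).ravel()
    diff = X[:, None] - X[None, :]          # pairwise differences, shape (N, N)
    errors = np.empty(len(bandwidths))
    for k, h in enumerate(bandwidths):
        W = np.exp(-0.5 * (diff / h) ** 2)  # Gaussian kernel weights
        np.fill_diagonal(W, 0.0)            # leave sample i out of its own fit
        Y_loo = (W @ Y) / W.sum(axis=1)     # prediction at X[i] without sample i
        errors[k] = np.mean((Y - Y_loo) ** 2)
    return errors
```

A bandwidth could then be selected with `bandwidths[np.argmin(errors)]`. Very small bandwidths can leave a sample without neighbours and produce a division by zero, which this sketch does not guard against.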
- diffusion_estimation(x, p=0, uncertainty=False, show_progress=False, display=<built-in function print>)
Regression estimator for the diffusion term of a diffusion process. The training data X is interpreted as time series data from the process.
- Parameters
x (np.ndarray) – Points for evaluation of the diffusion function (nx1).
p (int, optional) –
Degree for local polynomial estimator. The default means that Nadaraya-Watson regression is performed.
uncertainty (bool, optional) – Determines if bias and variance should be estimated. |default|
False
show_progress (bool, optional) – Determines whether the progress should be displayed. |default|
False
display (Callable, optional) – Function that is called to display the progress. |default|
<built-in function print>
- Returns
Estimate of the diffusion term at the points specified by x (nx1); estimated bias at these points (nx1), or None if the uncertainty flag is not set; estimated variance at these points (nx1), or None if the uncertainty flag is not set.
- Return type
tuple (np.ndarray, np.ndarray, np.ndarray)
- drift_estimation(x, p=0, uncertainty=False, show_progress=False, display=<built-in function print>)
Regression estimator for the drift term of a diffusion process. The training data X is interpreted as time series data from the process.
- Parameters
x (np.ndarray) – Points for evaluation of the drift function (nx1).
p (int, optional) –
Degree for local polynomial estimator. The default means that Nadaraya-Watson regression is performed.
uncertainty (bool, optional) – Determines if bias and variance should be estimated. |default|
False
show_progress (bool, optional) – Determines whether the progress should be displayed. |default|
False
display (Callable, optional) – Function that is called to display the progress. |default|
<built-in function print>
- Returns
Estimate of the drift term at the points specified by x (nx1); estimated bias at these points (nx1), or None if the uncertainty flag is not set; estimated variance at these points (nx1), or None if the uncertainty flag is not set.
- Return type
tuple (np.ndarray, np.ndarray, np.ndarray)
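To make the regression view of drift and diffusion estimation concrete, here is a small sketch of the p = 0 (Nadaraya-Watson) case for a one-dimensional, equidistantly sampled path. It is an assumption-laden illustration rather than the package's code: the function names, the explicit time step `dt`, and the choice to return the squared diffusion coefficient are all hypothetical.

```python
import numpy as np


def nadaraya_watson(x_eval, x_train, y_train, h):
    """Kernel-weighted average of y_train around each evaluation point."""
    u = (np.asarray(x_eval, dtype=float)[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * u ** 2)               # Gaussian kernel, constant cancels
    return (w @ y_train) / w.sum(axis=1)


def drift_and_squared_diffusion(path, dt, x_eval, h):
    """Estimate b(x) and sigma(x)^2 from one equidistant sample path.

    Hypothetical helper: regresses scaled increments on the current state.
    """
    path = np.asarray(path, dtype=float).ravel()
    increments = np.diff(path)              # X[i+1] - X[i]
    states = path[:-1]                      # state at the start of each step
    drift = nadaraya_watson(x_eval, states, increments / dt, h)
    diffusion_sq = nadaraya_watson(x_eval, states, increments ** 2 / dt, h)
    return drift, diffusion_sq
```

The drift is estimated from increments divided by `dt` and the squared diffusion from squared increments divided by `dt`; both reuse the same kernel weights centred on the evaluation points.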
- get_bandwidth()
Returns the current bandwidth stored in an instance.
- Returns
Bandwidth (scalar).
- Return type
float
- get_probability(x)
Estimates the probability density function (pdf) at the specified points.
- Parameters
x (np.ndarray) – Points for evaluation (nxD).
- Returns
Probability density at the points specified by x (nx1).
- Return type
np.ndarray
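For orientation, a kernel density estimate with a Gaussian kernel can be sketched in a few lines of NumPy. This is not the KDE class itself: the function name `gaussian_kde_pdf`, the scalar bandwidth, and the flat (n,) return shape are assumptions for this example.

```python
import numpy as np


def gaussian_kde_pdf(x_eval, X, h):
    """Gaussian kernel density estimate at x_eval (n x D) from data X (N x D).

    Hypothetical helper; h is a scalar bandwidth shared by all dimensions.
    """
    x_eval = np.asarray(x_eval, dtype=float)
    X = np.asarray(X, dtype=float)
    D = X.shape[1]
    u = (x_eval[:, None, :] - X[None, :, :]) / h   # shape (n, N, D)
    k = np.exp(-0.5 * np.sum(u ** 2, axis=2))      # unnormalised kernel values
    norm = (2.0 * np.pi) ** (D / 2) * h ** D       # Gaussian normalising constant
    return k.mean(axis=1) / norm                   # average over training points
```

Each training point contributes one Gaussian bump of width h, and the estimate is the average of these bumps, so it integrates to one by construction.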
- set_bandwidth(h)
Sets a new bandwidth for further computations.
- Parameters
h (float) – New bandwidth (scalar).
- Return type
None
test_problems Module
- class sde_calibration.common.test_problems.Problems(problem_type='OU')
Bases:
object
This class provides some common test problems and makes it easy to use them as benchmarks. The included models are:
Ornstein-Uhlenbeck (OU) process (1D and 2D)
Cox-Ingersoll-Ross (CIR) process
Hyperbolic process
Modified CIR process
Double well process
Black-Scholes process
Note that all processes, except the Black-Scholes process, are set up such that an invariant distribution exists.
- Parameters
problem_type (str, optional) –
The process that should be set up. Possible options are:
OU
OU_2D
CIR
hyperbolic
modified_CIR
double_well
black_scholes
|default|
'OU'
- Return type
None
- get_density(x)
Evaluates the invariant density function of a process given the points of evaluation.
- Parameters
x (np.ndarray) – Array of the points at which the density should be evaluated.
- Returns
The density function evaluated at the points \(x\).
- Return type
np.ndarray
- Raises
NotImplementedError – If the invariant density does not exist or no analytic expression exists.
- get_setup()
Returns the setup of the process that the instance is initialized with.
- Returns
The dictionary of parameters describing the process, the drift function, and the diffusion function.
- Return type
tuple (dict, Callable, Callable)
See also
The notation of the parameters in the returned dictionary is taken from Simulation and Inference for Stochastic Differential Equations.
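One way a (dict, Callable, Callable) setup like this might be consumed is to simulate a sample path with the Euler-Maruyama scheme. The sketch below is only an illustration: the keys `theta1`, `theta2`, `theta3` and the assumption that drift and diffusion are callables of the state alone are hypothetical stand-ins, not the documented return values of get_setup.

```python
import numpy as np


def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng=None):
    """Simulate dX = drift(X) dt + diffusion(X) dW on an equidistant grid."""
    rng = np.random.default_rng() if rng is None else rng
    path = np.empty(n_steps + 1)
    path[0] = x0
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment over one step
        path[i + 1] = path[i] + drift(path[i]) * dt + diffusion(path[i]) * dW
    return path


# Hypothetical stand-ins for a parameter dictionary and the two callables.
params = {"theta1": 0.0, "theta2": 1.0, "theta3": 0.5}
drift = lambda x: params["theta1"] - params["theta2"] * x
diffusion = lambda x: params["theta3"]
path = euler_maruyama(drift, diffusion, x0=1.0, dt=0.01, n_steps=1000,
                      rng=np.random.default_rng(0))
```

A path generated this way could serve as time series input for an estimator.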
- get_transition_density(y, x, dt)
Evaluates the transition density function of a process given the points of evaluation.
- Parameters
y (np.ndarray) – Array of points at which the density should be evaluated.
x (np.ndarray) – Array of the points that are conditioned on.
dt (float) – Difference in time between the states \(x\) and \(y\).
- Returns
The transition density function evaluated at the points \(y\) given the points \(x\) and the time step \(\Delta t\) between the states.
- Return type
np.ndarray
- Raises
NotImplementedError – If no analytic expression for the transition density exists.
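For the one-dimensional Ornstein-Uhlenbeck process, both the transition density and the invariant density are Gaussian and available in closed form. The sketch below writes them out for the parametrization dX = (theta1 - theta2 X) dt + theta3 dW with theta2 > 0. The function names and the explicit parameter arguments are assumptions for this example; the class itself may store and name its parameters differently.

```python
import numpy as np


def ou_transition_density(y, x, dt, theta1, theta2, theta3):
    """Density of X(t + dt) = y given X(t) = x for the OU process above."""
    decay = np.exp(-theta2 * dt)
    mean = theta1 / theta2 + (x - theta1 / theta2) * decay
    var = theta3 ** 2 * (1.0 - decay ** 2) / (2.0 * theta2)
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def ou_invariant_density(x, theta1, theta2, theta3):
    """Stationary density N(theta1 / theta2, theta3^2 / (2 theta2))."""
    mean = theta1 / theta2
    var = theta3 ** 2 / (2.0 * theta2)
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
```

As dt grows, the transition density forgets the conditioning point x and approaches the invariant density, which gives a simple consistency check between the two functions.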
utils Module
- class sde_calibration.common.utils.Preprocessor(X, y=None, batch_size=1024, validation_size=0.2, input_scaling=None, output_scaling=None, input_columns=None, output_columns=None)
Bases:
object
Class for preprocessing a dataset. This includes scaling the data as well as batching, caching, and prefetching.
- Parameters
X (np.ndarray) – Array of predictors.
y (np.ndarray, optional) – Array of responses. |default|
None
batch_size (int, optional) – Batch size that should be used to perform training on minibatches. |default|
1024
validation_size (float, optional) –
Portion of the provided dataset that should be used for validation.
|default|
0.2
input_scaling (str, optional) –
Gives the type of scaling that should be used for the predictors. Possible options are:
None
minmax
standard
robust
If None is chosen, no scaling of the data is performed.
|default|
None
output_scaling (str, optional) –
Gives the type of scaling that should be used for the response variables. Possible options are:
None
minmax
standard
robust
If None is chosen, no scaling of the data is performed.
|default|
None
input_columns (list, optional) –
If not all of the predictor variables should be transformed, the columns to transform can be specified via this parameter. If None is chosen, all variables are transformed.
|default|
None
output_columns (list, optional) –
If not all of the response variables should be transformed, the columns to transform can be specified via this parameter. If None is chosen, all variables are transformed.
|default|
None
- Return type
None
- Raises
ValueError – If the in- or output scaling type is not valid.
ValueError – If the passed validation size does not lie in the interval [0, 1).
- get_batch_size()
Gives access to the batch size that is used to preprocess the dataset.
- Returns
Internally stored batch size for preprocessing.
- Return type
float
- get_dataset(dataset_type='train')
Gives access to the transformed, but not yet preprocessed datasets.
- Parameters
dataset_type (str, optional) –
Specifies if either the train or the validation dataset should be returned.
|default|
'train'
- Returns
The transformed predictors and responses of the requested (train or validation) dataset. If the validation size is zero, None is returned for the validation dataset.
- Return type
tuple (np.ndarray, np.ndarray)
- Raises
ValueError –
If the parameter dataset_type is not one of the following options:
train
validation
- get_processed_datasets()
Gives access to the train and validation datasets after they have been preprocessed.
- Returns
Processed train and validation datasets in TensorFlow format.
- Return type
tuple (tf.data.Dataset, tf.data.Dataset)
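The class returns tf.data.Dataset objects. The plain NumPy sketch below only illustrates the arithmetic of a validation split followed by minibatching. It is not the package's pipeline: the function name is hypothetical, and holding out the last rows without shuffling is an assumption made for this example.

```python
import numpy as np


def split_and_batch(X, y, batch_size=1024, validation_size=0.2):
    """Hold out the last rows for validation and cut the rest into minibatches.

    Hypothetical helper; no shuffling, caching, or prefetching is performed.
    """
    if not 0.0 <= validation_size < 1.0:
        raise ValueError("validation_size must lie in [0, 1)")
    n_val = int(round(len(X) * validation_size))
    n_train = len(X) - n_val
    train = (X[:n_train], y[:n_train])
    validation = (X[n_train:], y[n_train:]) if n_val > 0 else None
    batches = [
        (train[0][i:i + batch_size], train[1][i:i + batch_size])
        for i in range(0, n_train, batch_size)
    ]
    return batches, validation
```

The last minibatch is smaller than `batch_size` whenever the number of training rows is not a multiple of it.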
- inverse_transform_data(data, transform_type='input')
Performs the inverse transformation of arbitrary data according to the scaling that was initialized with the given dataset.
- Parameters
data (np.ndarray) – The data that should be transformed by the stored processor. If no scaling is performed, the data will be returned as it is.
transform_type (str, optional) –
Specifies whether the scaling should be performed according to the predictor (input) scaling or the response (output) scaling.
|default|
'input'
- Returns
The inverse-transformed data, i.e. if scaled data is passed in, the scaling is reversed.
- Return type
np.ndarray
- Raises
ValueError –
If the parameter transform_type is not one of the following values:
input
output
- transform_data(data, transform_type='input')
Transforms arbitrary data according to the scaling that is initialized by the given dataset.
- Parameters
data (np.ndarray) – The data that should be transformed by the stored processor. If no scaling is performed, the data will be returned as it is.
transform_type (str, optional) –
Specifies whether the scaling should be performed according to the predictor (input) scaling or the response (output) scaling.
|default|
'input'
- Returns
The scaled data.
- Return type
np.ndarray
- Raises
ValueError –
If the parameter transform_type is not one of the following values:
input
output
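The three named scaling types can each be read as a per-column affine map, which makes the relationship between a transformation and its inverse easy to see. The following sketch uses the usual textbook definitions (min and range, mean and standard deviation, median and interquartile range). Whether the package uses exactly these definitions is an assumption, and the function names are hypothetical.

```python
import numpy as np


def fit_scaler(data, kind):
    """Return (offset, scale) per column for 'minmax', 'standard' or 'robust'."""
    if kind == "minmax":
        offset = data.min(axis=0)
        scale = data.max(axis=0) - offset
    elif kind == "standard":
        offset, scale = data.mean(axis=0), data.std(axis=0)
    elif kind == "robust":
        q25, offset, q75 = np.percentile(data, [25, 50, 75], axis=0)
        scale = q75 - q25                   # interquartile range
    else:
        raise ValueError(f"unknown scaling type: {kind!r}")
    return offset, scale


def transform(data, offset, scale):
    """Apply the scaling fitted on the training data."""
    return (data - offset) / scale


def inverse_transform(data, offset, scale):
    """Undo the scaling, mapping scaled values back to original units."""
    return data * scale + offset
```

New data are scaled with the offset and scale fitted on the original dataset, so predictions made in scaled units can be mapped back with `inverse_transform`. A constant column has zero scale and would need special handling, which this sketch omits.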
- sde_calibration.common.utils.clean_directory(dir_path)
Makes sure that the given directory exists. If it does not, the directory is created. If the directory already exists, it is cleaned and all subdirectories are removed.
- Parameters
dir_path (str) – String specifying the path of the directory to be cleaned.
- Return type
None
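The described behaviour can be approximated with the standard library alone. The sketch below is one possible reading of it, not the package's code; the name `clean_directory_sketch` is hypothetical, and it assumes that "cleaned" means deleting every file and subdirectory inside the target.

```python
import shutil
import tempfile
from pathlib import Path


def clean_directory_sketch(dir_path):
    """Create dir_path if missing, otherwise delete everything inside it."""
    path = Path(dir_path)
    if path.exists():
        for entry in path.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)        # remove subdirectory recursively
            else:
                entry.unlink()              # remove file or symlink
    else:
        path.mkdir(parents=True)


# Usage inside a throwaway directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "results"
    clean_directory_sketch(target)          # does not exist yet: created
    created = target.is_dir()
    (target / "old.txt").write_text("stale")
    (target / "sub").mkdir()
    clean_directory_sketch(target)          # exists: emptied
    remaining = list(target.iterdir())
```

Because a function like this deletes data irreversibly, it is worth pointing it only at directories that hold regenerable output.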
- sde_calibration.common.utils.setup_logger(name='logging', fname=None, level=10, log_format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
Sets up a basic logger that prints to the console and, if a file is specified, also logs to that file.
- Parameters
name (str, optional) – Name of the logger. |default|
'logging'
fname (str, optional) –
A path to the file that should be used for logging. If None is provided, the logger does not write to a file.
|default|
None
level (int, optional) –
The logging level, i.e. the minimum severity a message must have in order to be emitted. Possible options are:
logging.NOTSET (0)
logging.DEBUG (10)
logging.INFO (20)
logging.WARNING (30)
logging.ERROR (40)
logging.CRITICAL (50)
|default|
10
log_format (str, optional) –
Format string specifying the format that should be used for logging. For more information on the format, see the LogRecord attributes section of the Python logging documentation.
|default|
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
- Returns
A Logger object which can be used for further logging.
- Return type
logging.Logger
- Raises
ValueError – If the provided log level is not one of the previously given possible options.
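A logger with this behaviour can be assembled from the standard logging module. The sketch below mirrors the documented parameters but is not the package's implementation; the name `setup_logger_sketch` and the choice to clear existing handlers on repeated calls are assumptions.

```python
import logging
import sys

VALID_LEVELS = (logging.NOTSET, logging.DEBUG, logging.INFO,
                logging.WARNING, logging.ERROR, logging.CRITICAL)


def setup_logger_sketch(name="logging", fname=None, level=logging.DEBUG,
                        log_format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"):
    """Console logger with an optional file handler (hypothetical helper)."""
    if level not in VALID_LEVELS:
        raise ValueError(f"invalid log level: {level!r}")
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()                 # avoid duplicate output on repeated calls
    formatter = logging.Formatter(log_format)
    handlers = [logging.StreamHandler(sys.stdout)]
    if fname is not None:
        handlers.append(logging.FileHandler(fname))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


logger = setup_logger_sketch(name="calibration", level=logging.INFO)
logger.info("logger ready")
```

With the level set to logging.INFO, calls to `logger.debug` are suppressed while info, warning, error, and critical messages are emitted.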