deepensemble.models

Models

Base class Model

class deepensemble.models.model.Model(target_labels, type_model, name='model', input_shape=None, output_shape=None)[source]

Base class for models.

Parameters:

target_labels: list or numpy.array

Target labels.

type_model : str, “classifier” by default

Type of model: classifier or regressor.

name : str, “model” by default

Name of model.

input_shape : tuple[]

Shape of the model’s input.

output_shape : tuple[]

Shape of the model’s output.

Attributes

__input_shape (tuple) Shape of inputs of the model.
__output_shape (tuple) Shape of output of the model.
type_model (str) Type of model: classifier or regressor.
target_labels (numpy.array) Labels of classes.
_params (list[dict[]]) List of model’s parameters.
_cost_function_list (dict) Dictionary storing the cost functions.
_reg_function_list (list) List storing the regularization functions.
_score_function_list (dict) Score functions used to evaluate the model; by default, accuracy for classifiers and RMS for regressors.
_update_functions (list[]) List of functions that update the model’s parameters. The first element is the principal update function.
name (str) Name of the model, useful for identifying it later.
append_comment(comment)[source]

Appends a comment to the model info.

Parameters:

comment : str

Information for model.

append_cost(fun_cost, name, **kwargs)[source]

Adds an extra cost function.

Parameters:

fun_cost : theano.function

Function of cost.

name : str

String that identifies the cost function; useful when plotting metrics.

**kwargs

Extra parameters of cost function.
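The bookkeeping described for append_cost can be pictured with a minimal sketch, assuming cost functions are stored by name with their extra keyword arguments already bound. CostRegistry, mse and scaled_abs are hypothetical names used only for illustration, plain Python callables stand in for Theano expressions, and summing the registered items is an assumption rather than the library's documented behavior.

```python
from functools import partial


class CostRegistry:
    """Hypothetical stand-in for the model's cost bookkeeping (not library code)."""

    def __init__(self):
        self._costs = {}  # name -> cost callable with its kwargs bound

    def append_cost(self, fun_cost, name, **kwargs):
        # Bind the extra parameters now so the cost can later be called
        # with only (output, target).
        self._costs[name] = partial(fun_cost, **kwargs)

    def labels(self):
        # Names in insertion order, e.g. for labelling metric plots.
        return list(self._costs)

    def total(self, output, target):
        # Assumption: the overall cost is the sum of the registered items.
        return sum(fun(output, target) for fun in self._costs.values())


def mse(output, target):
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)


def scaled_abs(output, target, scale=1.0):
    return scale * sum(abs(o - t) for o, t in zip(output, target)) / len(target)


registry = CostRegistry()
registry.append_cost(mse, "MSE")
registry.append_cost(scaled_abs, "scaled L1", scale=0.5)
print(registry.labels(), registry.total([1.0, 3.0], [1.0, 1.0]))
```

Binding the keyword arguments at registration time keeps every stored item callable with the same two arguments, which is one simple way to treat heterogeneous cost functions uniformly.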

append_reg(fun_reg, name, **kwargs)[source]

Adds an extra regularization function.

Parameters:

fun_reg : theano.function

Function of regularization.

name : str

String that identifies the regularization function; useful when plotting metrics.

**kwargs

Extra parameters of regularization function.

append_score(fun_score, name, **kwargs)[source]

Adds an extra item in the score functions.

Parameters:

fun_score : theano.function

Function of score.

name : str

String that identifies the score function; useful when plotting metrics.

**kwargs

Extra parameters of score function.

append_update(fun_update, name, **kwargs)[source]

Adds an extra update function.

Parameters:

fun_update : theano.function

Function that updates the model’s parameters.

name : str

String that identifies the update function; useful when plotting metrics.

**kwargs

Extra parameters of update function.

batch_eval(data, batch_size=32, train=True, shuffle=False)[source]

Evaluates the cost and score in mini-batches.

Parameters:

data : dict

Dictionary with ‘input’ and ‘output’ data.

batch_size: int

Size of batch.

train: bool

Flag indicating whether the batch evaluation is for training or testing.

shuffle : bool

Returns:

numpy.array

Returns the cost and score evaluated on the mini-batches.
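As an illustration of mini-batch evaluation, the following is a minimal sketch, assuming the data dictionary holds equally long sequences under the 'input' and 'output' keys. The helpers iterate_minibatches, mean_abs_error and batch_eval_sketch, as well as the predict and seed arguments, are hypothetical and do not reproduce the library's code.

```python
import random


def iterate_minibatches(data, batch_size=32, shuffle=False, seed=None):
    """Yield (inputs, outputs) pairs holding at most batch_size samples."""
    n_samples = len(data["input"])
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, n_samples, batch_size):
        batch = indices[start:start + batch_size]
        yield ([data["input"][i] for i in batch],
               [data["output"][i] for i in batch])


def mean_abs_error(outputs, targets):
    return sum(abs(o - t) for o, t in zip(outputs, targets)) / len(targets)


def batch_eval_sketch(data, predict, batch_size=32, shuffle=False):
    """Return one cost value per mini-batch."""
    return [mean_abs_error([predict(x) for x in inputs], targets)
            for inputs, targets in iterate_minibatches(data, batch_size, shuffle)]


data = {"input": [0.0, 1.0, 2.0, 3.0, 4.0], "output": [0.0, 2.0, 4.0, 6.0, 8.0]}
costs = batch_eval_sketch(data, predict=lambda x: x, batch_size=2)
print(costs)
```

The last batch may be smaller than batch_size; the sketch keeps it rather than dropping it, which is a design choice and not necessarily what the library does.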

compile(fast=True, **kwargs)[source]

Prepares training (compiles the Theano functions).

Parameters:

fast : bool

If True, compile only what is necessary.

kwargs

Raises:

If there is an inconsistency between the output and the number of classes.

copy_kind_of_model(model)[source]

Copy important data from model.

This data is:
  • Input shape.
  • Output Shape.
  • Type of model.
  • Target labels.
Parameters:

model : Model

Model to copy the data from.

delete_cost(name='all')[source]

Delete cost function.

Parameters:

name : str

Name of the cost function.

Returns:

None

delete_reg(name='all')[source]

Delete regularization function.

Parameters:

name : str

Name of the regularization function.

Returns:

None

delete_score(name='all')[source]

Delete score function.

Parameters:

name : str

Name of the score function.

Returns:

None

error(_input, _target, prob=True)[source]

Computes the error of the model.

Parameters:

_input : theano.tensor.matrix or numpy.array

Input sample.

_target : theano.tensor.matrix or numpy.array

Target sample.

prob : bool

For a classifier, if True the output is a probability; if False the output is translated. Keeping it True during training is recommended because the translate function is non-differentiable.

Returns:

theano.tensor.matrix or numpy.array

Returns the error of the model.

fit(_input, _target, max_epoch=100, batch_size=32, early_stop=True, valid_size=0.1, no_update_best_parameters=False, improvement_threshold=0.995, minibatch=True, update_sets=True, criterion_update_params='cost', maximization_criterion=False)[source]

Function for training sequential model.

Parameters:

_input : theano.tensor.matrix

Input training samples.

_target : theano.tensor.matrix

Target training samples.

max_epoch : int, 100 by default

Maximum number of training epochs.

batch_size : int, 32 by default

Size of batch.

early_stop : bool, True by default

Flag to enable early stopping.

valid_size : float

Ratio that defines the size of the validation set, as a fraction of the data (0.1 by default).

no_update_best_parameters : bool

Flag that controls whether the model’s parameters are changed after training.

improvement_threshold : float, 0.995 by default

minibatch : bool

Flag indicating whether to train with mini-batches.

update_sets : bool

Flag indicating whether to update the sets.

Returns:

numpy.array[float]

Returns training cost for each batch.
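The improvement_threshold parameter is not described above. One common reading, assumed here and not confirmed by this reference, is that a validation cost only counts as an improvement when it falls below the best cost multiplied by the threshold. The sketch below illustrates that rule; early_stop_sketch and its patience argument are hypothetical and are not part of the fit signature.

```python
def early_stop_sketch(validation_costs, improvement_threshold=0.995, patience=3):
    """Return (best_epoch, best_cost, last_epoch_run) for a sequence of costs.

    Assumption: a cost is a real improvement only if it is lower than
    best_cost * improvement_threshold; training stops after `patience`
    consecutive epochs without such an improvement.
    """
    best_cost = float("inf")
    best_epoch = None
    stale_epochs = 0
    last_epoch = -1
    for last_epoch, cost in enumerate(validation_costs):
        if cost < best_cost * improvement_threshold:
            best_cost, best_epoch, stale_epochs = cost, last_epoch, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return best_epoch, best_cost, last_epoch


result = early_stop_sketch([1.0, 0.9, 0.899, 0.898, 0.897, 0.5])
print(result)
```

With a threshold of 0.995, the gains from 0.9 to 0.899, 0.898 and 0.897 are each smaller than 0.5 %, so they do not reset the counter and the loop stops before reaching the final value.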

get_cost()[source]

Get cost function.

Returns:

theano.tensor.TensorVariable

Returns cost function.

get_costs()[source]

Gets the cost functions of the model.

Returns:

list[]

Returns the list of the model’s cost functions, including regularization.

get_dim_input()[source]

Gets input dimension.

Returns:

int

Returns the input dimension.

get_dim_output()[source]

Gets output dimension.

Returns:

int

Returns dimension output.

get_fan_in()[source]

Gets the number of inputs.

Returns:

int

Returns the number of inputs.

get_fan_out()[source]

Gets the number of outputs.

Returns:

int

Returns the number of outputs.

get_info()[source]

Gets model info.

Returns:

str

Returns info.

get_input_shape()[source]

Gets input shape.

Returns:

tuple

Returns input shape.

get_labels_costs()[source]

Gets list of cost functions.

Returns:

list[]

Returns a list of cost functions.

get_labels_scores()[source]

Gets list of score functions.

Returns:

list[]

Returns a list of score functions.

get_name()[source]

Gets the model’s name.

Returns:

str

Returns name of model.

get_new_metric()[source]

Get metrics for respective model.

Note

This function must be implemented in order to use FactoryMetrics.

See also

FactoryMetrics

get_output_shape()[source]

Gets output shape.

Returns:

tuple

Returns output shape.

get_params(only_values=False)[source]

Gets the model parameters.

Returns:

theano.shared

Returns model parameters.

get_result_labels()[source]

Gets the list of labels of the training data.

Returns:

list[]

Returns the list of labels of the training data.

get_scores()[source]

Gets the score functions of the model.

Returns:

list[]

Returns the list of the model’s score functions.

get_target_labels()[source]

Gets the target labels.

Returns:

list

Returns one list with target labels of this model.

get_test_cost()[source]

Gets current testing cost.

Returns:

float

Returns testing cost.

get_test_score()[source]

Gets current testing score.

Returns:

float

Returns testing score.

get_train_cost()[source]

Gets current training cost.

Returns:

float

Returns training cost.

get_train_error()[source]

Gets current training error.

Returns:

float

Returns average training error.

get_train_score()[source]

Gets current training score.

Returns:

float

Returns training score.

get_type_model()[source]

Gets type of model.

Returns:

str

Returns the type of model (regressor or classifier).

get_update_function(cost, error)[source]

Gets the dictionary used to update the model’s parameters.

Parameters:

cost : theano.tensor.TensorVariable

Cost function.

error : theano.tensor.TensorVariable

Error (computed between the model output and the target).

Returns:

OrderedDict

A dictionary mapping each parameter to its update expression.

is_binary_classification()[source]

Gets True if this model is a binary classifier, False otherwise.

Returns:

bool

Returns True if this model is a binary classifier, False otherwise.

is_classifier()[source]

Asks if the model is a classifier.

Returns:

bool

Returns True if the model is a classifier, False otherwise.

is_compiled()[source]

Indicate if the model was compiled.

Returns:

bool

Returns True if the model was compiled, False otherwise.

is_fast_compiled()[source]

Indicate if the model was compiled in fast mode.

Returns:

bool

Returns True if the model was compiled in fast mode, False otherwise.

is_multi_label()[source]

Indicate if this model is a multi-class classifier model.

Returns:

bool

Returns True if the number of classes is greater than 2, False otherwise.

load_params(params)[source]

Load parameters.

Parameters:

params : list[]

List of parameters.

output(_input, prob=True)[source]

Output of the model.

Parameters:

_input : theano.tensor.matrix

Input sample.

prob : bool

For a classifier, if True the output is a probability; if False the output is translated. Keeping it True during training is recommended because the translate function is non-differentiable.

Returns:

theano.tensor.matrix or numpy.array

Raw output of model.

predict(_input)[source]

Computes the prediction of the model.

Parameters:

_input : theano.tensor.matrix or numpy.array

Input sample.

Returns:

numpy.array

Returns the prediction of the model.

prepare_data(_input, _target, valid_size)[source]

Prepare data for training.

Splits the data into two sets: training and validation.

Parameters:

_input : numpy.array

Input sample.

_target : numpy.array

Target sample.

valid_size : float

Ratio that defines the size of the validation set.
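A train/validation split driven by a valid_size ratio can be sketched as follows. This is a minimal illustration under stated assumptions: split_train_valid and its seed argument are hypothetical, and whether the library shuffles the samples before splitting is not documented here.

```python
import random


def split_train_valid(inputs, targets, valid_size=0.1, seed=0):
    """Split paired samples into (train, validation) according to valid_size."""
    if not 0.0 <= valid_size < 1.0:
        raise ValueError("valid_size must be in [0, 1)")
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)  # assumption: samples are shuffled first
    n_valid = int(round(len(inputs) * valid_size))
    valid_idx, train_idx = indices[:n_valid], indices[n_valid:]

    def take(idx):
        # Keep every input aligned with its target.
        return [inputs[i] for i in idx], [targets[i] for i in idx]

    return take(train_idx), take(valid_idx)


inputs = list(range(10))
targets = [2 * x for x in inputs]
(train_x, train_y), (valid_x, valid_y) = split_train_valid(inputs, targets, valid_size=0.2)
print(len(train_x), len(valid_x))
```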

reset()[source]

Resets the parameters.

reset_compile()[source]

Reset all functions compiled with theano.

Returns:

None

review_is_binary_classifier()[source]

Checks whether this model is a binary classifier.

review_shape_output()[source]

Checks whether the model’s output dimension is wrong.

Raises:

If there is an inconsistency in the output.

save_params()[source]

Saves the parameters of the model.

Returns:

list[]

Returns a list with the model’s parameters.
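Together with load_params, save_params makes it possible to take a snapshot of the parameters and restore it later, for example after a training step that made the model worse. The sketch below shows the idea, assuming parameters are a list of dictionaries; ParamHolder and the "name"/"value" keys are hypothetical and only illustrate the pattern.

```python
import copy


class ParamHolder:
    """Hypothetical container mimicking a save/load parameter round trip."""

    def __init__(self, params):
        self._params = params  # list of dicts, layout assumed for illustration

    def save_params(self):
        # Deep copy so later in-place updates do not alter the snapshot.
        return copy.deepcopy(self._params)

    def load_params(self, params):
        # Copy again so the caller's snapshot stays reusable.
        self._params = copy.deepcopy(params)


holder = ParamHolder([{"name": "W", "value": [0.1, 0.2]}])
snapshot = holder.save_params()
holder._params[0]["value"][0] = 9.9  # e.g. an update that degraded the model
holder.load_params(snapshot)
print(holder._params)
```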

score(_input, _target)[source]

Gets the score of the model.

Parameters:

_input : numpy.array

Input sample.

_target : numpy.array

Target sample.

Returns:

float

Returns the score of the model.
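The attribute list of Model names accuracy as the default score for classifiers and RMS for regressors. The sketch below shows both metrics on plain Python lists, assuming that RMS stands for the root mean square of the prediction error; accuracy, rms_error and score_sketch are hypothetical helpers, not the library's functions.

```python
import math


def accuracy(predictions, targets):
    """Fraction of predictions equal to their target."""
    hits = sum(1 for p, t in zip(predictions, targets) if p == t)
    return hits / len(targets)


def rms_error(predictions, targets):
    """Root mean square of the prediction error (assumed meaning of RMS)."""
    squared = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


def score_sketch(type_model, predictions, targets):
    # Pick the default metric from the type of model.
    if type_model == "classifier":
        return accuracy(predictions, targets)
    if type_model == "regressor":
        return rms_error(predictions, targets)
    raise ValueError("type_model must be 'classifier' or 'regressor'")


print(score_sketch("classifier", ["a", "b", "a", "a"], ["a", "a", "a", "b"]))
print(score_sketch("regressor", [1.0, 2.0], [1.0, 4.0]))
```

Note that the two defaults point in opposite directions: a higher accuracy is better, while a lower RMS is better.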

set_input_shape(shape)[source]

Set input shape.

Parameters:

shape : tuple

Input shape.

set_name(name)[source]

Sets the name.

Returns:

None

set_output_shape(shape)[source]

Set output shape.

Parameters:

shape : tuple

Output shape.

set_update(fun_update, name, **kwargs)[source]

Sets an update function.

Parameters:

fun_update : theano.function

Function that updates the model’s parameters.

name : str

String that identifies the update function; useful when plotting metrics.

**kwargs

Extra parameters of update function.

translate_target(_target)[source]

Translate target.

Parameters:

_target : numpy.array

Target sample.

Returns:

numpy.array

Returns the ‘_target’ translated according to target labels.
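This reference does not specify what the translation looks like. As a sketch under the assumption that class labels are mapped to one-hot rows ordered like the target labels, and that probability rows are mapped back to the most likely label, the idea could look like this. Both function names are hypothetical, and the argmax step also illustrates why such a translation is non-differentiable.

```python
def translate_target_sketch(target, target_labels):
    """Map each label to a one-hot row (assumed encoding, for illustration)."""
    position = {label: i for i, label in enumerate(target_labels)}
    return [[1 if position[t] == i else 0 for i in range(len(target_labels))]
            for t in target]


def translate_output_sketch(probabilities, target_labels):
    """Map each probability row to the label with the highest probability."""
    return [target_labels[max(range(len(row)), key=row.__getitem__)]
            for row in probabilities]


labels = ["cat", "dog", "bird"]
print(translate_target_sketch(["cat", "dog"], labels))
print(translate_output_sketch([[0.1, 0.7, 0.2]], labels))
```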

Ensemble Model

class deepensemble.models.ensemblemodel.EnsembleModel(name='ensemble')[source]

Base class for ensemble models.

Parameters:

name : str, “ensemble” by default

Ensemble’s name.

Attributes

__combiner (AverageCombiner) Combiner that mixes the models’ outputs.
__list_models_ensemble (list[Model]) List of the ensemble’s models.
__list_cost_ensemble (list[]) List of costs in the ensemble.
_type_training (str) Type of training to perform.
add_cost_ensemble(fun_cost, name, **kwargs)[source]

Adds a cost function to each model in the ensemble.

Parameters:

fun_cost : theano.function

Cost function.

name : str

String that identifies the cost function; useful when plotting metrics.

kwargs

Other parameters.

append_model(new_model)[source]

Adds a model to the ensemble.

Parameters:

new_model : Model

Model.

Raises:

An error is raised if the type of the new model differs from that of the models already in the ensemble.
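The type check performed when a model is added can be sketched as below. TinyModel and TinyEnsemble are hypothetical stand-ins, and the use of ValueError is an assumption, since this reference does not name the exception that append_model raises.

```python
class TinyModel:
    """Hypothetical model carrying only a name and a type."""

    def __init__(self, name, type_model):
        self.name = name
        self.type_model = type_model  # "classifier" or "regressor"


class TinyEnsemble:
    """Hypothetical ensemble that only accepts models of a single type."""

    def __init__(self):
        self._models = []

    def append_model(self, new_model):
        # The first model fixes the type; later models must match it.
        if self._models and new_model.type_model != self._models[0].type_model:
            raise ValueError(  # exception type assumed for illustration
                "cannot mix %s with %s models"
                % (new_model.type_model, self._models[0].type_model))
        self._models.append(new_model)

    def get_num_models(self):
        return len(self._models)


ensemble = TinyEnsemble()
ensemble.append_model(TinyModel("net1", "classifier"))
ensemble.append_model(TinyModel("net2", "classifier"))
print(ensemble.get_num_models())
```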

compile(fast=True, **kwargs)[source]

Prepares training (compiles the Theano functions).

Parameters:

fast : bool

If True, compile only what is necessary.

kwargs

exists_wrapper_models()[source]

Determines whether a wrapper model exists in the ensemble.

Returns:

bool

Returns True if exists wrapper models in Ensemble, False otherwise.

fit(_input, _target, **kwargs)[source]

Trains the ensemble.

Parameters:

_input : numpy.array

Input sample.

_target : numpy.array

Target sample.

kwargs

Returns:

MetricsBase

Returns the metrics obtained during training.

fit_separate_models(_input, _target, **kwargs)[source]

Trains each model of the ensemble separately.

Parameters:

_input : numpy.array

Input sample.

_target : numpy.array

Target sample.

kwargs

Returns:

MetricsBase

Returns the metrics obtained during training.

get_combiner()[source]

Gets the model combiner of the ensemble.

Returns:

ModelCombiner

Returns a models combiner.

get_model_input()[source]

Gets model input.

Returns:

theano.tensor

Returns model input.

get_models()[source]

Gets the list of ensemble models.

Returns:

list

Returns list of models.

get_new_metric()[source]

Gets the metric for this model; this function is required by FactoryMetrics.

Returns:

EnsembleMetrics

Returns ensemble metrics.

See also

FactoryMetrics

get_num_models()[source]

Gets the number of models in the ensemble.

Returns:

int

Returns current number of models in the Ensemble.

is_need_compile_separately()[source]

Determine if it is necessary to compile models separately.

Returns:

bool

Returns True if it is necessary to compile models separately, False otherwise.

output(_input, prob=True)[source]

Output of ensemble model.

Parameters:

_input : theano.tensor.matrix or numpy.array

Input sample.

prob : bool

In the case of classifier if is True the output is probability, for False means the output is translated. Is recommended hold True for training because the translate function is non-differentiable.

Returns:

theano.tensor.matrix or numpy.array

Returns the combined outputs of the ensemble’s models.
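The attribute list names an AverageCombiner as the combiner. A minimal sketch of averaging the outputs of several models, assuming every model returns one row of values per sample, is shown below; average_combiner is a hypothetical helper working on nested lists and is not the library's class.

```python
def average_combiner(model_outputs):
    """Average outputs shaped [model][sample][class] into [sample][class]."""
    n_models = len(model_outputs)
    # zip(*model_outputs) groups the rows of one sample across models;
    # zip(*rows) then groups the values of one class across models.
    return [[sum(values) / n_models for values in zip(*rows)]
            for rows in zip(*model_outputs)]


outputs = [
    [[0.25, 0.75], [1.0, 0.0]],  # model 1: two samples, two classes
    [[0.75, 0.25], [0.5, 0.5]],  # model 2
]
combined = average_combiner(outputs)
print(combined)
```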

predict(_input)[source]

Computes the prediction of the ensemble.

Parameters:

_input : theano.tensor.matrix or numpy.array

Input sample.

Returns:

numpy.array

Returns the prediction of the ensemble.

reset()[source]

Reset parameters of the ensemble’s models.

set_combiner(combiner)[source]

Sets the combiner.

Parameters:

combiner : ModelCombiner

ModelCombiner object for combining the model outputs in the ensemble.

Raises:

ValueError

If the combiner is not of the same type (regressor or classifier).

set_type_training(type_training)[source]

Sets the type of training.

Parameters:

type_training : str

Type of training to perform.

Returns:

None

update_io()[source]

Updates the input and output shared Theano variables.

Sequential Model

class deepensemble.models.sequential.Sequential(name, type_model='regressor', target_labels=None)[source]

This model is a sequence of layers in which all elements are interconnected.

Parameters:

name: str

Name of model.

type_model: str

Type of MLP model: classifier or regressor.

Attributes

__layers (list) List of layers.
add_layer(new_layer)[source]

Adds new layer.

Parameters:

new_layer : Layer

New layer.

get_layers()[source]

Get list of layers.

Returns:

list

Returns a list of layers of this model.

get_new_metric()[source]

Get metric of respective model.

Returns:

BaseMetrics

Returns a metric that will depend on type of model.

output(_input, prob=True)[source]

Output of sequential model.

Parameters:

_input: theano.tensor.matrix or numpy.array

Input sample.

prob : bool

In the case of classifier if is True the output is probability, for False means the output is translated. Is recommended hold True for training because the translate function is non-differentiable.

Returns:

theano.tensor.matrix or numpy.array

Returns the output of the sequential model.
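Conceptually, the output of a sequential model is obtained by passing the input through each layer in the order the layers were added. The sketch below illustrates this with plain Python callables; SequentialSketch, dense and relu are hypothetical stand-ins and do not use the library's Layer class or Theano.

```python
def dense(weights, bias):
    """Return a layer computing weights . x + bias for one sample."""
    def layer(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return layer


def relu(x):
    return [max(0.0, v) for v in x]


class SequentialSketch:
    """Hypothetical sequence of layers applied one after another."""

    def __init__(self):
        self._layers = []

    def add_layer(self, new_layer):
        self._layers.append(new_layer)

    def output(self, _input):
        result = _input
        for layer in self._layers:  # each layer consumes the previous output
            result = layer(result)
        return result


net = SequentialSketch()
net.add_layer(dense([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]))
net.add_layer(relu)
net.add_layer(dense([[2.0, 1.0]], [0.5]))
print(net.output([1.0, 3.0]))
```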

reset()[source]

Resets the parameters.

Wrapper Model

class deepensemble.models.wrapper.Wrapper(model, name, input_shape=None, output_shape=None, type_model='regressor', target_labels=None)[source]

Wrapper Model class.

This model is a wrapper for a model from scikit-learn (sklearn).

Attributes

__model Model from sklearn.
__clf Training model.
compile(fast=True, **kwargs)[source]

Updates internal parameters.

fit(_input, _target, nfolds=20, max_epoch=300, batch_size=32, early_stop=True, valid_size=0.2, no_update_best_parameters=False, improvement_threshold=0.995, minibatch=True, update_sets=True)[source]

Trains the model.

get_new_metric()[source]

Get metrics for respective model.

See also

FactoryMetrics

output(_input, prob=True)[source]

Output model.

predict(_input)[source]

Computes the prediction of the model.

Parameters:

_input : theano.tensor.matrix or numpy.array

Input sample.

Returns:

numpy.array

Returns the prediction of the model.
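The wrapper idea, delegating training and prediction to an estimator that exposes fit and predict in the scikit-learn style, can be sketched without any dependency as follows. MeanRegressor is a toy stand-in for a scikit-learn estimator and WrapperSketch is hypothetical; neither reproduces the Wrapper class documented above.

```python
class MeanRegressor:
    """Toy estimator with a scikit-learn style fit/predict interface."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self  # scikit-learn estimators return self from fit

    def predict(self, X):
        return [self.mean_ for _ in X]


class WrapperSketch:
    """Hypothetical adapter that forwards calls to the wrapped estimator."""

    def __init__(self, model, name, type_model="regressor"):
        self._model = model  # untrained estimator
        self._clf = None     # trained estimator, set by fit
        self.name = name
        self.type_model = type_model

    def fit(self, _input, _target):
        self._clf = self._model.fit(_input, _target)
        return self

    def predict(self, _input):
        if self._clf is None:
            raise RuntimeError("fit must be called before predict")
        return self._clf.predict(_input)


wrapped = WrapperSketch(MeanRegressor(), name="mean baseline")
wrapped.fit([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])
print(wrapped.predict([[5.0], [6.0]]))
```

Keeping the untrained and the trained estimator in separate attributes mirrors the __model and __clf attributes listed for Wrapper, although how the library uses them internally is not documented here.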