bace API

ComplementNB([alpha, weight_normalized]) Complement Naive Bayes classifier
NegationNB([alpha]) Negation Naive Bayes classifier
UniversalSetNB([alpha]) Universal-set Naive Bayes classifier
SelectiveNB([alpha]) Selective Naive Bayes classifier
class bace.ComplementNB(alpha=1.0, weight_normalized=False)[source]

Bases: bace.base.BaseNB

Complement Naive Bayes classifier


Rennie J. D. M., Shih L., Teevan J., Karger D. R. (2003). Tackling the Poor Assumptions of Naive Bayes Text Classifiers

  • alpha (float) – Smoothing parameter
  • weight_normalized (bool, default False) – Enable Weight-normalized Complement Naive Bayes method.
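The weight-normalized variant can be illustrated with a short sketch. It assumes the normalization described by Rennie et al. (2003), where each class's log-weight vector is divided by the sum of its absolute values; the function name and sample weights are hypothetical, and bace's internal implementation may differ.

```python
import math


def normalize_weights(weights):
    """Divide each class's weight vector by the sum of its absolute values."""
    normalized = []
    for row in weights:
        scale = sum(abs(w) for w in row)
        normalized.append([w / scale for w in row])
    return normalized


# Hypothetical log-weights for 2 classes over 3 features.
weights = [
    [math.log(0.5), math.log(0.25), math.log(0.25)],
    [math.log(0.1), math.log(0.1), math.log(0.8)],
]
normalized = normalize_weights(weights)
```

After normalization every class's weights have the same total magnitude, so no single class dominates the decision merely because its weights are larger overall.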

Sum of alpha params


Classes list

Type:array, shape (n_classes,)

Number of training samples observed in each class.

Type:array, shape (n_classes,)


>>> from sklearn.datasets import fetch_20newsgroups
>>> from sklearn.feature_extraction.text import CountVectorizer
>>> from bace import ComplementNB
>>> # Prepare data
>>> vectorizer = CountVectorizer()
>>> categories = ['alt.atheism', 'talk.religion.misc']
>>> # Train set
>>> newsgroups_train = fetch_20newsgroups(subset='train', categories=categories, shuffle=True)
>>> train_vectors = vectorizer.fit_transform(newsgroups_train.data)
>>> # Test set
>>> newsgroups_test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True)
>>> test_vectors = vectorizer.transform(newsgroups_test.data)
>>> # Fit and score; argument order follows the documented fit(X, y) and accuracy_score(X, y)
>>> clf = ComplementNB()
>>> clf.fit(train_vectors, newsgroups_train.target).accuracy_score(test_vectors, newsgroups_test.target)
accuracy_score(X, y)

Return accuracy score

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Test vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.

accuracy_score – Accuracy on the given test set

Return type:
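As a rough illustration of what an accuracy score measures, the sketch below computes the fraction of predictions that match the true labels. The helper name accuracy is hypothetical and works on plain label lists; it is not bace's method, which takes feature vectors X and targets y.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Three of the four predictions match the true labels.
score = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```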



Log probability of class occurrence


Complement class count, i.e. the number of occurrences over all samples from every class except the given class c


Complement class probability, i.e. the log-probability of occurrence for a sample that does not belong to the given class c
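The two complement quantities above can be sketched as follows. This is a minimal illustration assuming the smoothed estimate from Rennie et al. (2003), with per-feature counts and a single smoothing value alpha applied to every feature; the function name and toy data are hypothetical, and bace may compute these differently.

```python
import math


def complement_log_prob(X, y, classes, alpha=1.0):
    """Return {class: per-feature log-probabilities estimated from all OTHER classes}."""
    n_features = len(X[0])
    result = {}
    for c in classes:
        # Complement count: feature counts summed over samples not labelled c.
        counts = [0.0] * n_features
        for row, label in zip(X, y):
            if label != c:
                for i, value in enumerate(row):
                    counts[i] += value
        # Smoothed normalizer: total complement count plus alpha per feature.
        total = sum(counts) + alpha * n_features
        result[c] = [math.log((count + alpha) / total) for count in counts]
    return result


X = [[2, 0, 1], [0, 3, 0], [1, 1, 4]]
y = ["a", "b", "a"]
log_prob = complement_log_prob(X, y, classes=["a", "b"])
```

For class "a" the only complement sample is the second row, so the smoothed estimates are 1/6, 4/6 and 1/6 before taking logs.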

fit(X, y)

Fit model to given training set

  • X (array-like, shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape (n_samples,)) – Target values.

self – Returns self.

Return type:

Naive Bayes estimator object


Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
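The effect of the deep flag can be sketched with toy classes. The example assumes the scikit-learn convention of reading parameter names from the constructor signature and prefixing nested parameters with the component name; ParamsMixin, ToyNB and ToyWrapper are hypothetical names used only for illustration.

```python
import inspect


class ParamsMixin:
    def get_params(self, deep=True):
        """Collect constructor parameters, optionally expanding nested estimators."""
        init_params = inspect.signature(type(self).__init__).parameters
        params = {}
        for name in init_params:
            if name == "self":
                continue
            value = getattr(self, name)
            params[name] = value
            # With deep=True, also expose the parameters of nested estimators.
            if deep and hasattr(value, "get_params"):
                for sub_name, sub_value in value.get_params(deep=True).items():
                    params[f"{name}__{sub_name}"] = sub_value
        return params


class ToyNB(ParamsMixin):
    def __init__(self, alpha=1.0):
        self.alpha = alpha


class ToyWrapper(ParamsMixin):
    def __init__(self, estimator, verbose=False):
        self.estimator = estimator
        self.verbose = verbose


inner = ToyNB(alpha=0.5)
shallow = ToyWrapper(inner).get_params(deep=False)
deep = ToyWrapper(inner).get_params(deep=True)
```

Here shallow contains only estimator and verbose, while deep additionally contains estimator__alpha.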
partial_fit(X, y, classes=None)

Incremental fit on a batch of samples.

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.
  • classes (array-like, shape = [n_classes], optional (default=None)) – List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit, can be omitted in subsequent calls.

self – Returns self.

Return type:
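The requirement that classes be supplied on the first call can be illustrated with a toy model that only accumulates per-class sample counts across batches. CountingModel and its class_count_ attribute are hypothetical; the sketch shows the calling pattern, not bace's implementation.

```python
class CountingModel:
    """Toy model that accumulates per-class sample counts across batches."""

    def __init__(self):
        self.classes_ = None
        self.class_count_ = None

    def partial_fit(self, X, y, classes=None):
        if self.classes_ is None:
            # First call: the full label set must be known up front,
            # because a batch need not contain every class.
            if classes is None:
                raise ValueError("classes must be passed on the first call to partial_fit")
            self.classes_ = sorted(classes)
            self.class_count_ = {c: 0 for c in self.classes_}
        for label in y:
            self.class_count_[label] += 1
        return self


model = CountingModel()
model.partial_fit([[1, 0]], ["spam"], classes=["ham", "spam"])
model.partial_fit([[0, 2], [3, 1]], ["ham", "spam"])  # classes omitted on later calls
```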



Perform classification on an array of test vectors X.

Parameters:X (array-like, shape = [n_samples, n_features]) – Unseen samples vector
Returns:C – Predicted target values for X
Return type:array, shape = [n_samples]

Return log-probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –
Returns:C – Returns the log-probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Return probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –

Returns:C – Returns the probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]
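The relationship between log-probability and probability estimates can be sketched as below. It assumes per-class log scores where a larger value means a more likely class, and normalizes them with the usual max-shift trick for numerical stability; the scores are invented for illustration.

```python
import math


def log_proba_to_proba(log_scores):
    """Convert one row of per-class log scores into probabilities that sum to 1."""
    peak = max(log_scores)
    # Subtracting the maximum avoids underflow when the scores are very negative.
    shifted = [math.exp(value - peak) for value in log_scores]
    total = sum(shifted)
    return [value / total for value in shifted]


classes = ["alt.atheism", "talk.religion.misc"]  # sorted, as in classes_
log_scores = [-1050.2, -1048.9]  # hypothetical scores for one sample
proba = log_proba_to_proba(log_scores)
predicted = classes[proba.index(max(proba))]
```

Exponentiating the raw scores directly would underflow to zero for both classes, which is why the shift by the maximum matters.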

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:self
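The <component>__<parameter> naming can be illustrated by splitting parameter names at the first double underscore. This is a simplified sketch of the routing idea with a hypothetical helper name, not the library's code.

```python
def split_nested_params(params):
    """Separate an estimator's own parameters from those addressed to components."""
    own, nested = {}, {}
    for key, value in params.items():
        name, delim, sub_key = key.partition("__")
        if delim:
            # "clf__alpha" is routed to component "clf" as parameter "alpha".
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"verbose": True, "clf__alpha": 0.5, "clf__weight_normalized": True}
)
```

Because only the first double underscore is split off, a deeper name such as a__b__c is handed to component a as b__c, which lets each level of nesting resolve its own part.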
class bace.NegationNB(alpha=1.0)[source]

Bases: bace.base.BaseNB

Negation Naive Bayes classifier

Parameters:alpha (float) – Smoothing parameter


Komiya K., Sato N., Fujimoto K., Kotani Y. (2011). Negation Naive Bayes for Categorization of Product Pages on the Web

accuracy_score(X, y)

Return accuracy score

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Test vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.

accuracy_score – Accuracy on the given test set

Return type:



Log probability of class occurrence


Complement class count, i.e. the number of occurrences over all samples from every class except the given class c


Complement class probability, i.e. the log-probability of occurrence for a sample that does not belong to the given class c

fit(X, y)

Fit model to given training set

  • X (array-like, shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape (n_samples,)) – Target values.

self – Returns self.

Return type:

Naive Bayes estimator object


Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
partial_fit(X, y, classes=None)

Incremental fit on a batch of samples.

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.
  • classes (array-like, shape = [n_classes], optional (default=None)) – List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit, can be omitted in subsequent calls.

self – Returns self.

Return type:



Perform classification on an array of test vectors X.

Parameters:X (array-like, shape = [n_samples, n_features]) – Unseen samples vector
Returns:C – Predicted target values for X
Return type:array, shape = [n_samples]

Return log-probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –
Returns:C – Returns the log-probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Return probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –

Returns:C – Returns the probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:self
class bace.UniversalSetNB(alpha=1.0)[source]

Bases: bace.base.BaseNB

Universal-set Naive Bayes classifier

Parameters:alpha (float) – Smoothing parameter


Komiya K., Ito Y., Kotani Y. (2013). New Naive Bayes Methods using Data from All Classes

accuracy_score(X, y)

Return accuracy score

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Test vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.

accuracy_score – Accuracy on the given test set

Return type:



Log probability of class occurrence


Complement class count, i.e. the number of occurrences over all samples from every class except the given class c


Complement class probability, i.e. the log-probability of occurrence for a sample that does not belong to the given class c

fit(X, y)[source]

Fit model to given training set

  • X (array-like, shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape (n_samples,)) – Target values.

self – Returns self.

Return type:

Naive Bayes estimator object


Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
partial_fit(X, y, classes=None)[source]

Incremental fit on a batch of samples.

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.
  • classes (array-like, shape = [n_classes], optional (default=None)) – List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit, can be omitted in subsequent calls.

self – Returns self.

Return type:



Perform classification on an array of test vectors X.

Parameters:X (array-like, shape = [n_samples, n_features]) – Unseen samples vector
Returns:C – Predicted target values for X
Return type:array, shape = [n_samples]

Return log-probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –
Returns:C – Returns the log-probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Return probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –

Returns:C – Returns the probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:self
class bace.SelectiveNB(alpha=1.0)[source]

Bases: bace.base.BaseNB

Selective Naive Bayes classifier

Parameters:alpha (float) – Smoothing parameter


Komiya K., Ito Y., Kotani Y. (2013). New Naive Bayes Methods using Data from All Classes

accuracy_score(X, y)

Return accuracy score

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Test vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.

accuracy_score – Accuracy on the given test set

Return type:



Log probability of class occurrence


Complement class count, i.e. the number of occurrences over all samples from every class except the given class c


Complement class probability, i.e. the log-probability of occurrence for a sample that does not belong to the given class c

fit(X, y)

Fit model to given training set

  • X (array-like, shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape (n_samples,)) – Target values.

self – Returns self.

Return type:

Naive Bayes estimator object


Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
partial_fit(X, y, classes=None)

Incremental fit on a batch of samples.

  • X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like, shape = [n_samples]) – Target values.
  • classes (array-like, shape = [n_classes], optional (default=None)) – List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit, can be omitted in subsequent calls.

self – Returns self.

Return type:



Perform classification on an array of test vectors X.

Parameters:X (array-like, shape = [n_samples, n_features]) – Unseen samples vector
Returns:C – Predicted target values for X
Return type:array, shape = [n_samples]

Return log-probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –
Returns:C – Returns the log-probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Return probability estimates for the test vector X.

Parameters:X (array-like, shape = [n_samples, n_features]) –

Returns:C – Returns the probability of the samples for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.
Return type:array-like, shape = [n_samples, n_classes]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:self
class bace.Benchmark(classifiers, verbose=False)[source]

Bases: sklearn.base.BaseEstimator

Benchmark for scikit-learn-like classifiers

  • classifiers (list of sklearn.base.BaseEstimator) – List of sklearn classifiers
  • verbose (bool) – Print training details
compare(X, y, metrics={'Accuracy': <function accuracy_score>})[source]

Compare predictions of multiple classifiers

  • X (numpy.ndarray) – Features
  • y (numpy.ndarray) – Targets
  • metrics (dict of callable) – Mapping of metric names to metric functions
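The idea of scoring several fitted classifiers against a dictionary of metric functions can be sketched as follows. The toy classifiers, the standalone compare function and its return shape are all hypothetical; how Benchmark.compare reports its results is not specified here.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


class MajorityClassifier:
    """Toy classifier that always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class FirstFeatureSign:
    """Toy classifier: predicts 1 when the first feature is positive, else 0."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [1 if row[0] > 0 else 0 for row in X]


def compare(classifiers, X, y, metrics):
    """Return {classifier name: {metric name: score}} for already fitted classifiers."""
    results = {}
    for clf in classifiers:
        y_pred = clf.predict(X)
        results[type(clf).__name__] = {
            name: metric(y, y_pred) for name, metric in metrics.items()
        }
    return results


X = [[1.0], [-2.0], [3.0], [-0.5], [2.0]]
y = [1, 0, 1, 0, 1]
classifiers = [MajorityClassifier().fit(X, y), FirstFeatureSign().fit(X, y)]
results = compare(classifiers, X, y, metrics={"Accuracy": accuracy})
```

Keying the metrics by name keeps the results readable when more than one metric is passed.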
fit(X, y)[source]

Fit several classifiers

  • X (numpy.ndarray) –
  • y (numpy.ndarray) – Labels

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:self
class bace.BenchmarkNaiveBayes[source]

Bases: bace.benchmark.Benchmark

CLASSIFIERS = [
    MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True),
    ComplementNB(alpha=1.0, weight_normalized=True),
    ComplementNB(alpha=1.0, weight_normalized=False),
    ComplementNB(alpha=1.0, class_prior=None, fit_prior=True, norm=False),
    ComplementNB(alpha=1.0, class_prior=None, fit_prior=True, norm=True),
    NegationNB(alpha=1.0),
    SelectiveNB(alpha=1.0),
    UniversalSetNB(alpha=1.0),
]
compare(X, y, metrics={'Accuracy': <function accuracy_score>})

Compare predictions of multiple classifiers

  • X (numpy.ndarray) – Features
  • y (numpy.ndarray) – Targets
  • metrics (dict of callable) – Mapping of metric names to metric functions
fit(X, y)

Fit several classifiers

  • X (numpy.ndarray) –
  • y (numpy.ndarray) – Labels

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:self