Class NaiveBayesMultinomial

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
    Direct Known Subclasses:
    NaiveBayesMultinomialUpdateable

    public class NaiveBayesMultinomial
    extends Classifier
    implements WeightedInstancesHandler, TechnicalInformationHandler
    Class for building and using a multinomial Naive Bayes classifier. For more information, see:

    Andrew McCallum, Kamal Nigam: A Comparison of Event Models for Naive Bayes Text Classification. In: AAAI-98 Workshop on 'Learning for Text Categorization', 1998.

    The core equation for this classifier:

    P[Ci|D] = (P[D|Ci] x P[Ci]) / P[D] (Bayes rule)

    where Ci is class i and D is a document.
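
    As a rough illustration of the equation above, the following sketch computes P[Ci|D] for a bag-of-words document under a multinomial event model. It is not Weka's implementation: the class name MultinomialBayesSketch, the method posterior, and its array-based parameters are hypothetical, and the word probabilities are assumed to be already smoothed (strictly positive). The work is done in log space, and P[D] is obtained by summing P[D|Ci] x P[Ci] over all classes.

```java
// Illustrative sketch only; names and signatures are assumptions, not Weka API.
final class MultinomialBayesSketch {

    /**
     * Returns P[Ci|D] for every class i.
     *
     * @param priors     priors[i] = P[Ci]
     * @param wordProbs  wordProbs[i][w] = P[word w | Ci], assumed > 0
     * @param wordCounts wordCounts[w] = occurrences of word w in document D
     */
    static double[] posterior(double[] priors, double[][] wordProbs, int[] wordCounts) {
        int numClasses = priors.length;
        double[] logScores = new double[numClasses];
        double max = Double.NEGATIVE_INFINITY;

        for (int i = 0; i < numClasses; i++) {
            // ln(P[Ci]) + sum over words of count * ln(P[w|Ci]).
            // The multinomial coefficient is identical for every class,
            // so it cancels in the final normalization and is omitted.
            double score = Math.log(priors[i]);
            for (int w = 0; w < wordCounts.length; w++) {
                if (wordCounts[w] == 0) {
                    continue; // absent words contribute nothing
                }
                score += wordCounts[w] * Math.log(wordProbs[i][w]);
            }
            logScores[i] = score;
            max = Math.max(max, score);
        }

        // Divide by P[D] = sum_i P[D|Ci] * P[Ci]; subtracting the maximum
        // log score first avoids underflow for long documents.
        double[] result = new double[numClasses];
        double sum = 0.0;
        for (int i = 0; i < numClasses; i++) {
            result[i] = Math.exp(logScores[i] - max);
            sum += result[i];
        }
        for (int i = 0; i < numClasses; i++) {
            result[i] /= sum;
        }
        return result;
    }
}
```

    For example, with priors {0.5, 0.5}, word probabilities {{0.8, 0.2}, {0.2, 0.8}} and word counts {2, 0}, the unnormalized scores are 0.32 and 0.02, so the posterior of the first class is 0.32 / 0.34 = 16/17.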

    BibTeX:

     @inproceedings{Mccallum1998,
        author = {Andrew Mccallum and Kamal Nigam},
        booktitle = {AAAI-98 Workshop on 'Learning for Text Categorization'},
        title = {A Comparison of Event Models for Naive Bayes Text Classification},
        year = {1998}
     }
     

    Valid options are:

     -D
      If set, classifier is run in debug mode and
      may output additional info to the console
    Version:
    $Revision: 11303 $
    Author:
    Andrew Golightly (acg4@cs.waikato.ac.nz), Bernhard Pfahringer (bernhard@cs.waikato.ac.nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • NaiveBayesMultinomial

        public NaiveBayesMultinomial()
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this classifier.
        Returns:
        a description of the classifier suitable for displaying in the Explorer/Experimenter GUI
      • getTechnicalInformation

        public TechnicalInformation getTechnicalInformation()
        Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
        Specified by:
        getTechnicalInformation in interface TechnicalInformationHandler
        Returns:
        the technical information about this class
      • buildClassifier

        public void buildClassifier(Instances instances)
                             throws java.lang.Exception
        Generates the classifier.
        Specified by:
        buildClassifier in class Classifier
        Parameters:
        instances - set of instances serving as training data
        Throws:
        java.lang.Exception - if the classifier has not been generated successfully
      • distributionForInstance

        public double[] distributionForInstance(Instance instance)
                                         throws java.lang.Exception
        Calculates the class membership probabilities for the given test instance.
        Overrides:
        distributionForInstance in class Classifier
        Parameters:
        instance - the instance to be classified
        Returns:
        predicted class probability distribution
        Throws:
        java.lang.Exception - if there is a problem generating the prediction
      • lnFactorial

        public double lnFactorial(int n)
        Fast computation of ln(n!) for non-negative ints. Negative ints are passed on to the general gamma-function-based version in weka.core.SpecialFunctions. If the current n value is higher than any previous one, the cache is extended and filled to cover it. The common case is reduced to a simple array lookup.
        Parameters:
        n - the integer
        Returns:
        ln(n!)
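
        The caching strategy described above could look roughly like the following sketch. It is an illustration, not the Weka source: the class name LnFactorialCacheSketch and its field are hypothetical, and, to stay free of any Weka dependency, negative arguments are rejected here instead of being passed on to weka.core.SpecialFunctions as the documented method does.

```java
import java.util.Arrays;

// Illustrative sketch only; not the Weka implementation.
final class LnFactorialCacheSketch {

    // cache[k] holds ln(k!); it starts with ln(0!) = 0.
    private double[] cache = new double[] {0.0};

    double lnFactorial(int n) {
        if (n < 0) {
            // Assumption made for this sketch: the documented method instead
            // delegates negative values to weka.core.SpecialFunctions.
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        if (n >= cache.length) {
            // Extend the cache up to n, reusing ln(k!) = ln((k-1)!) + ln(k).
            double[] extended = Arrays.copyOf(cache, n + 1);
            for (int i = cache.length; i < extended.length; i++) {
                extended[i] = extended[i - 1] + Math.log(i);
            }
            cache = extended;
        }
        // Common case: a plain array lookup.
        return cache[n];
    }
}
```

        After a call with a large n, every later call with a smaller or equal argument is answered directly from the array without recomputation.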
      • toString

        public java.lang.String toString()
        Returns a string representation of the classifier.
        Overrides:
        toString in class java.lang.Object
        Returns:
        a string representation of the classifier
      • main

        public static void main(java.lang.String[] argv)
        Main method for testing this class.
        Parameters:
        argv - the options