Class RDG1

  • All Implemented Interfaces:
    java.io.Serializable, OptionHandler, Randomizable, RevisionHandler

    public class RDG1
    extends ClassificationGenerator
    A data generator that produces data randomly by producing a decision list.
    The decision list consists of rules.
    Instances are generated randomly one by one. If the decision list fails to classify the current instance, a new rule based on this instance is generated and added to the decision list.
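
    The loop described above can be sketched in plain Java. This is a minimal illustration restricted to boolean attributes, not the Weka implementation: the names SketchRule and DecisionListSketch, the way tested attributes are chosen, and the random choice of the class label are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; names and rule-construction details are assumptions, not Weka code.
class SketchRule {
    final int[] attrs;       // indices of the tested boolean attributes
    final boolean[] wanted;  // required value per tested attribute ('A' or 'NOT A')
    final int label;         // class index assigned when all tests hold

    SketchRule(int[] attrs, boolean[] wanted, int label) {
        this.attrs = attrs;
        this.wanted = wanted;
        this.label = label;
    }

    // A rule covers an instance when every one of its tests holds.
    boolean covers(boolean[] x) {
        for (int i = 0; i < attrs.length; i++) {
            if (x[attrs[i]] != wanted[i]) {
                return false;
            }
        }
        return true;
    }
}

class DecisionListSketch {
    final List<SketchRule> rules = new ArrayList<>();

    // The first covering rule decides the class; -1 means no rule covers x.
    int classify(boolean[] x) {
        for (SketchRule r : rules) {
            if (r.covers(x)) {
                return r.label;
            }
        }
        return -1;
    }

    // Fills x with random values and returns its class, adding a rule if needed.
    // Assumes 1 <= minSize <= maxSize.
    int generate(boolean[] x, Random rnd, int numClasses, int minSize, int maxSize) {
        for (int i = 0; i < x.length; i++) {
            x[i] = rnd.nextBoolean();
        }
        int label = classify(x);
        if (label < 0) {
            // No rule fires: build a new rule from this instance's own values.
            int size = Math.min(minSize + rnd.nextInt(maxSize - minSize + 1), x.length);
            int[] idx = new int[x.length];
            for (int i = 0; i < idx.length; i++) {
                idx[i] = i;
            }
            int[] attrs = new int[size];
            boolean[] wanted = new boolean[size];
            for (int i = 0; i < size; i++) {
                // Partial Fisher-Yates shuffle picks distinct attributes.
                int j = i + rnd.nextInt(idx.length - i);
                int tmp = idx[i];
                idx[i] = idx[j];
                idx[j] = tmp;
                attrs[i] = idx[i];
                wanted[i] = x[idx[i]];
            }
            label = rnd.nextInt(numClasses);
            rules.add(new SketchRule(attrs, wanted, label));
        }
        return label;
    }
}
```

    Because rules are only ever appended in this sketch, reclassifying an earlier instance with the final list yields the same label it received when it was generated.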

    The option -V switches on voting, which means that at the end of the generation all instances are reclassified to the class value that is supported by the most rules.
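
    The difference between first-match classification and voting can be illustrated with a small sketch. The types VotingRule and VotingSketch are hypothetical, and breaking ties in favour of the lowest class index is an assumption; the description above does not say how RDG1 resolves ties.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical types for illustration; not part of the Weka API.
class VotingRule {
    final int label;                   // class index the rule votes for
    final Predicate<boolean[]> tests;  // conjunction of the rule's tests

    VotingRule(int label, Predicate<boolean[]> tests) {
        this.label = label;
        this.tests = tests;
    }
}

class VotingSketch {
    // Returns the class supported by the most covering rules, or -1 if no rule covers x.
    // Ties go to the lowest class index (an assumption of this sketch).
    static int vote(List<VotingRule> rules, boolean[] x, int numClasses) {
        int[] votes = new int[numClasses];
        for (VotingRule r : rules) {
            if (r.tests.test(x)) {
                votes[r.label]++;
            }
        }
        int best = -1;
        for (int c = 0; c < numClasses; c++) {
            if (votes[c] > 0 && (best < 0 || votes[c] > votes[best])) {
                best = c;
            }
        }
        return best;
    }
}
```

    With one covering rule for class 0 listed first and two covering rules for class 1 after it, a first-match decision list would assign class 0, whereas vote returns class 1.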

    This data generator can generate 'boolean' attributes (= nominal with the values {true, false}) and numeric attributes. The rules can be 'A' or 'NOT A' for boolean values and 'B < random_value' or 'B >= random_value' for numeric values.

    Valid options are:

     -h
      Prints this help.
     -o <file>
      The name of the output file, otherwise the generated data is
      printed to stdout.
     -r <name>
      The name of the relation.
     -d
      Whether to print debug information.
     -S <num>
      The seed for the random function (default 1)
     -n <num>
      The number of examples to generate (default 100)
     -a <num>
      The number of attributes (default 10).
     -c <num>
      The number of classes (default 2)
     -R <num>
      maximum size for rules (default 10) 
     -M <num>
      minimum size for rules (default 1) 
     -I <num>
      number of irrelevant attributes (default 0)
     -N <num>
      number of numeric attributes (default 0)
     -V
      switch on voting (default is no voting)
    The following is an example of a generated dataset:
     %
     % weka.datagenerators.RDG1 -r expl -a 2 -c 3 -n 4 -N 1 -I 0 -M 2 -R 10 -S 2
     %
     @relation expl
    
     @attribute a0 {false,true}
     @attribute a1 numeric
     @attribute class {c0,c1,c2}
    
     @data
    
     true,0.496823,c0
     false,0.743158,c1
     false,0.408285,c1
     false,0.993687,c2
     %
     % Number of attributes chosen as irrelevant = 0
     %
     % DECISIONLIST (number of rules = 3):
     % RULE 0:   c0 := a1 < 0.986, a0
     % RULE 1:   c1 := a1 < 0.95, not(a0)
     % RULE 2:   c2 := not(a0), a1 >= 0.562
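
    The listing above can be checked by hand: applying the three rules in order, with the first covering rule deciding the class, reproduces the class column of the four data rows. A small sketch of that check follows; ExampleRule and ExampleDecisionList are illustrative names, not Weka classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Illustrative types only; these are not Weka classes.
class ExampleRule {
    final String label;                        // class value assigned by the rule
    final BiPredicate<Boolean, Double> tests;  // conjunction of tests on (a0, a1)

    ExampleRule(String label, BiPredicate<Boolean, Double> tests) {
        this.label = label;
        this.tests = tests;
    }
}

class ExampleDecisionList {
    // The three rules printed in the comment footer of the example dataset.
    static final List<ExampleRule> RULES = Arrays.asList(
        new ExampleRule("c0", (a0, a1) -> a1 < 0.986 && a0),
        new ExampleRule("c1", (a0, a1) -> a1 < 0.95 && !a0),
        new ExampleRule("c2", (a0, a1) -> !a0 && a1 >= 0.562));

    // Returns the label of the first rule whose tests all hold, or null if none fires.
    static String classify(boolean a0, double a1) {
        for (ExampleRule rule : RULES) {
            if (rule.tests.test(a0, a1)) {
                return rule.label;
            }
        }
        return null;
    }
}
```

    For instance, classify(false, 0.993687) skips RULE 0 and RULE 1 and returns c2 through RULE 2, matching the last data row.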
     
    Version:
    $Revision: 5674 $
    Author:
    Gabi Schmidberger (gabi@cs.waikato.ac.nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • RDG1

        public RDG1()
        Initializes the generator with default values.
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this data generator.
        Returns:
        a description of the data generator suitable for displaying in the explorer/experimenter gui
      • setOptions

        public void setOptions​(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a list of options for this object.

        Valid options are:

         -h
          Prints this help.
         -o <file>
          The name of the output file, otherwise the generated data is
          printed to stdout.
         -r <name>
          The name of the relation.
         -d
          Whether to print debug information.
         -S <num>
          The seed for the random function (default 1)
         -n <num>
          The number of examples to generate (default 100)
         -a <num>
          The number of attributes (default 10).
         -c <num>
          The number of classes (default 2)
         -R <num>
          maximum size for rules (default 10) 
         -M <num>
          minimum size for rules (default 1) 
         -I <num>
          number of irrelevant attributes (default 0)
         -N <num>
          number of numeric attributes (default 0)
         -V
          switch on voting (default is no voting)
        Specified by:
        setOptions in interface OptionHandler
        Overrides:
        setOptions in class ClassificationGenerator
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
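
        As a sketch of the array shape setOptions expects, each flag and its value occupy separate elements. The helper below is a stand-in written for this example (OptionArraySketch, toOptions, valueOf and hasFlag are hypothetical names); Weka provides its own option utilities, and this is not their implementation.

```java
import java.util.Arrays;

// Stand-in helper for illustration; Weka ships its own option utilities.
class OptionArraySketch {
    // Command line taken from the example dataset header, without the class name.
    static final String EXAMPLE = "-r expl -a 2 -c 3 -n 4 -N 1 -I 0 -M 2 -R 10 -S 2";

    // Splits on whitespace; adequate here because no value contains spaces or quotes.
    static String[] toOptions(String commandLine) {
        return commandLine.trim().split("\\s+");
    }

    // Returns the element following the flag, or the fallback if the flag is absent.
    static String valueOf(String[] options, String flag, String fallback) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return fallback;
    }

    // True if a value-less switch such as -V is present.
    static boolean hasFlag(String[] options, String flag) {
        return Arrays.asList(options).contains(flag);
    }
}
```

        The resulting array has 18 elements for the example command line and has the form that setOptions takes as its argument.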
      • setNumAttributes

        public void setNumAttributes​(int numAttributes)
        Sets the number of attributes the dataset should have.
        Parameters:
        numAttributes - the new number of attributes
      • getNumAttributes

        public int getNumAttributes()
        Gets the number of attributes that should be produced.
        Returns:
        the number of attributes that should be produced
      • numAttributesTipText

        public java.lang.String numAttributesTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setNumClasses

        public void setNumClasses​(int numClasses)
        Sets the number of classes the dataset should have.
        Parameters:
        numClasses - the new number of classes
      • getNumClasses

        public int getNumClasses()
        Gets the number of classes the dataset should have.
        Returns:
        the number of classes the dataset should have
      • numClassesTipText

        public java.lang.String numClassesTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getMaxRuleSize

        public int getMaxRuleSize()
        Gets the maximum number of tests in rules.
        Returns:
        the maximum number of tests allowed in rules
      • setMaxRuleSize

        public void setMaxRuleSize​(int newMaxRuleSize)
        Sets the maximum number of tests in rules.
        Parameters:
        newMaxRuleSize - new maximum number of tests allowed in rules.
      • maxRuleSizeTipText

        public java.lang.String maxRuleSizeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getMinRuleSize

        public int getMinRuleSize()
        Gets the minimum number of tests in rules.
        Returns:
        the minimum number of tests allowed in rules
      • setMinRuleSize

        public void setMinRuleSize​(int newMinRuleSize)
        Sets the minimum number of tests in rules.
        Parameters:
        newMinRuleSize - new minimum number of tests allowed in rules.
      • minRuleSizeTipText

        public java.lang.String minRuleSizeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getNumIrrelevant

        public int getNumIrrelevant()
        Gets the number of irrelevant attributes.
        Returns:
        the number of irrelevant attributes
      • setNumIrrelevant

        public void setNumIrrelevant​(int newNumIrrelevant)
        Sets the number of irrelevant attributes.
        Parameters:
        newNumIrrelevant - the number of irrelevant attributes.
      • numIrrelevantTipText

        public java.lang.String numIrrelevantTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getNumNumeric

        public int getNumNumeric()
        Gets the number of numerical attributes.
        Returns:
        the number of numerical attributes.
      • setNumNumeric

        public void setNumNumeric​(int newNumNumeric)
        Sets the number of numerical attributes.
        Parameters:
        newNumNumeric - the number of numerical attributes.
      • numNumericTipText

        public java.lang.String numNumericTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getVoteFlag

        public boolean getVoteFlag()
        Gets the vote flag.
        Returns:
        voting flag.
      • setVoteFlag

        public void setVoteFlag​(boolean newVoteFlag)
        Sets the vote flag.
        Parameters:
        newVoteFlag - boolean with the new setting of the vote flag.
      • voteFlagTipText

        public java.lang.String voteFlagTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getSingleModeFlag

        public boolean getSingleModeFlag()
        Gets the single mode flag.
        Specified by:
        getSingleModeFlag in class DataGenerator
        Returns:
        true if the method generateExample can be used.
      • getAttList_Irr

        public boolean[] getAttList_Irr()
        Gets the array that defines which of the attributes are seen to be irrelevant.
        Returns:
        the array that defines the irrelevant attributes
      • setAttList_Irr

        public void setAttList_Irr​(boolean[] newAttList_Irr)
        Sets the array that defines which of the attributes are seen to be irrelevant.
        Parameters:
        newAttList_Irr - array that defines the irrelevant attributes.
      • attList_IrrTipText

        public java.lang.String attList_IrrTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • generateExample

        public Instance generateExample()
                                 throws java.lang.Exception
        Generates an example of the dataset.
        Specified by:
        generateExample in class DataGenerator
        Returns:
        the instance generated
        Throws:
        java.lang.Exception - if the format is not defined, or if examples cannot be
        generated one by one because voting is chosen
      • generateExamples

        public Instances generateExamples()
                                   throws java.lang.Exception
        Generates all examples of the dataset.
        Specified by:
        generateExamples in class DataGenerator
        Returns:
        the instances generated
        Throws:
        java.lang.Exception - if format not defined or generating
        examples one by one is not possible, because voting is chosen
      • generateExamples

        public Instances generateExamples​(int num,
                                          java.util.Random random,
                                          Instances format)
                                   throws java.lang.Exception
        Generates all examples of the dataset.
        Parameters:
        num - the number of examples to generate
        random - the random number generator to use
        format - the dataset format
        Returns:
        the instances generated
        Throws:
        java.lang.Exception - if format not defined or generating
        examples one by one is not possible, because voting is chosen
      • generateStart

        public java.lang.String generateStart()
        Generates a comment string that documents the data generator. By default, this string is added at the beginning of the produced output (in ARFF format), directly after the options.
        Specified by:
        generateStart in class DataGenerator
        Returns:
        a string containing information about the generated rules
      • generateFinished

        public java.lang.String generateFinished()
                                          throws java.lang.Exception
        Compiles documentation about the data generation: the number of irrelevant attributes and the decision list with all rules. Because the decision list may be extended until the last instance is generated, this method should be called at the end of the data generation process.
        Specified by:
        generateFinished in class DataGenerator
        Returns:
        string with additional information about generated dataset
        Throws:
        java.lang.Exception - if no input structure has been defined
      • getRevision

        public java.lang.String getRevision()
        Returns the revision string.
        Returns:
        the revision
      • main

        public static void main​(java.lang.String[] args)
        Main method for testing this class.
        Parameters:
        args - should contain arguments for the data producer: