Animal, Vegetable, or Mineral?

Enter a word and I'll try to guess whether it's an animal, a vegetable, or a mineral.

What is this?

This is a very simple demonstration of the power of machine learning. It uses a naive Bayesian classifier, a basic machine learning algorithm, to determine whether the word that was entered is an animal, a vegetable, or a mineral.

Why are you showing me this?

One of the first things I do with AI/ML design is reduce the barrier to entry. With so many of the small-to-medium sized businesses that I've talked to, the question about AI and machine learning is where to start.

I use this example to demonstrate machine learning because it's so simple. In less than 100 lines of code, we have a fully working, fully trainable machine learning based classifier! All of the machine learning code that I create is tiny, free, and open source.

How it Works

When you enter a word in the search box, the classifier checks three files: animals, vegetables, and minerals. The file that has the most entries for that word is the category that is returned. These files contain a list of words that have been entered on this site as well as several libraries of plants and animals that I found on the internet, but the classifier still has a lot to learn!
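
The "most entries wins" idea can be sketched in a few lines. This is a simplified illustration, not the demo's actual code: the `lists` object, `countEntries`, and `guessCategory` are hypothetical names, and in-memory arrays stand in for the three files. The real classifier computes probabilities rather than raw counts.

```javascript
// Hypothetical stand-ins for the contents of the three word files
var lists = {
    animals: ['dog', 'cat', 'dog', 'horse'],
    vegetables: ['carrot', 'kale'],
    minerals: ['quartz', 'dog']
}

// Count how many times a word appears in one list
function countEntries(word, list) {
    return list.filter(function (entry) { return entry === word }).length
}

// Return the category whose list has the most entries for the word,
// or null when no list contains it (ties go to the first category checked)
function guessCategory(word, lists) {
    var best = null
    var bestCount = 0
    Object.keys(lists).forEach(function (category) {
        var count = countEntries(word, lists[category])
        if (count > bestCount) {
            best = category
            bestCount = count
        }
    })
    return best
}

console.log(guessCategory('dog', lists))
console.log(guessCategory('zebra', lists))
```

Here "dog" appears twice under animals and once under minerals, so animals wins. A word that appears nowhere gets no guess at all.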

Why is it called naïve?

Naive Bayesian classifiers are called naive because they assume that every feature of the input is independent of every other feature, which is a big simplification. They're called Bayesian because they're built on Bayes' theorem, which is named after a dude named Thomas Bayes. And they're called classifiers because they classify stuff...

Naive Bayesian classifiers start out completely untrained. This classifier could just as easily classify between three groups of pretty much anything. Until it's trained, it doesn't even know the difference between animals, vegetables, and minerals. All it knows is that they are different.
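
For reference, the textbook scoring rule for a naive Bayes classifier can be written as below. The symbols are introduced here for illustration and don't come from the demo's code: c is a category (animals, vegetables, or minerals) and w_1 through w_n are the words in the input.

```latex
\[
P(c \mid w_1, \dots, w_n) \;\propto\; P(c) \prod_{i=1}^{n} P(w_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
\]
```

This demo classifies one word at a time, so n = 1 and the product collapses to a single factor. Before any training, every P(w | c) is unknown, which is why the same classifier could be pointed at any three groups.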

Your feedback trains the bot

If it knows, or if it has a guess, it will tell you its answer and ask you if it was correct. If it was right (and you tell it that it was right), it will reinforce that guess by making a stronger connection between the word and that category. If it was wrong, you can retrain it to learn the right answer.

When you enter a word that the bot doesn't know, instead of guessing, it will ask you whether it is an animal, a vegetable, or a mineral. Your feedback trains the bot and the next time you ask it that question (or someone else does) it may have a better guess than it did the last time.
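
A minimal sketch of that feedback loop, assuming the training data is just a list of words per category. The `training` object and `applyFeedback` function are hypothetical names used for illustration, and the real demo keeps its data in files rather than in memory.

```javascript
// Hypothetical in-memory training data; the real demo stores these in files
var training = { animals: ['dog'], vegetables: ['carrot'], minerals: ['quartz'] }

// Record one round of feedback.
// guess: the category the bot proposed (or null if it had no guess)
// wasRight: true if the user confirmed the guess
// correction: the category the user supplied when the guess was wrong or missing
function applyFeedback(training, word, guess, wasRight, correction) {
    var category = wasRight ? guess : correction
    if (!training.hasOwnProperty(category)) {
        return false // unknown category: nothing learned
    }
    training[category].push(word)
    return true
}

// The user confirms a correct guess, which adds a second "dog" entry
applyFeedback(training, 'dog', 'animals', true)
// The user corrects a wrong guess, which files "tomato" under vegetables
applyFeedback(training, 'tomato', 'minerals', false, 'vegetables')
console.log(training)
```

Each confirmation adds another entry for the word, so a category that has been confirmed many times outweighs an occasional wrong entry elsewhere.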

Try it yourself!

Clone or fork the repo

                                
    git clone https://github.com/adunderwood/avm

Install dependencies

                                
    npm init -y
    npm install

Run the tutorial

                                
    npm start

/avm/tutorial.js

                                
    // module dependencies
    var dclassify = require('dclassify')

    var prompt = require('prompt-sync')()
    var pluralize = require('pluralize')
    var func = require('./functions.js')

    // Utilities provided by dclassify
    var Classifier = dclassify.Classifier
    var DataSet    = dclassify.DataSet
    var Document   = dclassify.Document

    var dir = "./avm/"

    var userTraining
    while ((userTraining != "q") && (userTraining != "quit")) {

        var animalsFromFile = func.readFile(dir + "animals.txt")
        var vegetablesFromFile = func.readFile(dir + "vegetables.txt")
        var mineralsFromFile = func.readFile(dir + "minerals.txt")

        // wrap each word list in a Document (name, array of words)
        var itemAnimals = new Document('animalsFromFile', animalsFromFile)
        var itemVegetables = new Document('vegetablesFromFile', vegetablesFromFile)
        var itemMinerals = new Document('mineralsFromFile', mineralsFromFile)
        var itemUnknown = new Document('unknown',[])

        // create a DataSet and add test items to appropriate categories
        // this is 'curated' data for training
        var data = new DataSet()
        data.add('animals', itemAnimals)
        data.add('vegetables', itemVegetables)
        data.add('minerals', itemMinerals)
        data.add('unknown', itemUnknown)

        // create a classifier (an empty options object leaves the library defaults in place)
        var options = {}
        var classifier = new Classifier(options)

        // train the classifier
        classifier.train(data)

        var userInput = prompt('Animal, Vegetable, or Mineral? ')
        userInput = pluralize.singular(userInput)

        console.log('Ok. Classifying \"' + func.sentenceCase(userInput) + '\": ')

        // test the classifier on a new test item
        var testThis = []
        testThis.push(userInput.toLowerCase())

        var testDoc = new Document('testDoc', testThis)
        var result1 = classifier.classify(testDoc)

        // report to the user
        console.log(result1)
        console.log(func.sentenceCase(result1.category))

        userTraining = prompt('Was that right? y/n/quit ')

        // tolerate empty input and upper-case answers such as "Y"
        var firstLetter = (userTraining || "").trim().toLowerCase()[0]
        switch (firstLetter) {
            case "y":
                func.writeFile(result1.category, userInput, dir)
                break
            case "n":
                var tmpInput = prompt("What is the right category? ")
                func.correctFile(tmpInput, userInput, dir)
                break
            default:
                console.log("No action taken.")
        }
    }
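
The tutorial relies on helpers from `functions.js` that aren't shown on this page. The sketch below is a guess at what three of them might look like, inferred only from how they are called above. The real file in the repo may differ, and the one-word-per-line file format and the `category + '.txt'` naming are assumptions. `correctFile` is left out because its behavior can't be inferred from the call alone.

```javascript
var fs = require('fs')
var os = require('os')
var path = require('path')

// Read a word file into an array of lowercase words
// (assumes one word per line; a missing file yields an empty list)
function readFile(file) {
    if (!fs.existsSync(file)) return []
    return fs.readFileSync(file, 'utf8')
        .split('\n')
        .map(function (line) { return line.trim().toLowerCase() })
        .filter(function (line) { return line.length > 0 })
}

// Append a word to the file for a category, e.g. "animals" -> animals.txt
// (the file naming is an assumption)
function writeFile(category, word, dir) {
    fs.appendFileSync(path.join(dir, category + '.txt'), word.toLowerCase() + '\n')
}

// Capitalize the first letter: "dog" -> "Dog"
function sentenceCase(text) {
    return text.charAt(0).toUpperCase() + text.slice(1)
}

// Try the helpers against a throwaway directory
var tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'avm-'))
writeFile('animals', 'Dog', tmpDir)
writeFile('animals', 'cat', tmpDir)
var words = readFile(path.join(tmpDir, 'animals.txt'))
console.log(words, sentenceCase('dog'))
```

Appending rather than rewriting the file means each confirmed answer adds one more entry for that word.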

What this code looks like when it's running:

Sample Output