Thursday, April 19, 2012

Letting my mind run: On your Mark(ov)

I was originally going to write a post about how Madoka Magica isn't good in the way a lot of people, mainly tropers, claim it is (but that instead, it's an entirely different and even better kind of good), or sum up some of my thoughts about the second season of Working!! in a rambling piece about what I like in a slice-of-life series.  However, I just finished writing my farewell to TV Tropes, which left me (a) much later in the evening than I'd originally banked on, and (b) in a somewhat melancholy mood.  Neither of those factors is really conducive to writing the posts I wanted to write.  I also had some thoughts earlier today at work that I want to jot down somewhere...

In other words, fuck those topics.  I'm going to talk about Markov Chains.

A Markov chain is a probabilistic model that can be used to generate random content that isn't exactly like its input, but strongly resembles it.  (i.e., if you feed in the English dictionary as the input, you'll get back words that aren't in English, but that follow the same probabilities for letter clumps, so they'd look English to anybody who didn't actually speak the language, and you could probably even convince a native speaker that each one was just an obscure word they'd never encountered before).

To make an example, let's generate a Markov chain using "Supercalifragilisticexpialidocious" as the input.  First, we break the word down into the occurrence probability of each letter.  That looks like the following:
  • aaa (3/35)
  • ccc (3/35)
  • d (1/35)
  • ee (2/35)
  • f (1/35)
  • g (1/35)
  • iiiiiii (7/35)
  • lll (3/35)
  • oo (2/35)
  • pp (2/35)
  • rr (2/35)
  • sss (3/35)
  • t (1/35)
  • uu (2/35)
  • x (1/35) 
  • /EOS/ (end of string) (1/35)
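Tallying those counts in code is trivial; here's a minimal Python sketch of the same bookkeeping (the EOS name is just a placeholder I picked for the end-of-string marker):

    from collections import Counter

    word = "supercalifragilisticexpialidocious"
    EOS = "/EOS/"  # placeholder name for the end-of-string marker

    counts = Counter(word)        # how many times each letter appears
    counts[EOS] = 1               # one end-of-string per word
    total = sum(counts.values())  # 35: 34 letters plus /EOS/

    for symbol, count in sorted(counts.items()):
        print(f"{symbol} ({count}/{total})")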
That table gives us the chance of any individual character in a word being any given letter, which we could use to generate a word-like object, but the result would most likely be an unpronounceable jumble of letters.  Thus, we introduce the concept of orders.  The table above gives an order-0 Markov chain: each new character is chosen without regard for what else has appeared in the word, meaning you may get a block of 5 hard consonants right next to each other, or a string of 8 vowels.  An order-1 Markov chain would have rules like these instead:
  • i -> a (1/7)
  • i -> c (1/7)
  • i -> d (1/7)
  • i -> f (1/7)
  • i -> l (1/7)
  • i -> o (1/7)
  • i -> s (1/7)
This is just for the character "i", but the same principle applies across all of the letters.  You'll note that there is no /EOS/ entry in this "i" set.  That's because Supercalifragilisticexpialidocious doesn't end with an "i", so our first-order Markov chain will never place /EOS/ directly after one.  With this type of chain, every word will begin with an "S" (because that is the first letter of the input) and end with an "s" (because that is the last letter).  There would also be a 1/3 chance that the returned string consists only of "S" (one instance of /EOS/ following an "s" out of three total instances of "s").  Higher-order Markov chains would base each new letter on the last two letters to appear in the word, or the last three, and so on.
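To make that concrete, here's a minimal Python sketch of an order-1 character chain seeded on the word's first letter (build_order1, generate, and EOS are names I'm making up for illustration; this is a sketch, not the one true implementation):

    import random
    from collections import defaultdict

    EOS = "/EOS/"  # end-of-string marker

    def build_order1(word):
        """Map each character to the list of characters observed right after it."""
        followers = defaultdict(list)
        for current, nxt in zip(word, word[1:]):
            followers[current].append(nxt)
        followers[word[-1]].append(EOS)  # the final letter can end the string
        return followers

    def generate(word):
        """Start at the word's first letter and walk the chain until /EOS/ comes up."""
        followers = build_order1(word)
        out = [word[0]]
        while True:
            # choosing uniformly from the observed followers reproduces the 1/7, 1/3, etc. odds above
            nxt = random.choice(followers[out[-1]])
            if nxt == EOS:
                return "".join(out)
            out.append(nxt)

    print(generate("supercalifragilisticexpialidocious"))

Run it a few times and roughly one output in three should be just "s", matching the 1/3 figure above.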


So, I was thinking today about the use of Markov chains for a learning-by-observing chat program (this is probably not a new idea, but I'm just jotting down some thoughts right now, so some of these might be tried-and-false ideas that others have already proven bad).

  • In this case, we want to use words as the elements rather than characters.
  • I was thinking about indefinite-order chains and worried about the amount of memory it would take to store occurrence data for every possible combination of words (even using sparse storage that doesn't reserve space for the huge number of word combinations that will never show up).  I realized that I was going about this the wrong way: instead, just store the chats themselves, and build a 2-3-4 tree dictionary with an entry for each word the program has encountered so far, each holding a linked list of pointers to the chat "sentences" that word appears in.  Say you have already generated the words "how are you".  First obtain the list of sentences containing the word "you", then the list of sentences containing the word "are", and intersect them to get the sentences that contain both.  Check each chat log in that intersection to see if the two words actually appear in the order "are you".  If no such sentence exists, construct a probability list from the words that appear after "you" alone.  Otherwise, search for sentences containing the full phrase "how are you" using the same method used to find "are you", and fall back to building a list from "are you" if none are found.  (There's a rough sketch of this after the list.)
  • The chain should be two-dimensional: probabilities would be determined both by the likelihood of a given word following the words already generated for the sentence, and by the likelihood of words appearing in a sentence that comes after the words of the current question.
  • This means the chatbot would need to store logs of its own statements so it could see how people respond to them, but not use those statements as data themselves (because they would either be inaccurate, and thus degrade the bot's accuracy, or accurate, in which case the bot most likely already has good data for that situation and doesn't need more).
  • If one wanted to create artificial intelligence (whether "true" AI, combat AI for games, AI for exploration, or what have you), I think one would just have to program a Markov chain that treats actions as the elements making up the chain (i.e., a stimulus happens, so take this action).  One could also maybe create predictive chains, where a bot sees a fire, remembers seeing somebody put their hand in a fire, realizes that this course of action is more likely to lead to undesirable outcomes (i.e., being burned), and thus weights that course of action down.
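For what it's worth, here's the rough shape of that sentence-index idea sketched in Python.  A plain dict stands in for the 2-3-4 tree, sets of sentence ids stand in for the linked lists of pointers, and every name here (ChatModel, observe, followers, next_word) is something I'm inventing on the spot; it's a sketch of the backoff logic under those assumptions, not a working chatbot, and it ignores the two-dimensional question/answer weighting from the bullet above.

    import random
    from collections import defaultdict

    class ChatModel:
        """Stores raw chat sentences plus an index from each word to the sentences it appears in."""

        def __init__(self):
            self.sentences = []            # each sentence is stored as a list of words
            self.index = defaultdict(set)  # word -> ids of sentences containing it

        def observe(self, text):
            words = text.lower().split()
            sid = len(self.sentences)
            self.sentences.append(words)
            for w in words:
                self.index[w].add(sid)

        def followers(self, context):
            """Words seen immediately after `context` (a tuple of words), looking only
            at sentences that contain every word of the context."""
            if not all(w in self.index for w in context):
                return []
            candidates = set.intersection(*(self.index[w] for w in context))
            found = []
            for sid in candidates:
                s = self.sentences[sid]
                for i in range(len(s) - len(context)):
                    if tuple(s[i:i + len(context)]) == context:
                        found.append(s[i + len(context)])
            return found

        def next_word(self, generated, max_order=3):
            """Use the longest context that has matches, backing off toward a single word,
            which gives the same result as growing the context the way described above."""
            for n in range(min(max_order, len(generated)), 0, -1):
                found = self.followers(tuple(generated[-n:]))
                if found:
                    return random.choice(found)
            return None  # no data for even the last word alone

Usage would be something like calling model.observe(line) on every line of chat the program sees, then repeatedly calling model.next_word(reply_so_far) to grow a reply until it returns None.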
Those are just a few thoughts; maybe I'll get around to cobbling something like this together to see how well it does.
