Sunday, April 27, 2014

Automatas (finite state machines): A programmer's perspective

Finite state machines are simple and nifty; when I was teaching theory, I saw students, even good programmers, had trouble with them when presented as math, so I figure showing them as programs could make it click for some people.

From (a free ToC book)

Definition 2.2.1 A finite automaton is a 5-tuple M = (Q, Σ, δ, q, F), where
1. Q is a finite set, whose elements are called states,
2. Σ is a finite set, called the alphabet; the elements of Σ are called symbols,
3. δ : Q × Σ → Q is a function, called the transition function,
4. q is an element of Q; it is called the start state,
5. F is a subset of Q; the elements of F are called accept states.

Doesn't that sound fancy ? It means we have an object with 5 fields (a 5-tuple), Q, a set of things called states, Σ, a set of symbols called the alphabet (we'll use characters as our symbols), δ, a function that takes a state and a symbol and returns a state (conceptually we are 'moving' from one state to another when we see a symbol), q, an element of Q , called the start state and F a subset of Q, the set of accepting states.

In Scala, we can represent it as:

We can then define the extended transition function, which says which state we end up at by following a string, recursively as
This means if the string is empty, we stay at the state we're at; if not, we follow the first character (by using delta), and then follow the rest of the string from there; this recursive definition maps nicely to a List in scala (or any other language with cons-style lists :), so we end up with the following code:

And the automata accepts a string if we would go from the initial state to the final state by following that string; in Scala
To use it we would first define a delta function, say:
And then we can use like this:
The full code is at, including non-deterministic automata (which I will probably get around to blog about some other time :)

Thursday, April 17, 2014

Accessing Databases with R (RStudio)

R has libraries for accessing several databases, but if you're on Windows, you'd probably prefer the RODBC package; with this, you can access almost any db through ODBC. You can create a DSN and use odbcConnect, or use odbcDriverconnect ; the problem is figuring out the connection string, especially since you need to specify the driver. For SQL Server, this works (assuming db is in localhost):

c=odbcDriverConnect('Driver=SQL Server;Server=localhost;Database=......;Trusted_Connection=True')

And then you can do odbcquery like:
res=sqlQuery(c,"SELECT * FROM ....")

And you get a data frame ! has odbc connection strings for many dbs

Wednesday, April 2, 2014

Super quick Linq to XML

I was trying to figure out how to get started with Linq to XML, and found this article. I figured I'd simplify it even more for myself :)
Imagine we have the following file, saved in c:\contacts.xml

Then we can access it as follows

  • on an XDocument, we can call descendants and pass it an element name, to get all the elements with that name (we can also call the no-args version, and get all descendants)
  • What we're getting is an XElement, on which we can call methods like Element or Attribute
  • We can directly cast an element into a string, and it gets us all the text inside that element
  • We can cast an attribute as a string, or an int (or even a float, double or many other types)