Friday, July 4, 2014

Using Excel as your UI

At work, part of our job involves getting data from a bunch of different databases and verifying that certain relationships hold among them, or finding other kinds of problems or bugs; we usually end up pasting our results into Excel, and sending the spreadsheets around.

One of our teammates had the idea of writing a script that writes directly to Excel; I'm still amazed it didn't occur to me :) but to compensate, I figured I'd write a program to do the same; since I couldn't find any simple .net libraries to write to excel, I figured I'd try with python; found openpyxl (great simple library, although its docs are outdated), so I wrote a simple script that takes queries in JSON and produces an excel spreadsheet with the data produced by those queries.
The Json file looks like this:
Notice we can specify several sheets, and for every sheet we specify queries; for each query we specify its DBI driver (you need to have those libraries available in your system), connection string and query string.
The python code is amazingly simple:
And you call it by passing it the name of the json file, and the name of the output you want.

Monday, June 9, 2014

NodeJs tips and Tricks

I'm relearning nodejs, for fun; I'm writing here some links I've found useful (more as a reminder to myself)


Monday, May 26, 2014

Playing with nginx and postgresql

I've always used Apache as my web server (and now MS IIS :), but have wanted to play with nginx; I was looking for something to do with my Raspberry Pi, so I figured setting up nginx's postgresql module would be fun. I followed the instructions for OpenResty (a 'distribution' of nginx with a bunch of modules configured by default; not a lot of documentation, but enough to get started).

I was running raspbian (a debian derivative), so it was as simple as:
apt-get install libreadline-dev libncurses5-dev libpcre3-dev libssl-dev perl make libpq-dev
(notice libpq-dev is not in the docs, but it is needed for the postgresql module, which is not installed by default with openresty)

then download with wget (this is the latest version at the time)
wget http://openresty.org/download/ngx_openresty-1.5.12.1.tar.gz

Uncompress, configure, make and install
tar -xzf ngx_openresty-1.5.12.1.tar.gz
cd openresty-1.5.12.1
./configure --with-luajit --with-http_postgress_module && make && sudo make install

nginx gets installed into /usr/local/openresty/nginx , so we can edit the configuration file with:
sudo vim /usr/local/openresty/nginx/nginx.conf

(I was using vim; use your favorite text editor otherwise :)

And we get to the fun part ; in nginx, we use the server stanza to configure a server; inside the http section, we use upstream to specify how to connect to the database (db is just a name, which we'll use in our other directives)
upstream db {
    postgres_server 127.0.0.1 dbname=test user=test password=abc123;
}

And then inside the server section, we can define specific queries for certain locations; for example:
location /stores {
    rds_json          on;
    postgres_pass   db;
    postgres_query  "SELECT * FROM Stores";
}
Specifies that when the user visits /stores, we should return (as json) the result of executing SELECT * from Stores.

Tuesday, May 13, 2014

Getting running queries in SQL Server

Today I needed to get running queries in SQL server (you can kill them with the command KILL followed by the session id). After 'binging' it, got to the following query :). Need to study more about the metaschema in sql server.

SELECT session_id,
       request_id,
       Db_name(database_id),
       start_time,
       status,
       command,
       Substring(txt.TEXT, ( statement_start_offset / 2 ) + 1,
       ( ( CASE statement_end_offset
       WHEN -1 THEN Datalength(txt.TEXT)
       ELSE statement_end_offset
                                                                   END
           - statement_start_offset ) / 2 ) + 1) AS statement_text,
       wait_type,
       wait_time,
       blocking_session_id,
       percent_complete,
       cpu_time,
       reads,
       writes,
       logical_reads,
       row_count
FROM   sys.dm_exec_requests
       CROSS APPLY sys.Dm_exec_sql_text([sql_handle]) AS txt
WHERE  session_id <> @@SPID
       AND session_id > 50

Sunday, April 27, 2014

Automatas (finite state machines): A programmer's perspective

Finite state machines are simple and nifty; when I was teaching theory, I saw students, even good programmers, had trouble with them when presented as math, so I figure showing them as programs could make it click for some people.

From http://cg.scs.carleton.ca/~michiel/TheoryOfComputation/ (a free ToC book)

Definition 2.2.1 A finite automaton is a 5-tuple M = (Q, Σ, δ, q, F), where
1. Q is a finite set, whose elements are called states,
2. Σ is a finite set, called the alphabet; the elements of Σ are called symbols,
3. δ : Q × Σ → Q is a function, called the transition function,
4. q is an element of Q; it is called the start state,
5. F is a subset of Q; the elements of F are called accept states.

Doesn't that sound fancy ? It means we have an object with 5 fields (a 5-tuple), Q, a set of things called states, Σ, a set of symbols called the alphabet (we'll use characters as our symbols), δ, a function that takes a state and a symbol and returns a state (conceptually we are 'moving' from one state to another when we see a symbol), q, an element of Q , called the start state and F a subset of Q, the set of accepting states.

In Scala, we can represent it as:


We can then define the extended transition function, which says which state we end up at by following a string, recursively as
This means if the string is empty, we stay at the state we're at; if not, we follow the first character (by using delta), and then follow the rest of the string from there; this recursive definition maps nicely to a List in scala (or any other language with cons-style lists :), so we end up with the following code:


And the automata accepts a string if we would go from the initial state to the final state by following that string; in Scala
To use it we would first define a delta function, say:
And then we can use like this:
The full code is at https://github.com/okaram/scala/tree/master/automata, including non-deterministic automata (which I will probably get around to blog about some other time :)

Thursday, April 17, 2014

Accessing Databases with R (RStudio)

R has libraries for accessing several databases, but if you're on Windows, you'd probably prefer the RODBC package; with this, you can access almost any db through ODBC. You can create a DSN and use odbcConnect, or use odbcDriverconnect ; the problem is figuring out the connection string, especially since you need to specify the driver. For SQL Server, this works (assuming db is in localhost):


library(RODBC)
c=odbcDriverConnect('Driver=SQL Server;Server=localhost;Database=......;Trusted_Connection=True')

And then you can do odbcquery like:
res=sqlQuery(c,"SELECT * FROM ....")

And you get a data frame !

http://www.connectionstrings.com/ has odbc connection strings for many dbs

Wednesday, April 2, 2014

Super quick Linq to XML

I was trying to figure out how to get started with Linq to XML, and found this article. I figured I'd simplify it even more for myself :)
Imagine we have the following file, saved in c:\contacts.xml

Then we can access it as follows
Notice:

  • on an XDocument, we can call descendants and pass it an element name, to get all the elements with that name (we can also call the no-args version, and get all descendants)
  • What we're getting is an XElement, on which we can call methods like Element or Attribute
  • We can directly cast an element into a string, and it gets us all the text inside that element
  • We can cast an attribute as a string, or an int (or even a float, double or many other types)