Friday, July 4, 2014

Using Excel as your UI

At work, part of our job involves getting data from a bunch of different databases and verifying that certain relationships hold among them, or finding other kinds of problems or bugs; we usually end up pasting our results into Excel, and sending the spreadsheets around.

One of our teammates had the idea of writing a script that writes directly to Excel; I'm still amazed it didn't occur to me :) but to compensate, I figured I'd write a program to do the same; since I couldn't find any simple .net libraries to write to excel, I figured I'd try with python; found openpyxl (great simple library, although its docs are outdated), so I wrote a simple script that takes queries in JSON and produces an excel spreadsheet with the data produced by those queries.
The Json file looks like this:
{
"sheets": [
{
"title": "P&P",
"queries": [
{
"db_driver": "psycopg2",
"db_conn_str": "dbname=curri user=curri",
"title": "People",
"query": "SELECT * FROM Person"
},
{
"db_driver": "psycopg2",
"db_conn_str": "dbname=curri user=curri",
"title": "Products",
"query": "SELECT * FROM Products"
}
]
},
{
"title": "Rev",
"queries": [
{
"db_driver": "psycopg2",
"db_conn_str": "dbname=curri user=curri",
"title": "People",
"query": "SELECT * FROM Person ORDER BY Id DESC"
},
{
"db_driver": "psycopg2",
"db_conn_str": "dbname=curri user=curri",
"title": "Products",
"query": "SELECT * FROM Products"
}
]
}
]
}c
view raw dbs.json hosted with ❤ by GitHub
Notice we can specify several sheets, and for every sheet we specify queries; for each query we specify its DBI driver (you need to have those libraries available in your system), connection string and query string.
The python code is amazingly simple:
import sys
import json
from openpyxl import Workbook
#from openpyxl.cell import get_column_letter
from openpyxl.styles import Color, Style, PatternFill
headerStyle=Style(fill=PatternFill(patternType='solid', fgColor=Color('008080FF')))
def query2xl(query, sheet, firstRow, headerStyle):
driver=__import__(query['db_driver'])
db=driver.connect(query['db_conn_str'])
cur=db.cursor()
cur.execute(query['query'])
theRow=firstRow+1;
sheet.cell(row=theRow,column=1).value=query['title']
#theRow=theRow+1
for f,field in enumerate(cur.description):
sheet.cell(row=theRow,column=f+2).value=field.name
sheet.cell(row=theRow,column=f+2).style=headerStyle
for r,row in enumerate(cur.fetchall()):
theRow=theRow+1
for c,field in enumerate(row):
sheet.cell(row=theRow,column=c+2).value=field
cur.close()
db.close()
return theRow+2;
def addSheetToWorkbook(sheet, wkBook):
theRow=0
ws=wkBook.create_sheet()
ws.title=sheet['title']
for q,query in enumerate(sheet['queries']):
theRow=query2xl(query, ws,theRow,headerStyle)
def mkWorkBook(sheets):
wb = Workbook()
for s,sheet in enumerate(sheets):
addSheetToWorkbook(sheet,wb)
return wb
if __name__ == '__main__':
spec=json.loads(open(sys.argv[1],'r').read())
wb=mkWorkBook(spec['sheets'])
wb.save(filename = sys.argv[2])
view raw jsonquery2xl.py hosted with ❤ by GitHub
And you call it by passing it the name of the json file, and the name of the output you want.