A programmer's blog: 2012

Saturday, December 29, 2012

iOS first App: Temperature Conversion

Apple's first app tutorial is excellent, but more examples often help, so here's my own first app tutorial: a very simple temperature conversion application. It has one input, a text field for entering a temperature in degrees Fahrenheit, and a button that converts it to Celsius and displays the result. Check the short demo video below.

If you prefer videos, here's a video walkthrough; if not, follow along.

First, we create a new project, a single-view application; let's call it Temperature Conversion.

Then fill out the other project options as below. Make sure you select use storyboard and use reference counting, and use Temp as the class prefix, so the class names match the ones in this walkthrough; select iPhone for the device, to keep things simple; put whatever you prefer for the other values.

After you choose where to save your project, you should see the full Xcode workspace, similar to the screenshot below (which has extra annotations :). Notice that the list (or hierarchy) of files is on the left; in the middle there's an editor, which changes depending on the file selected; and on the right there's a secondary editor, plus the object library at the bottom.

First, select the TempViewController.h file and modify the code so it looks like the screenshot below (the actual code is below the screenshot)

Then, select the storyboard, and, from the object library (bottom right, notice this is a list, and you can scroll) drag a text field, a label and a button onto the storyboard.

Now, ctrl-click on the text field, to get its menu, and drag from the new referencing outlet to the Temp View Controller, and select fahrenheit, the only option (a line will appear when you drag, but doesn't appear in the screenshot)

Now do the same with the label

And finally join the touch up inside event in the button with the convertToCelsius method in the controller

And now we go to the TempViewController.m file to add the actual code; for easier understanding (not terribly useful now, but useful in larger programs), we first create a method for converting from Fahrenheit to Celsius, which I'm calling faren2celsius.
And then the actual code for handling the button press: we access our properties with dot notation, using self to refer to the view controller itself, so self.fahrenheit is the text field and self.celsius is the label. They both have a text property, of type NSString, and the NSString class has an intValue method that parses a string into an int.
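
As a rough, language-neutral sketch of the logic just described (written in C++ rather than the project's Objective-C, so it is not the code from the walkthrough): the string argument and return value stand in for self.fahrenheit.text and self.celsius.text, std::atoi plays the role of intValue, and the integer formula and the checkConversions helper are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Convert whole Fahrenheit degrees to whole Celsius degrees.
// Integer division truncates toward zero, so 100 F becomes 37, not 37.8.
int faren2celsius(int fahrenheit) {
    return (fahrenheit - 32) * 5 / 9;
}

// Stand-in for the button handler: the argument plays the role of the
// text field's text, the return value the role of the label's text.
// std::atoi, like NSString's intValue, yields 0 for non-numeric text.
std::string convertToCelsius(const std::string& fahrenheitText) {
    int fahrenheit = std::atoi(fahrenheitText.c_str());
    return std::to_string(faren2celsius(fahrenheit));
}

// A few spot checks (hypothetical helper); call it from main() to try the sketch.
void checkConversions() {
    assert(faren2celsius(32) == 0);      // freezing point of water
    assert(faren2celsius(212) == 100);   // boiling point of water
    assert(convertToCelsius("212") == "100");
}
```

Keeping the arithmetic in its own function, as the walkthrough does, means it can be checked without touching the user interface at all.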

Now press command-R or click on the run button, and you get your first iPhone app!

Finishing touches

Go back to the storyboard, and add a couple of labels that say Fahrenheit and Celsius, next to each field.
Change the title of the button to Convert to Celsius
Change the text inside the label so it is blank, instead of saying 'label'.

Extending the program

Change the declaration of celsius in TempViewController.h to be a text field; then, on the storyboard, delete the label, add another text field and link it to the view controller.
Add another method in TempViewController.m (and declare it in TempViewController.h), called convertToFarenheit, that will convert from celsius (which is now a text field) and put the result into the fahrenheit field.
Add another button to the storyboard, and link it to this method.
Verify that you can convert back and forth between fahrenheit and celsius.
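
One thing to watch for when verifying the round trip: with int arithmetic, converting to Celsius and back does not always return the original value. The sketch below (again in C++ rather than Objective-C; celsius2faren, roundTripInt and checkRoundTrips are hypothetical names, not part of the project) contrasts a double-based round trip with an int-based one.

```cpp
#include <cassert>
#include <cmath>

// Floating-point versions of the two conversions.
double faren2celsius(double fahrenheit) { return (fahrenheit - 32.0) * 5.0 / 9.0; }
double celsius2faren(double celsius) { return celsius * 9.0 / 5.0 + 32.0; }

// The same round trip done entirely with ints, as intValue would give us.
int roundTripInt(int fahrenheit) {
    int celsius = (fahrenheit - 32) * 5 / 9;   // truncates the fraction
    return celsius * 9 / 5 + 32;
}

// Spot checks for both variants.
void checkRoundTrips() {
    // With doubles, the round trip lands back on the starting value
    // (up to floating-point rounding).
    assert(std::fabs(celsius2faren(faren2celsius(100.0)) - 100.0) < 1e-9);
    // With ints, 212 F survives (it is exactly 100 C) ...
    assert(roundTripInt(212) == 212);
    // ... but 100 F becomes 37 C, which converts back to 98 F.
    assert(roundTripInt(100) == 98);
}
```

If the app should round-trip cleanly, parsing the fields as floating-point values is one option; whether that is worth it for a first app is a judgment call.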

Simple animations in iOS

I'm re-learning Objective-C and iOS programming, since I will be teaching a class this semester; I have created a simple animation example that might be useful to other people learning iOS.

There are basically two kinds of animation:

cell-based animation, in which we replace an image with another image, like old cartoons were made, or like flip books; this gives us complete control of the animation, but requires us to draw each frame.
property-based animation, in which we animate some properties of an object (like its location, its size, etc.); this is more limited, in that we are animating the whole object, but it is a lot easier, since we do not need to draw each frame and only need to specify the initial and final values for the properties.

We can create a very simple program to demonstrate both kinds; we will have a storyboard with a UIImageView and two buttons, one to trigger each kind of animation. The video below shows the screen, and has a small demo.

We create a new project, single-view application, and then edit its storyboard (I do iPhone only for my simple projects :), add a UIImageView and two buttons; on its view controller, we add two properties (only one IBOutlet) and two IBAction methods, one for each button, as follows:

For the cell animation, I got a character sheet from OpenClipart, and cut it into a set of images (the images are not quite properly aligned, but they definitely exceed my artistic abilities :). You can get the images (and the whole project) from my github.

Notice we have an NSArray property for keeping the images for the cell animation, but it is not directly hooked to the UI.

After we attach the ImageView (as a reference outlet) to the imgView property, and hook up the buttons to the corresponding methods, we go into the code.

In our viewDidLoad method we initialize the array of images:

And our method for cell animation is pretty simple; all we really need to do is set the animationImages property of the UIImageView, to an array of images, and then call startAnimating on the ImageView; we can also control the duration (in seconds), and how many times we want the animation repeated. The whole method is below:

Now, for the property animation, we need to choose a property which is animatable (from the documentation, we can animate the frame, bounds, center, transform, alpha, backgroundColor and contentStretch properties). The frame property determines (along with some of the others :) where the UIImageView is drawn, so we create a new frame (you need to animate the whole property; you can't animate pieces of it), and use UIView's animateWithDuration method (available since iOS 4). We need to pass it a block, using the ^ { ... } notation; anything between the braces {} is the block's body. There are several variations of this method, which allow you to control more of the animation. The code is as follows:

If you want to learn more about animations in iOS, Ray Wenderlich has a deeper tutorial, Apple has all the documentation you'll ever need, and playing with the methods should be easy and fun.

Friday, December 21, 2012

Data Science Resources

I will be collecting resources here about big data, data warehousing, data mining and such. Just my personal list.

Mining of massive datasets
Greenplum's Data Science Summit (not great, just intro?)
MongoDB in action
Taxonomy of Data Science - decent blog post from dataists
http://datasciencelondon.org/

Papers

Mapreduce paper
http://atbrox.com/2010/05/08/mapreduce-hadoop-algorithms-in-academic-papers-may-2010-update/
http://en.wikipedia.org/wiki/Apache_Hadoop#Papers
http://lemire.me/OLAP/
www.indiana.edu/~cheminfo/I533/datacube.pdf

Saturday, December 1, 2012

Getting Started with Hadoop on Ubuntu

So, I'm trying to play with hadoop again (haven't done it in a while), and since ubuntu is my current weapon of choice, I found a great tutorial at http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ , but I wanted something even simpler, like a script, plus a sample program and instructions on how to compile and run it, so I created it (at the very bottom are the differences with Noll's tutorial). It is available at: https://github.com/okaram/scripts/blob/master/hadoop/install.sh .

You just need to download it and change it so it can be executable:

You probably want to look at it in your favorite editor (it is not a good idea to just run a script from the internet; I trust myself, but you shouldn't trust me), and you may want to change the mirror while you're at it (I live in Atlanta, so I use Georgia Tech's). After you're happy, run it as root:

And you should be done with the installation! The script creates a user for hadoop, called hduser; you can change to it by typing:

Then, as that user, you want to set up your path and classpath (the classpath is needed for compiling):

And start hadoop:

Now download my sample program (it is the standard WordCount example, from the tutorial, but without the package statement, so you can compile it directly from that folder), compile it and create a jar file:

Now, we need to put some data into hadoop; first we create a folder and copy a file into it (our same WordCount.java, since we just need a text file):

And we copy that folder into hadoop (and list it, to verify it's there):

And now we can run our program in hadoop:

When you want to stop hadoop, just run the stop-all.sh command; also, if you want to copy the output to your file system, just use the -copyToLocal option of hadoop's dfs.

The install script is completely automated, so you can even use it when starting an amazon ec2 instance; for example, use:

to start a micro instance, with an ubuntu 12.04 daily build (for Dec-1-2012; change the ami id to get a different one :), and a key named mac-vm.

Sunday, November 18, 2012

Using LINQ in C#

LINQ is actually a pretty cool technology; it allows you to do something like list or monad comprehensions over collections, but with a sql-like syntax; better yet, it does magic so you can use SQL tables instead of collections, and it actually generates and sends the SQL to the DBMS, instead of getting all the data into memory and iterating over it.
Since I switch languages constantly, I'm always trying to remember the right syntax; so, this is my place to have it, for now. Before you can connect to your DB, you need to run a VS tool that will generate the LINQ classes for your particular schema (and you need to rerun it every time the schema changes :( ). In my case, the generated code is in the SimpleBlog namespace, and the generated class is SimpleBlogDataContext; so my code for reading something looks like:
And to add an element to the blog table, we can do:

Notice that the context (ctx here) needs to be a global variable; having several contexts will diminish performance and probably confuse you, since variables from one context cannot be saved in another.

We always need to import the System.Linq namespace when using linq; If we're doing this directly in an aspx page we use an import directive, like:

Notice that you need to add an assembly reference (at least in VS2008, you need to add it to the web.config file, even if you add it to the project references in solution explorer), so add the following line to your web.config, inside the <assemblies> tag:

<add assembly="System.Data.Linq, Version=3.5.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />

Thursday, November 15, 2012

A simple blogroll with python and feedparser

A friend wanted to use a blog as a sort of database, allowing several people to create posts on the same blog, and then transform the posts into some sort of listing; most blogging software will let you export a list of posts as an rss feed, so this seemed like a good time to go learn how to parse and use rss.

While I'll let my friend figure out his own application, this prompted me to write a simple 'blogroll' program, that would read several rss (and atom) feeds and would produce html showing the appropriate titles and links.

Since python is my scripting language of choice for now, I googled rss parser libraries, and found feedparser.

We can call feedparser.parse, and give it a url; it will then return an object representing the feed; the object contains a field called feed, which contains information about the feed, like its title and its link (url); the object also contains entries, which is a list of objects, each representing one entry in the rss feed; for each entry, you have fields like its title and its link.

So, it is just a matter of iterating over a list of urls, parsing the feed for each, and going over its entries, producing html as we go, writing everything to a file. The final code looks like:

Sunday, October 14, 2012

Getting started with Amazon's DynamoDB in Python

Amazon provides many cloud services, both for Infrastructure as a Service (running Virtual Machines in EC2) and Platform as a Service (providing the actual services). One of the services is DynamoDB, a simple NoSQL database, hosted on Amazon's servers. DynamoDB, like many NoSQL databases, is a simple key-value store, with some added capabilities.
Basically, you store objects (representable in JSON), matched to a key. Your key can be a string or an int (and actually, DynamoDB allows your key to be composed of two fields, a hash and a range, but here we will only do the hash). Your object is just a dictionary mapping fields to values.
The best python library for DynamoDB (and for most other aws services) is boto. The easiest way to install boto is with pip; just do pip install boto or sudo pip install boto.
Amazon offers a free tier for DynamoDB, which should be enough for playing and small applications (the DB has to be less than 100M in size, and you get 5KB in reads and 1KB in writes per second), so head down to aws.amazon.com and create your account, if you don't have it; after making your account, go to security credentials and find out your access key id and your secret access key. You may want to store your credentials in a file called .boto in your home folder, as explained here. After that, you can connect to your dynamodb as follows:
Now, let's try to create one table. Our table will be called users, and its key will be a string. We first need to create a schema (the schema only specifies the name and type of your key; your 'table' can contain any kind of objects, with any attributes), and then use that schema to create a table, as follows:
Once we have created the table, we go onto adding items; we create a dictionary containing our data, then use the new_item method of our table to create the item, and finally the put method of that item to actually add it to the database, as follows:

And, after creating several items we can obtain an element (knowing its key) with the get_item method of a table, as follows (notice it will throw an exception if the key doesn't exist):

To modify the item in the DB, we would just modify it in memory (it looks like a dictionary) and then call its put method again.

Most of the times you will be searching for a given key (or at least will know a part of the key), but if not, you can use the scan method (notice this method goes through EVERY object on the table, and so may be slow and consume a lot of I/O). You can pass a list of conditions (as of now, the documentation is incorrect, stating you specify the conditions with strings; you need to use elements coming from boto.dynamodb.condition). A scan would look like:

For more information, check:

Or come back :) I should be blogging more about dynamodb (and other aws services) soon.

Friday, October 12, 2012

Cons lists in C++

Lisp and other functional languages have purely-functional lists; that is, lists that cannot be changed; this leads to a very different style of programming, and can save memory and time, since parts of the list can be shared among several lists; with the new std::shared_ptr in C++, implementing such a list is relatively simple.

Although lisp lists allow mixing of types (and so are equivalent to binary trees), we will implement typed lists, which allow elements of only one type, using templates. We want to provide 4 functions:

cons, which takes an element and a list, and produces another list, containing that element as its first element (and the passed list as the elements after the first element; notice we're going to share this passed list).
car, which takes a list and returns its first element (sometimes called head).
cdr, which takes a list and returns another list, the list minus the first element (sometimes called tail).
isEmpty, which returns true if the list is empty, false otherwise.

Since cons is the operation that makes the list, I am calling my node ConsNode and my list class ConsList (btw, boost has a somewhat similar structure, using template metaprogramming, which means it can only be defined at compile time; see boost::cons). Our Node class can be defined as:
And we can define our list class simply as a synonym for shared_ptr (notice that both shared_ptr and template typedefs, also known as alias templates, are C++11 features; template typedefs are implemented in g++ 4.7 and newer). The shared_ptr is a smart pointer that basically implements a reference-counted pointer: several shared_ptrs can point to the same dynamically allocated chunk of memory; every time one is destroyed (for example, because it goes out of scope), the counter is decremented; when the counter reaches 0, the memory is deallocated by calling delete. This allows us to not worry about calling delete manually.
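
As an illustration of the two definitions and of the reference counting just described, here is a minimal sketch. The member names head and tail, the const-qualified node type and the helper countOwnersAfterScope are assumptions of this sketch, not necessarily what the original code used.

```cpp
#include <cassert>
#include <memory>

// One node of the list: an element plus a shared pointer to the rest.
template <typename T>
struct ConsNode {
    T head;                                   // first element
    std::shared_ptr<const ConsNode<T>> tail;  // remaining elements (may be shared)
    ConsNode(const T& h, const std::shared_ptr<const ConsNode<T>>& t)
        : head(h), tail(t) {}
};

// The list itself is just a shared_ptr to an immutable node (a C++11 alias template).
template <typename T>
using ConsList = std::shared_ptr<const ConsNode<T>>;

// Shows the counter going up when a second owner appears and back down
// when that owner goes out of scope; returns the final count.
long countOwnersAfterScope() {
    ConsList<int> list = std::make_shared<ConsNode<int>>(1, ConsList<int>());
    assert(list.use_count() == 1);
    {
        ConsList<int> other = list;           // second owner of the same node
        assert(list.use_count() == 2);
    }                                         // other is destroyed here
    return list.use_count();
}
```

Pointing at a const node is one way to make accidental modification a compile error, which fits the goal of sharing list tails safely.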
Now, cons is the function that will create a new list, given an old list and a new element (notice this does NOT change the original list at all; we want to avoid modifying the lists, so they can be shared as parts of other lists). Our cons function is going to call std::make_shared, which creates a new reference-counted chunk of memory (you can create a shared_ptr from a plain dynamically allocated pointer, but at the cost of an extra call to new). Our cons function looks like:

And car, cdr and isEmpty are simply defined as:

For convenience, we can define an operator<< ; notice we're using car, cdr and isEmpty, and using recursion.

With car, cdr and isEmpty, we can define functions recursively, like we usually would do in lisp or scheme (well, with slightly ugly template syntax :). Below are examples for len (which returns the length of a list) and sum (which adds up all the elements of a list of numbers).

And we could use lists in a main function, like below (notice how the lists share a lot of the storage, which saves both time and space).
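
Putting the pieces described above together, a minimal sketch might look like the following. Several choices here are assumptions rather than the original code: the empty list is represented by a null pointer, car and cdr assert that the list is non-empty, and printing is done by a named function, printList, instead of operator<< (the standard library already provides an operator<< for std::shared_ptr that prints the pointer value, and a named function sidesteps any clash with it).

```cpp
#include <cassert>
#include <iostream>
#include <memory>

template <typename T>
struct ConsNode {
    T head;
    std::shared_ptr<const ConsNode<T>> tail;
    ConsNode(const T& h, const std::shared_ptr<const ConsNode<T>>& t)
        : head(h), tail(t) {}
};

template <typename T>
using ConsList = std::shared_ptr<const ConsNode<T>>;

// The empty list is a null pointer (assumption of this sketch).
template <typename T>
bool isEmpty(const ConsList<T>& list) { return !list; }

// Builds a new list whose first element is elem and whose tail is the
// existing list; the existing list is shared, not copied or modified.
template <typename T>
ConsList<T> cons(const T& elem, const ConsList<T>& list) {
    return std::make_shared<ConsNode<T>>(elem, list);
}

// First element (head) of a non-empty list.
template <typename T>
const T& car(const ConsList<T>& list) {
    assert(!isEmpty(list));
    return list->head;
}

// Everything after the first element (tail) of a non-empty list.
template <typename T>
ConsList<T> cdr(const ConsList<T>& list) {
    assert(!isEmpty(list));
    return list->tail;
}

// Recursive printing, written only in terms of car, cdr and isEmpty.
template <typename T>
void printList(std::ostream& out, const ConsList<T>& list) {
    if (isEmpty(list)) return;
    out << car(list);
    if (!isEmpty(cdr(list))) out << ' ';
    printList(out, cdr(list));
}

// Number of elements, defined recursively.
template <typename T>
int len(const ConsList<T>& list) {
    return isEmpty(list) ? 0 : 1 + len(cdr(list));
}

// Sum of the elements of a list of numbers, defined recursively.
template <typename T>
T sum(const ConsList<T>& list) {
    return isEmpty(list) ? T() : car(list) + sum(cdr(list));
}
```

To see the sharing, build `ConsList<int> b = cons(2, cons(3, ConsList<int>()))` and then two lists on top of it, `cons(1, b)` and `cons(10, b)`: both reuse the nodes of `b`, so `cdr` of either one compares equal to `b`.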

Thursday, October 4, 2012

Functional sets in C++

I am taking a functional programming class. One of the exercises involves defining sets in terms of functions; since C++ now supports lambdas (and closures), I figured it would be interesting to implement them in C++ (the assignment was in a different language; it is similar to assignments in SICP).

The idea is to define a set (of integers, to simplify) as a function; you pass the element and it returns whether the element is in the set or not. So, we can define a Set as a function that takes an int and returns a boolean. I also define a predicate, which has the same type as a set. BTW, I'm using the new std::function type.
Now, we can define a singletonSet function, that takes an int, and returns a set that contains that int. We need to create a function on the fly, at runtime, and that's what lambdas let us do. We define a lambda with the [] syntax, and the = inside means to capture any variables (make a closure) by value. In this case, we're capturing the elem variable, so every call to singletonSet returns a new lambda, with the right value of elem!
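
A minimal sketch of these definitions, assuming the names Set, Predicate and singletonSet from the text (the parameter name x and the checkClosuresAreIndependent helper are hypothetical additions):

```cpp
#include <cassert>
#include <functional>

// A set of ints is a function that answers "is this int a member?".
using Set = std::function<bool(int)>;
// A predicate has exactly the same type.
using Predicate = std::function<bool(int)>;

// Returns a set containing only elem. The [=] captures elem by value,
// so each returned lambda carries its own copy.
Set singletonSet(int elem) {
    return [=](int x) { return x == elem; };
}

// Each call produces an independent closure with its own captured value.
void checkClosuresAreIndependent() {
    Set one = singletonSet(1);
    Set two = singletonSet(2);
    assert(one(1) && !one(2));
    assert(two(2) && !two(1));
}
```

Capturing by value matters here: elem is a parameter of singletonSet and no longer exists once the function returns, so a capture by reference would leave the lambda referring to a dead variable.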
And with that, we can define the union of two sets as a new function (lambda) that returns true if either set contains the element, and the intersection as a function that returns true if both sets contain the element, as in the following code:
You could use singletonSet and set_union, as follows:
And set_intersection as follows:
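
As a sketch of how these could fit together (set_union, set_intersection and singletonSet are the names used in the text; the parameter names, the capture-by-value choice and the checkSetOperations helper are assumptions of this sketch):

```cpp
#include <cassert>
#include <functional>

using Set = std::function<bool(int)>;

Set singletonSet(int elem) {
    return [=](int x) { return x == elem; };
}

// x is in the union if it is in either set. The lambda copies a and b,
// so the result stays valid after the arguments are gone.
Set set_union(const Set& a, const Set& b) {
    return [=](int x) { return a(x) || b(x); };
}

// x is in the intersection if it is in both sets.
Set set_intersection(const Set& a, const Set& b) {
    return [=](int x) { return a(x) && b(x); };
}

// Builds {1, 2} and {2, 3}, then checks their union-style membership
// and that their intersection contains only 2.
void checkSetOperations() {
    Set oneOrTwo = set_union(singletonSet(1), singletonSet(2));
    Set twoOrThree = set_union(singletonSet(2), singletonSet(3));
    Set onlyTwo = set_intersection(oneOrTwo, twoOrThree);
    assert(oneOrTwo(1) && oneOrTwo(2) && !oneOrTwo(3));
    assert(onlyTwo(2) && !onlyTwo(1) && !onlyTwo(3));
}
```

Note that nothing is ever enumerated or stored: each operation just wraps the existing functions in a new one, so membership tests on a deeply nested set walk the whole chain of lambdas.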
The assignment had a couple of other functions (foreach and exists), and you can find the full source, some unit tests and a demo at my github.