Monday, 19 October 2009

JQuery, Project Euler and J

My Journey took me down a different path than I thought. I started working on system administration, customization of web applications and development of web applications. Besides liking web.py for lightweight user interfaces to Python programs, I went back into the world of PHP, HTML and JavaScript, where my programming life started.

It began with a pleasant surprise: JQuery.
JQuery makes manipulating the DOM easy and I actually enjoy web development now. I understand CSS layouts a lot better, too and given the choice, I would rather write a web UI than a traditional UI using for instance wxWidgets.

Now that I actually understand closures and object systems, I find JavaScript to be a very elegant language. A very elegant language with many badly hidden warts, that is.


In my quest to learn enough of Python to use it for quick scripts, I also worked on Project Euler from time to time. The problems are mostly very interesting and creating a solution is a nice challenge. I was also confronted with the limits of Python (speed), up to the point where I used SBCL or C and GNU MP lib to get results in an acceptable time.

I will tackle more problems as time permits, but it seems like I already caught most of the low-hanging fruit.


Continuing my tradition of looking at programming languages outside the mainstream, I started playing with J, a descendant of the APL.

It's hard to believe I would one day understand something like
(+/%#)1 2 3 4 5
and consider it elegant, but J did this to me.

I find it very awkward to use for everyday tasks, but for numerical computing it's one of the most beautiful languages I've ever seen. Learning the basics was a lot easier than I thought, the excellent primer offers a nice introduction to the language.

Although I haven't experimented much yet, I'm amazed by the visualization possibilities of J. And most of all: It's fun. That's most important of all things.

Thursday, 9 July 2009

Smalltalk and Lisp aren't for me

After having more exposure to both Smalltalk and (Common) Lisp, I realized that their why of doing things just didn't feel natural. I worked more because I set the goal to learn the language, not because I was passionate about them. It was well worth it and I'm sure I'll have a look at them at some later moment again.

Smalltalk
What a well engineered language! I said "engineered and I meant it. The structure is very logical, the syntax easy and code is well readable.

But everything is foreign. It lives in its own world separated from the rest of my computer and this world is strange. I feel I would have to live in this world, too, if I wanted to use Smalltalk more extensively. It's not for me.

Nonetheless, I recommend everyone to learn a bit about Smalltalk. It helped me a lot to improve my understanding of good object-oriented design, people should look at Smalltalk instead of Java or (like in my case) C++ to learn about OO.

A good starting point is Squeak by example. It helps the user getting started and navigating around in this strange, yet fascinating world that is Smalltalk (Squeak).

Common Lisp

Talking about fascinating, I continue to CL. I've worked with CL from time to time, learning a lot of concepts on the way. Now I finally got hold of a copy of Paul Graham's ANSI Common Lisp, after having read parts of Practical Common Lisp and On Lisp in the past.

While in the past, I was more interested in understanding the possibilities of these concepts the language offers, I actually wrote code this time.

It was surprisingly hard to solve many problems, because I couldn't express my ideas like I wanted. Like in the past, I could describe the solution with words. But translating these words into working code was not as straight-forward as I exprected it to be.

A simple example: Take a string and return a list with the number of occurences for each character, sorted by the number of occurences.

The solution is easy: The solution will be stored in a list of cons-cells with the character and the number of occurences (i.e. an assoc list). To fill this list, look at each character of the string and see if it's already in the list. If it is, increase the number value by one, if not add it to the list with count 1. In the end sort the list by the number of occurences.

Nonetheless it took me quite some time to write the solution in Lisp. A big part was that I had to learn about functions the standard library offers and how exactly they work. The standard library is huge and good documentation seems to be rather sparse, making this more difficult than it should be.

In the end I had two main complaints:

a) Writing these wonderfully concise functions is a lot of work and trial and error. I'm sure it's different for people who think in Lisp, but I found myself to be unproductive. My first steps with Perl, for instance, resulted very fast in something usable. With CL not so much.

b) The library seems to include functions for everything and the function names are very long. This makes code unpleasent to read and actually harder to understand. At this point I would have loved some syntactical sugar for common operations instead of calling a function for everything.

Now what?

About a week ago, I read about a Django tutorial experience and decided to try it out. I was amazed by the ease of use and conciseness and decided to learn Python.

So I started a new project and I continue to be amazed. Python is certainly not perfect, but I find it easy to express ideas and the documentation is very good.

What I like most, is how little work is necessary to do things which are difficult in other languages. I started unit testing, because it is so easy with nose, and it really helped to improve my code.

Functions are generally only few lines with short operators. So far the only thing I missed is pattern matching, which would have made my code easier.

I'm looking forward to doing more with Python, let's see what other things I encounter.

Monday, 6 April 2009

C# and Memoization

I'm finally beginning to understand why people like C# - it's familiar for C++-programmers and it includes some of those powerful constructs from functional programming I was always missing when writing code in C++. Not having to worry about freeing memory is nice, too.

Imagine my excitement when we had to write an easy integer factorizer to simulate a workload task related to threading. I knew I could do closures, so it would be trivial to do memoization to speed up this task, too.

Let's start with the following simple function to check if a number is a prime number. Note the caching of the square root - otherwise it would calculate the square root with every loop - and that's expensive.
static bool isPrim(ulong n) {
if (0 == (n % 2)) {
return false;
}
double x = Math.Sqrt(n);
for (ulong i = 3; i <= x; i += 2) {
if (0 == (n % i)) {
return false;
}
}
return true;
}


We can easily transform this to a generated function - delegates are nice and timer benchmarks show no measureable performance loss. There are no changes to other code necessary.

The generated function is needed to have a closure. This means any variables inside the generator functions will be stored with the generated function and can be accessed and modified when the function is called. Imagine for instance a counter function, where this behaviour is very useful.

But for now we only rewrite the function to be generated.
static isPrimDelegate isPrim = isPrimGenerator();
delegate bool isPrimDelegate(ulong n);

static isPrimDelegate isPrimGenerator() {
return delegate(ulong n)
{
if (0 == (n % 2)) {
return false;
}
double x = Math.Sqrt(n);
for (ulong i = 3; i <= x; i += 2) {
if (0 == (n % i)) {
return false;
}
}
return true;
};
}


We can easily see that the code could be better. It shouldn't test division for 15 for instance, because everything dividable by 15 is dividable by 3 and 5, too. We could make it faster by only checking for division by prime numbers - and that's just what memoization helps us to do.

We'll save a list with prime numbers along with the function and check only for division by these numbers. This way, we'll have the work of finding primes only once and we'll get a nice speed up.

As you see, adding the list to the function is as simple as declaring a variable in the generator function. We add the primes 2 and 3 to make the later code simpler, adding more primes up-front didn't result in any faster code for me.

Now we check for division with all primes in the list. If we reach the end of the list before we checked all primes in the range, we lazily search for the next prime. We still check for 15 when searching for primes, but this will be done only once and not every time.
static isPrimDelegate isPrim = isPrimGenerator();
delegate bool isPrimDelegate(ulong n);

static isPrimDelegate isPrimGenerator() {
List prims = new List();
prims.Add(2);
prims.Add(3);
return delegate(ulong n)
{
int index = 0;
ulong m = prims[index];
double x = Math.Sqrt(n);
while (m <= x) {
if (0 == (n % m)) {
return false;
}
if (prims.Count == index) {
int index2 = 0;
bool prim = false;
while (!prim) {
m += 2;
prim = true;
double x2 = Math.Sqrt(m);
while (prims[index2] <= x2) {
if (0 == (m % prims[index2])) {
prim = false;
break;
}
index2++;
}
}
prims.Add(m);
} else {
m = prims[index++];
}
}
return true;
};
}

We end up with faster code by trading memory for CPU time. In most cases it's worth it as the second call to a function will be near-instanteous, if all primes up to the square root of the parameter have already been cached.

This can also be applied to other functions, like costly data base lookups where the result doesn't change, but is needed multiple times. As it is transparent to the caller it's an obvious optimization to make, when profiling finds a function slowing down the code.

Impressions of Squeak

It's slow. Granted, this was on a 450 MHz system which I accessed via X-forwarding, but I'll better prepare the image on my faster computer and try to re-assess performance on a headless session.

Apart from being slow, it has also questionable looks with very bright colours and a comic-style font. But behind these looks...there's a system which seems very well thought-out and is intuitive to use for a first time user.

Installation of programs is made easy by an integrated package manager. Everything is graphical and clear, although it can be confusing with many windows opened.

Now I'll have to learn more about Smalltalk and using Squeak before I have a closer look at AIDA/Web and Seaside.

Thursday, 2 April 2009

First stop - Smalltalk

The first frameworks I'll look at will be written in Smalltalk. I set up an old PC (450MHz Celeron, 256MB RAM) with Debian Lenny and installed Squeak and also got an old (1996, pre Squeak) Smalltalk book from the library to learn enough of the language.

A short look at the two available frameworks:

Seaside

This framework differs from others by managing state using continuations, which makes it very interesting. It seems to attract developers as shown by projects such as the CMS pier and the related meta-description framework magritte (If I understand it correctly, it servers to create re-usable models of re-occuring tasks like data validation.). Recent versions advertise a big reduction of memory usage, which seems to have been a problem in the past.

20090410: To gain a better understanding of continuations in Seaside, have a look at Julian Fitzell's blog post. It's very interesting, thanks to him for pointing to it.

AIDA/Web

AIDA/Web was first presented in 1996 and still actively in development. It seems to be very mature and powers for instance the Squeak home page. It is based on MVC, supports Ajax and even includes a web server. It advertises integrated session and security support and REST-like urls, but I have yet to find out how they do it. It claims simplicity yet to allow for complex applications, just like the language Smalltalk.

Wednesday, 1 April 2009

Where the journey starts and where it might lead.

We're at day 0 of this project, so let's have a look at where I stand now.

I'm a student of commercial information systems. During my studies and in intern and open source jobs, I've mostly worked with multi-platform or Unix/Linux applications in C/C++, but also touched Java and C#. In the time before, I've briefly worked with XHTML, CSS, PHP and MySQL during an internship, back when nobody really knew what AJAX meant.

About two years ago, due to a strong interest in linguistics, I became interested in functional programming and experimented with languages like Erlang, Haskell and finally Common Lisp.

You should therefore expect lesser known frameworks on my list, written in more obscure languages. This includes Furnace written in Factor or Seaside written in Smalltalk.

Depending on how I like a framework and the language it's written in, I might not write much about it. And I don't know yet what to write, but we're only at day 0 and there's no reason to worry about this now. Let's see, I'm looking forward to this experience.