Thursday, June 01, 2006

Use of non-basic techniques

I have asked myself this question several times, and more often than I'd like, the conclusion I reach is not the one I would wish for.

  • Is it viable, or even responsible, to use "advanced" programming techniques in the workplace?

The problem here is related to how people use the languages they have at their disposal.
There is a recurring joke about how C++ was named: the name of the language is itself a post-increment expression, so its value is still that of plain C, and that is why so many people use C++ just as they used C (or as a hybrid I jokingly call C+/-).
C++ is widely regarded as a language that is difficult to master. As such, is it a good idea to use, for example, template metaprogramming, when there is a (sadly strong) possibility that someone who isn't very comfortable with regular templates will have to look at your code?
One could argue that if the code is well designed (and eventually bug-free), the implementation details should be just details, and other users wouldn't need to know them. But considering that someone else will probably have to maintain your software, is it really wise to use techniques that might take them some time to understand?

From a purist point of view, I believe it is a good idea to do so, because it will probably force your colleagues to learn something new and thus become better professionals.
On the other hand, the saying "time is money" springs to mind.
Sometimes this time might not even be available, and we all know what tight deadlines can do to code quality...

What is your opinion about this issue?

Monday, November 21, 2005

Using AspectJ to cache function call return values

Caching method results is an easy way of improving performance when you're dealing with slow functions, such as:
- fetching database results;
- complex mathematical calculations;
- poorly designed algorithms that you're too lazy to improve :-)

Wouldn't it be nice if the programming language you're using allowed you to cache method results automatically?
As far as I know, only Perl and Python allow this (maybe I'll cover them in future posts). What about Java? Are we doomed to implement caching over and over?
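
To make the "over and over" part concrete, here is a minimal sketch of what hand-written caching tends to look like in plain Java. The class PriceService and its methods are hypothetical, invented only for illustration; the point is that the lookup, compute and store steps end up tangled with the business method, and would have to be repeated for every method that needs caching.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service, used only to illustrate hand-written caching.
class PriceService {
    private final Map<String, Double> cache = new HashMap<>();
    private int slowCalls = 0;

    // The caching logic lives inside the business method itself.
    double priceOf(String product) {
        Double cached = cache.get(product);
        if (cached != null) {
            return cached;               // seen before: reuse the stored result
        }
        double price = slowLookup(product);
        cache.put(product, price);       // first call: remember the result
        return price;
    }

    // Stand-in for a slow operation such as a database query.
    private double slowLookup(String product) {
        slowCalls++;
        return product.length() * 1.5;
    }

    int slowCalls() {
        return slowCalls;
    }

    public static void main(String[] args) {
        PriceService service = new PriceService();
        service.priceOf("widget");
        service.priceOf("widget");
        System.out.println("slow lookups performed: " + service.slowCalls());
    }
}
```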

AspectJ allows you to intercept method calls (along with many other useful things), so what about creating an aspect that intercepts that really slow method, checks whether it has been called before with the same arguments and, if so, returns the previously returned value without calling the function?

So, to test the concept I had to create a small program that called a slow function over and over with the same arguments.

The following program calculates 1000! a thousand times:


import java.math.BigInteger;

public class Factorial {
    public static void main(String[] args) {
        BigInteger n = BigInteger.ONE;
        Factorial f = new Factorial();
        long begin = System.currentTimeMillis();

        // Compute 1000! a thousand times; every call repeats the same work.
        for (int i = 0; i != 1000; ++i)
            n = f.fact(1000);

        long end = System.currentTimeMillis();
        System.out.println(n);
        System.out.println("Took "
                + (end - begin) / 1000.0
                + " seconds");
    }

    // Iterative factorial; expects a non-negative value.
    public BigInteger fact(int value) {
        if (value == 0 || value == 1)
            return new BigInteger("1");
        BigInteger n = new BigInteger(
                Integer.toString(value));

        // Multiply by value - 1, value - 2, ..., 2.
        for (int i = value - 1; i != 1; --i)
            n = n.multiply(new BigInteger(
                    Integer.toString(i)));

        return n;
    }
}


On my machine (PowerPC 7447, 1.2 GHz), the loop took about 16.6 seconds to run.

Now, I'll present two simple aspects:

Generic caching code:
- intercept method calls
- generate a hash key for the method arguments by concatenating the string representation of all the arguments
- on the first call, store the method return value in the hash table and return it
- on subsequent calls, return the cached value


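import java.util.HashMap;

// pertarget: one aspect instance (and so one cache) per target object
// of the calls matched by method().
public abstract aspect Cache pertarget(method()) {
    HashMap<String, Object> cache =
            new HashMap<String, Object>();

    // Builds the key by concatenating the string form of each argument.
    String hashKey(Object[] arguments) {
        String key = "";
        for (Object object : arguments)
            key += object;

        return key;
    }

    Object around(): method() {
        String key = hashKey(thisJoinPoint.getArgs());
        if (cache.containsKey(key)) {
            // Cache hit: skip the intercepted call entirely.
            return cache.get(key);
        } else {
            // Cache miss: run the original method and remember its result.
            Object value = proceed();
            cache.put(key, value);
            return value;
        }
    }

    // Concrete sub-aspects decide which calls get cached.
    abstract pointcut method();
}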


Next, a second aspect selects the factorial calculation as the call to intercept:


import java.math.BigInteger;

public aspect CacheFactorial extends Cache {
    pointcut method() :
        call(BigInteger Factorial.fact(int));
}


Now the exact same program takes only 0.11 seconds! That's a really good improvement with almost no effort, thanks to the power of AspectJ.

Now, before using this code, please note that:
- The cache will grow without bound over time (unless you always use the same argument, but in that case you don't need a cache at all), eventually exhausting all available memory.
- Beware of collisions in the key-generating code: the arguments ("1", "23") and ("12", "3") both produce the key "123". There's no point in getting the wrong answer really fast.
- The cache doesn't expire. If you're caching a database query, you'll probably want to refresh the cache now and then.
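
As an illustration of how the first two caveats might be tackled, here is a minimal plain-Java sketch (it is not the improved aspect promised for future posts). BoundedCache and its methods are hypothetical names. It assumes a fixed maximum size with least-recently-used eviction, built on java.util.LinkedHashMap in access order, and it uses the list of arguments itself as the key, so that ("1", "23") and ("12", "3") no longer collide. Expiry is not addressed here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical bounded cache: drops the least recently used entry once full.
class BoundedCache<V> {
    private final Map<List<Object>, V> entries;

    BoundedCache(final int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<List<Object>, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<List<Object>, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // The key is the argument list itself; List.equals compares element by element.
    static List<Object> key(Object... args) {
        return Arrays.asList(args.clone());
    }

    // Returns the cached value, or computes and stores it on the first call.
    V get(List<Object> key, Supplier<V> compute) {
        if (entries.containsKey(key)) {
            return entries.get(key);
        }
        V value = compute.get();
        entries.put(key, value);
        return value;
    }

    boolean contains(List<Object> key) {
        return entries.containsKey(key);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedCache<String> cache = new BoundedCache<>(2);
        cache.get(key("1", "23"), () -> "first");
        cache.get(key("12", "3"), () -> "second");
        System.out.println(cache.size() + " distinct entries");
    }
}
```

Inside the aspect, the same idea would mean building the key from thisJoinPoint.getArgs() as a list rather than as a concatenated string.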

Do not use this code as it is. I'll improve it in future posts. Stay tuned!


There are thousands of articles out there about the subject.
There's More Than One Way To Do It! :-)

Tuesday, November 15, 2005

Language Features vs. Language as Platform

I've had several heated arguments in the past on "Which is the Best Programming Language".

People are passionate about their personal preferences, and sometimes defend them with zealotry.
They go on listing the wonderful features of language X, while pointing out all the problems with language Y. While these discussions can make sense in the academic community or in some niche market, when we look at the big picture this is nothing but nitpicking.

There's a lot more to a programming language than its grammar.
Say you're going to start a new project/company. Being the project manager, you're free to choose whatever you feel necessary to implement it (most of the time you can't - you have to live within the constraints of your working environment, but for the sake of example let's assume you can).

What should you be asking yourself?

- Is it cleanly object oriented?
- Does it support aspect oriented programming?
- Is it compiled/interpreted?
- Does it support generic programming?
- Does it support design by contract?
- (insert your favorite language feature here)

No. In my opinion, the important questions (in no particular order) are:

- Is it available/supported on my target environment? (operating system, architecture...)
Hey, this programming language is so great, but my potential customers are using Windows, Mac OS X, Linux.... Can I run it there?

- Does it have an active community?
Internet forums are a priceless resource for learning. Google is your friend.

- How rich is the bundled library?
The richer the library the less you have to reinvent the wheel.

- Is it easy to use?
On the other hand there's no point in having a rich library if your team doesn't know how to use it.

- Is it easy to integrate with other tools?
Can you access databases? Web services? Operating system features?
Do you need third party libraries to make integration possible? Are they available on the target platform? (...)

- Is it easy to find knowledgeable programmers?
Alone you can only go so far. You need a team. Do you already have them? Are they motivated to use language/tools X? Even if you already have them, they will get bored. People will move to work on what interests them the most. Can you replace those who leave?

- Is it evolving?
Change is a two-edged sword. New features can make your life easier, but they can also break your old code. Understand how the language evolved and what features lie ahead. Plow through bug tracking tools to check if bugs are being fixed. Get involved in the community. Contribute.

- How good are the supporting tools?
There are always hardcore hackers who prefer using a bare-bones editor, but a good development environment makes a huge difference. Can you debug/profile/refactor(...) easily?

- How much does it cost?
I'm not talking about the language alone. You may download it for free, but you may have to pay for the required hardware, operating systems, human resources, licenses (...)

- Will the market accept it?
For technical or political reasons a customer may refuse your project. Listen to your customers. What are the industry trends? Are you buzzword compliant?

- For how long will the project be supported?
One month? One Year? As long as the sun burns?
Technology changes a lot. Software does rot. As far as I know no one can predict the future, but you can always make an educated guess.
Depending on the answer, you have to ask the same questions, not only about your project but also about your development environment (language, operating system, development tools,...) :-)

So, you're the one in charge. Which language are you going to choose?

On interfaces

A concrete class may be seen as a comprehensive way of describing a set of objects sharing certain characteristics. These objects are said to be its instances.

An abstract class also describes comprehensively a set of objects. However, the objects in this set are not described in full. Rather, their description in full is always provided by one of the concrete classes which specialize the abstract class. Abstract classes, hence, correspond in terms of objects to the unions of the objects which are possible instances of their concrete specializations. Since it is always possible to provide a further concrete specialization of an abstract class, describing a whole new set of possible instances, abstract classes are, in a sense, open.

Classes, whether abstract or not, are always descriptions of objects. Hence, their identifiers are usually nouns or small noun phrases. E.g., Vehicle (abstract class) and Car (depending on the context, may either be concrete or abstract).

Interfaces should not be confused with abstract classes, though the lack of interfaces in some languages forces programmers to use abstract classes as an alternative. Interfaces represent abilities which objects may have. The abilities are supposed to be related to behaviour. Having a certain ability is usually insufficient to fully describe a possible, existing, object. Hence, interfaces usually represent partial abilities of objects. Objects may have multiple abilities. Since existing objects are always instances of concrete classes, their possession of certain abilities must be declared at the class level. Hence, classes implement interfaces. Multiple classes may implement the same interface, and multiple interfaces may be implemented by a single class.
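
A small Java sketch may help picture this. All the names below (Washable, Rideable, Bicycle, Dog, AbilityDemo) are made up for illustration: one class declares two abilities, two unrelated classes share one ability, and the washAll method depends only on the ability, not on the kind of object providing it.

```java
import java.util.ArrayList;
import java.util.List;

// Each interface names one ability (hypothetical examples).
interface Washable {
    String wash();
}

interface Rideable {
    String ride();
}

// A single class may declare several abilities...
class Bicycle implements Rideable, Washable {
    @Override
    public String wash() {
        return "bicycle hosed down";
    }

    @Override
    public String ride() {
        return "bicycle pedalled";
    }
}

// ...and otherwise unrelated classes may share the same ability.
class Dog implements Washable {
    @Override
    public String wash() {
        return "dog shampooed";
    }
}

class AbilityDemo {
    // Works with anything that can be washed, whatever its class.
    static List<String> washAll(List<? extends Washable> things) {
        List<String> log = new ArrayList<>();
        for (Washable thing : things) {
            log.add(thing.wash());
        }
        return log;
    }

    public static void main(String[] args) {
        List<Washable> things = new ArrayList<>();
        things.add(new Bicycle());
        things.add(new Dog());
        System.out.println(washAll(things));
    }
}
```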

Class specialization, or extension, may be seen as a two-way process. Firstly, the specialization implements the behaviour of the more general class, i.e., the more general class is seen merely as an interface, whose abilities the specialization declares to provide. Secondly, the specialization inherits the implementation available in the more general class (which may or may not be complete, according to whether it is concrete or not - though specialization of concrete classes is not recommended).

It is the implementation of behaviour which provides the notion of sub-typing. Sub-typing allows dynamic polymorphism, which is thus (in a way, just) possible through interfaces. Sub-classing is sub-typing plus implementation sharing. In Java multiple sub-typing is possible, but multiple sub-classing is not.
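
The distinction can be sketched in Java as follows, reusing the Vehicle and Car example names (the members and the SubtypingDemo class are assumptions made for the sketch). Car sub-classes Vehicle, so it inherits the implementation of describe(); it also sub-types Comparable, from which it inherits no implementation at all, only the obligation to provide compareTo. A second "extends" would be rejected by the compiler, while further "implements" entries would be fine.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Abstract class: a partial description, completed by concrete specializations.
abstract class Vehicle {
    private final String name;

    Vehicle(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    abstract int wheels();

    // Implementation shared with every sub-class.
    String describe() {
        return name + " with " + wheels() + " wheels";
    }
}

// Sub-classing (extends): sub-typing plus inherited implementation.
// Sub-typing only (implements): behaviour is promised, nothing is inherited.
class Car extends Vehicle implements Comparable<Car> {
    Car(String name) {
        super(name);
    }

    @Override
    int wheels() {
        return 4;
    }

    @Override
    public int compareTo(Car other) {
        return name().compareTo(other.name());
    }
}

class SubtypingDemo {
    // Collections.sort relies only on the Comparable ability of the elements.
    static List<String> sortedNames(List<Car> cars) {
        List<Car> copy = new ArrayList<>(cars);
        Collections.sort(copy);
        List<String> names = new ArrayList<>();
        for (Car car : copy) {
            names.add(car.name());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Car> cars = new ArrayList<>();
        cars.add(new Car("Mini"));
        cars.add(new Car("Beetle"));
        System.out.println(sortedNames(cars));
        System.out.println(cars.get(0).describe());
    }
}
```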

Since interfaces declare behaviour, their identifiers should be adjectives (or adjective phrases) representing abilities. In the English language most of these adjectives end with the suffixes "-able" or "-ible", from the Latin "-abilis", meaning "capable of". E.g., the classics Comparable or Cloneable.

If your best choice for the identifier of some class or interface does not fit with the above recommendations, then think again about your modelling: it is most likely wrong.

Finally, yes, it is true: the name "interface" is not very good. One should say that a class "provides" a certain "behaviour" instead of simply saying that it "implements" a certain "interface". Yet the lack of support for programming by contract makes the explicit contracting of behaviour impossible, so the compiler can merely check whether... interfaces are implemented. Sigh...

From: Rambling about Java

Tuesday, October 04, 2005

The legacy

C++.
My language of choice, unfortunately the victim of countless myths and prejudices these days.

Certainly one of the most unfounded and, in my opinion, most unfair is that C++ is an outdated, old-fashioned language.

My humble opinion: although the standard stood still for some years, the ways the language is used and its best practices have changed radically over its lifetime.
C++ is frequently compared to Java and, no less frequently, considered less "advanced". There is a big difference to note here. Java was created with a philosophy almost opposite to that of C++. While the former tries to hide sources of error from the programmer, C++ leaves the power (and the responsibility) in the hands of whoever uses it. This certainly gives C++ a steeper learning curve than simpler languages, but it will never make it less "advanced". Quite the contrary...

I don't know anyone who, after understanding what is possible, and what has already been done, in projects such as the famous STL, the Boost libraries or the Loki library by brilliant individuals, whom I consider hackers worthy of the name these days, still looks at C++ with the same eyes of disdain or even fear...

My fear is of the momentum that competing languages may gain, attracting more and more new talent and effort to their cause, and thus leaving C++ depleted of what makes any language thrive: the support, the choice and the widespread use by programmers.

And where does this prejudice come from?
For the most part, I believe it comes from the strong legacy C++ carries: its struggle to maintain a high level of compatibility with C.
What started out as a strong point for the adoption of the language is, currently and in my opinion, the root of many problems with C++ code (and, ironically, also what made possible an easy, gradual adoption of C++ by old-guard programmers).
By using C++ as just C with classes (that is, what C++ once was, back in the beginning, in the 1980s), we end up with what I like to call "C more or less" code...
It is this kind of code that, despite being valid code, gives C++ the bad image it seems to have in the eyes of those who have not yet embraced it heart and soul.