The Productive Programmer

Friday, September 18, 2009

Today I read The Productive Programmer .

I’ve already got a bunch of books piled up and waiting to be read, some of which I’ve reached the middle of and some of which I’ve read only the introduction to. I was bored today and this looked like an easy read that I could drop at any time. It is an easy read, but the fact is, it’s very catchy. It draws you in and doesn’t want to be let go. It’s a great way to spend half a day.

This could have been another one of those “94 things you need to know” books, but I think this title is way better. The tips are divided into 2 big sections and a few separate chapters within each one, giving the book some structure. One of the strange things about it is that its advice spans across 3 different platforms (*nix, OS X and Windows) and they’re all mingled together in the same chapter. I was put off by this at first, but after I’d gotten into the book a bit I realized that it is, in fact, a good idea. The whole theme of the book is programmer productivity and the reason it has this title and not a cheesy title with numbers like “94.3 productivity tips for programmers” is that the tips aren’t what’s important. The book is there to hit people in the head with and open their minds. You aren’t supposed to just use the tips provided, that’s unimportant. What’s important is that you have a shock and realize that you’ve been the wrong kind of lazy as a programmer. You’ve stopped automating, you’ve become a machine; in the author’s own words, computers have started “getting together late at night to make fun of their users”. As soon as I realized what this book was really about, I started reading the Windows tips as well. I also stopped a few times to look in my distro’s repository for tools that I had known of, but never used before, only now understanding their true purpose. There are many applications that seem just trendy at first, until you realize that even a small productivity boost is a big productivity boost. (Go check out gnome-do , I’d heard about it years ago, but never tried it until now).
The book contains tips ranging from application launchers, Firefox’s magic address bar, bash scripting commands to office productivity tips for killing distractions. Once again, the mindset is important, not the tips themselves. The big take-away from this book is beginning to constantly judge everything you do as a programmer. This isn’t new advice (at least for those who have read The Pragmatic Programmer”, which by the way is mentioned several times in this book), but I find it’s better emphasized in this book. At one point in the book, the author explains how it took his team one hour to devise a Ruby script to automate some simple task that would’ve required 10 minutes if done manually and finally only needed doing 5 times. One could say there was a loss in productivity, but as the author points out, one would be wrong. Those 50 minutes would’ve been spent with the brain turned off, whereas the hour writing the script was spent learning, focusing, practicing, gaining knowledge that can later be used on a different project. Some of us would probably have gotten bored in those 50 minutes and fallen into procrastination. That doesn’t happen when you’ve got a complex problem to solve.
That was the first part of the book, Mechanics. The second part, Practice is a bit harder to read, as it’s not just disparaged tips on very different applications. They’re quite two separate books actually. The majority of the examples in Part IIare java (they’re mostly readable even for someone who doesn’t speak the exact dialect of OOP that java uses) and this part is mostly about software construction as Steve McConnell would say, but it’s also about Java. I learned a new acronym: YAGNI which basically means that thinking ahead is bad. This is probably one of the advices that I feel the least guilty about, but which I sometime observe in people around me. Never program a feature you don’t urgently have a need for.
One of the good points of this book is the originality with which the ideas are expressed. Most of these ideas aren’t new, especially to anyone who’s read other software engineering books. The text is spiced up with little narratives of different adventures from the author’s experience as a consultant and there is also an Ancient Philosophers chapter and an explanation of the PacMan game’s way of functioning (although I didn’t understand how that’s supposed to make the game less enjoyable).

View Comments post separator

GSOC - it begins...

Thursday, April 23, 2009

My Fedora proposal got accepted to this year’s Google Summer of Code Program. You can look at a short abstract here . Now I’m going to try to explain what this project is about and what I did to prepare for being accepted, hopefully without going mad about how happy I am about it.

I started work on the Fedora Project almost a year ago. One day I popped on the mailing list and then on the irc channel of the infrastructure team and asked for something to do. Luckily, Toshio Kuratomi was on the watch and after giving me a short tour of the various projects he could help me get familiar with, I picked the package database. Most of the work I’ve done so far is in the pkgdb (the search capability is the most obvious thing I worked on). The overview on the front page describes it quite well; it’s got package information and it’s aimed at package developers. It’s not a very famous part of the fedora project websites, certainly not as famous as something like packages.ubuntu.com is for ubuntu. But that’s not what it was intended for, even if that’s what attracted me to the project at first. I liked the exposure of such a website, but also the fact that, at the time, it was easier for me to understand what it did and how it worked :).

The idea of making the package database more user-friendly as opposed to developer-centric wasn’t a new one. Toshio, the main developer had been thinking about it for a long time, but I guess it never really became a priority. The idea had also been proposed for last year’s GSOC, but it hadn’t been accepted (this scared me a bit when I found out). I picked this idea on a whim when I told Toshio I wanted to participate in this year’s GSOC on pkgdb and he asked me what exactly I wanted to do. I wasn’t expecting the question, so I answered with the first thing that came to mind. Looking back, I think it was a good choice.

All my involvement with the Fedora Project owes a lot to the best possible person who could have become my mentor for GSOC. The Infrastructure Team is a great one to work with, and the Fedora contributor community is made up of a lot of smart, fun and selfless people. I say this after having spent a lot of time lurking the IRC channels, the various mailing lists, the planet etc. and to a somewhat lesser extent interacting with other contributors. However, I wouldn’t have continued contributing if it weren’t for the continuous support and guidance of Toshio. I probably wouldn’t have been able to participate in the GSOC without the many discussions (starting in February) with Toshio about the proposal and the support when explaining the idea to other community members. Having said that, I think that being familiar with the pkgdb also helped a lot with writing the proposal. I didn’t have to waste time on getting to know the code, the community, the devs as I would have if I had written a proposal for a different project. I also had a fair idea of what would constitute a good proposal and a rough idea about how it could be implemented. I think this helped with my credibility in the eyes of the mentors who ranked my proposal.

I was never convinced I would get a spot on Fedora & JBoss’s accepted proposal’s list , but it was is a great thing to dream of. The butterflies in my stomach were killing me at the end of the waiting period, especially since it had lasted for more than 2 months. I now have a summer to work full time on my hobby :).

At the end of the summer, the fedora community will hopefully have a package database with package versions, size, dependencies, rss feeds, tagging, package reviews etc. There’s even a detailed schedule from my proposal you can drool on if you’re so inclined.

And hello, fedora planet! Sorry for being late.

View Comments post separator

testing a django blog's models

Thursday, September 25, 2008

This post is a continuation of this post and I’ll be using that schema to write tests on top of. Here it is, for easy reference:

from django.db import models
class Category(models.Model):
    nume = models.CharField(max_length=20)
class Post(models.Model):
    title = models.CharField(max_length=50)
    body = models.TextField()
    category = models.ForeignKey(Category)
    published = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)
    modified_time = models.DateTimeField(auto_now=True)
class Commentator(models.Model):
    name = models.CharField(max_length=50, unique=True)
    email = models.EmailField(max_length=50, unique=True)
    website = models.URLField(verify_exists=True)
class Comment(models.Model):
    body = models.TextField()
    post = models.ForeignKey(Post)
    author = models.ForeignKey(Commentator)
    approved = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)

Ok, so here’s what we’re testing: our model — the emphasis is on our, because we’re only testing our code. And mostly, we’re actually testing that the code in models.py corresponds with what’s in the database. All test methods must begin with the word test.

One of the most annoying things which took me a while to figure out was that the setUp method is run every time before one of the other methods is run. That means that if you want to test for uniqueness, you have to build a tearDown method if you want to run any other independent tests. This is why snippet A won’t work, but snippet B will.
Here’s the model:

class Category(models.Model):
    name = models.CharField(max_length=20, unique=True)

snippet A

class CategoryTest(unittest.TestCase):
def setUp(self):
    self.cat1 = Category.objects.create(name="cat1")
def testexist(self):
    # make sure they get to the database
    self.assertEquals(self.cat1.name, "cat1")
def testunique(self): 
    self.assertRaises(IntegrityError, Category.objects.create, name="cat1")

snippet B

class CategoryTest(unittest.TestCase):
def setUp(self):
    self.cat1 = Category.objects.create(name="cat1")
def testexist(self):
    # make sure they get to the database
    self.assertEquals(self.cat1.name, "cat1")
    self.assertRaises(IntegrityError, Category.objects.create, name="cat1")

The second snippet only calls the setUp method once because there is only one other method. But that’s not very nice. Ideally we’d to be able to run each test individually, so maybe we can write a tearDown method to be run after each other method, to restore the database.

However, there is an easier way to not have to write a tearDown method and that is using the django.test module which is an extention to unittest. All you have to do is import django.test instead of unittest and make every test object a sublclass of django.test.TestCase instead of unittest.TestCase.
Here is what it looks like now:

class CategoryTest(django.test.TestCase):
    def setUp(self):
        self.cat1 = Category.objects.create(name="cat1")
        self.cat2 = Category.objects.create(name="cat2")
    def testexist(self):
        # make sure they get to the database
        self.assertEquals(self.cat1.name, "cat1")
        self.assertEquals(self.cat2.name, "cat2")
    def testunique(self):
        self.assertRaises(IntegrityError, Category.objects.create, name="cat1")

Now, let’s test the Post class:

class Post(models.Model):
    title = models.CharField(max_length=50)
    body = models.TextField()
    category = models.ForeignKey(Category)
    published = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)
    modified_time = models.DateTimeField(auto_now=True)

There’s a bunch more stuff to test here, like the fact that everything gets to the database (title, body, category) and that everything has it’s right type/class.
We setUp a post, but also a category, since the test will be independent, but needs a Category to generate a Post.

class PostTest(django.test.TestCase):
    def setUp(self):
        self.cat1 = Category.objects.create(name="cat1")
        self.post1 = Post.objects.create(title="name",body="trala lala",
                category=Category.objects.all()[0])

Next, we need to do a bit of a trivial test to check that the title, the body and the right category get to the db

def testtrivial(self):
        self.assertEquals(self.post1.title, "name")
        self.assertEquals(self.post1.body, "trala lala")
        self.assertEquals(self.post1.category, Category.objects.all()[0])

I think this is a good way to test that the creation_time and modified_time are newly generated datetime.datetime objects:

def testtime(self):
    self.assertEquals(self.post1.creation_time.hour, datetime.now().hour)

No, wait. I think this looks a bit more professional:

def testtime(self):
        delta = datetime.now() - self.post1.creation_time
        self.assert_(delta.seconds < 10)
        delta_modified = datetime.now() - self.post1.modified_time
        self.assert_(delta_modified.seconds < 10)

So now, we’re looking for datetime objects that were generated less than 10 seconds ago. That’s really very generous since the time it takes to run the test from the time the setUp method is run is in the range of microseconds.
This test doesn’t show the true difference between modified and creation time. Modification time is changed every time the object is saved to the database while creation time is not. So let’s write a new test based on that knowledge:

 def testModifiedVsCreation(self):
        modified = self.post1.modified_time
        created = self.post1.creation_time
        self.post1.save()
        self.assertNotEqual(modified, self.post1.modified_time)
        self.assertEqual(created, self.post1.creation_time)

Testing for a boolean value is really easy:

 def testpublished(self): 
        self.assertEquals(self.post1.published, False)

And then there’s more than one way I can think of to test the Category ForeignKey:

def testcategory(self):
        self.assertEquals(self.cat1.__class__, self.post1.category.__class__)
        self.assertRaises(ValueError, Post.objects.create, name="name",
                body="tralaalal", category="ooopsie!")

In the end, I’ll go for the more general one, even though the second one is more excentric. So:

def testcategory(self):
        self.assertEquals(self.cat1.__class__, self.post1.category.__class__)
        self.assertRaises(ValueError, Post.objects.create, name="name",
                body="tralaalal", category="ooopsie!")

Btw, if you don’t know the errors (like ValueError — I didn’t know it), you can always drop to a manage.py console and try to Post.object.create(name="name",body="tralaalal", category="ooopsie!") and see if you get lucky.

Ok, passing on to the Commentator class:

class Commentator(models.Model):
    name = models.CharField(max_length=50, unique=True)
    email = models.EmailField(max_length=50, unique=True)
    website = models.URLField(verify_exists=True, blank=True)

We’re only going to test that the data gets to the database and that the name and email fields are unique. At this stage we can’t test the validation of the email and website fields. We’ll be doing that later, when we write the forms.
This should seem trivial by now:

class CommentatorTest(django.test.TestCase):
    def setUp(self):
        self.comtor = Commentator.objects.create(name="hacketyhack",
                email="hackety@example.com", website="example.com")
    def testExist(self):
        self.assertEquals(self.comtor.name, "hacketyhack")
        self.assertEquals(self.comtor.email, "hackety@example.com")
        self.assertEquals(self.comtor.website, "example.com")
     def testUnique(self):
        self.assertRaises(IntegrityError, Commentator.objects.create, 
                name="hacketyhack", email="new@example.com", 
                website="example.com")
        self.assertRaises(IntegrityError, Commentator.objects.create,
                name="nothackety", email="hackety@example.com",
                website="example.com")

Now, let’s get to testing the Comment class:

class Comment(models.Model):
    body = models.TextField()
    post = models.ForeignKey(Post)
    author = models.ForeignKey(Commentator)
    approved = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)

There won’t be anything new here. And this is when and why testing is boring. But, hey! A man’s gotta do, what a man’s gotta do.

class CommentTest(django.test.TestCase):
    def setUp(self):
        self.cat = Category.objects.create(name="cat1")
        self.post = Post.objects.create(title="name",body="trala lala",
                category=Category.objects.all()[0])
        self.comtor = Commentator.objects.create(name="hacketyhack",
                email="hackety@example.com", website="example.com")
        self.com = Comment.objects.create(body="If the implementation is 
        easy to explain, it may be a good idea.", 
        post=Post.objects.all()[0], author=Commentator.objects.all()[0])
    def testExist(self):
        self.assertEquals(self.com.body, "If the implementation is 
        easy to explain, it may be a good idea.")
        self.assertEquals(self.com.post, Post.objects.all()[0])
        self.assertEquals(self.com.author, Commentator.objects.all()[0])
        self.assertEquals(self.com.approved, False)
    def testTime(self):
        delta_creation = datetime.now() - self.comm.creation_time
        self.assert_(delta_creation.seconds < 7)
    def testCreationTime(self):
        # what if it's a modification_time instead?
        created = self.com.creation_time
        self.com.save()
        self.assertEqual(created, self.com.creation_time)

Now that we’ve written all the tests we have to make sure that they’re run against the actual database. Or better yet, a backup copy of it. Otherwise, the tests are useless, since django creates a new database based on the schema defined in models.py every time models.py test is run.

First, you’ll need to make a copy of django.test.simple (put it in your project’s directory for example). Then comment these lines:

# old_name = settings.DATABASE_NAME
# from django.db import connection
# connection.creation.create_test_db(verbosity, autoclobber=not interactive)
result = unittest.TextTestRunner(verbosity=verbosity).run(suite)
# connection.creation.destroy_test_db(old_name, verbosity)

And now, add this to your settings.py file:

TEST_RUNNER = 'myproject.simple.run_tests'

Be careful now. All the data in your database will be lost when you run manage.py test the next time. So back it up! First create a new database, say backup and then:

mysqldump -u DB_USER --password=DB_PASS DB_NAME|mysql -u DB_USER --password=DB_PASSWD -h localhost backup

You can reverse that when you’re done.

Here’s to show that it works (after I’ve made a little modification to the model, but not the database):

$ python manage.py test
..EEE..EEEEEE................
--> lots of tracebacks <--
----------------------------------------------------------------------
Ran 29 tests in 10.149s
FAILED (errors=9)

Ok, so that should provide a pretty good test coverage for now. Let’s go get breakfast!

View Comments post separator

blog database schema cu capsuni - Part 2

Monday, September 22, 2008

Tocmai am reușit (am găsit timp — furat timp) să scriu în django schema din postul trecut. Simplicity is divine:

from django.db import models
class Category(models.Model):
    nume = models.CharField(max_length=20)
class Post(models.Model):
    title = models.CharField(max_length=50)
    body = models.TextField()
    category = models.ForeignKey(Category)
    published = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)
class Commentator(models.Model):
    name = models.CharField(max_length=50, unique=True)
    email = models.EmailField(max_length=50, unique=True)
    website = models.URLField(verify_exists=True)
class Comment(models.Model):
    body = models.TextField()
    post = models.ForeignKey(Post)
    author = models.ForeignKey(Commentator)
    approved = models.BooleanField()
    modified_time = models.DateTimeField(auto_now=True)

Pe lângă faptul că toate tipurile de date au nume și explicații pe care le poate înțelege oricine, django va folosi datele astea atunci când va construi interfața de administrare.
E interesant că trebuie să declari toate tabelele în ordine. La început pusesem Categoria ultima și n-o găsea când vroia să facă ForeignKey-ul de la Post. M-a cam răsfățat OOP-ul.
Alt lucru fain e că de pe acum se pregătesc feature-uri interesante cum ar fi URLField.verify_exists care verifică toate url-urile introduse și le refuză dacă primește 404. Așa că de-acum n-o să mai poată nimeni să pună variabile metasintactice de genu: caca, mumu și altele în field-ul ăla!

Și acum un mysql describe3 pentru tabelele rezultate:

mysql> describe revolution.blahg_category;
 +-------+-------------+------+-----+---------+
 | Field | Type        | Null | Key | Default |
 +-------+-------------+------+-----+---------+
 | id    | int(11)     | NO   | PRI | NULL    |
 | nume  | varchar(20) | NO   |     | NULL    |
 +-------+-------------+------+-----+---------+
 mysql> describe revolution.blahg_post;
 +---------------+-------------+------+-----+---------+
 | Field         | Type        | Null | Key | Default |
 +---------------+-------------+------+-----+---------+
 | id            | int(11)     | NO   | PRI | NULL    
 | title         | varchar(50) | NO   |     | NULL    |
 | body          | longtext    | NO   |     | NULL    |
 | category_id   | int(11)     | NO   | MUL | NULL    |
 | published     | tinyint(1)  | NO   |     | NULL    |
 | creation_time | datetime    | NO   |     | NULL    |
 | modified_time | datetime    | NO   |     | NULL    |
 +---------------+-------------+------+-----+---------+
 mysql> describe revolution.blahg_commentator;
 +---------+--------------+------+-----+---------+
 | Field   | Type         | Null | Key | Default | 
 +---------+--------------+------+-----+---------+
 | id      | int(11)      | NO   | PRI | NULL    |
 | name    | varchar(50)  | NO   | UNI | NULL    |
 | email   | varchar(50)  | NO   | UNI | NULL    |
 | website | varchar(200) | NO   |     | NULL    |
 +---------+--------------+------+-----+---------+
 mysql> describe revolution.blahg_comment;
 +---------------+------------+------+-----+---------+
 | Field         | Type       | Null | Key | Default |
 +---------------+------------+------+-----+---------+
 | id            | int(11)    | NO   | PRI | NULL    | 
 | body          | longtext   | NO   |     | NULL    |
 | post_id       | int(11)    | NO   | MUL | NULL    | 
 | author_id     | int(11)    | NO   | MUL | NULL    |
 | approved      | tinyint(1) | NO   |     | NULL    |
 | modified_time | datetime   | NO   |     | NULL    |
 +---------------+------------+------+-----+---------+

TADA!
Trebuie să studiez de ce se strică formatarea (de fapt știu de ce, trebuie să scot textile din blockul ăla), dar ideea de bază e clară.

3merci gheorghe!

View Comments post separator

blog database schema cu capsuni

Tuesday, September 16, 2008

Am reușit să-mi scriu schema bazei de date a viitorului meu blog. Ah, da, m-am apucat de treabă. O să folosesc django cred (și deja mi s-a spus că era previzibil) deși încă mai am timp să mă răzgândesc. N-am găsit în 2 minute un script care să-mi deseneze scheme, așa că pun relațiile în engleză aici. Come bash me!

Post

  • belongs_to Category
  • has_many Comments

Comment

  • has_one Commentator
  • belongs_to Post

Commentator

  • has_many Comments

Category

  • has_many Posts

Ia să încerc să fac și niște tabele din ce scrie mai sus.

Post
id || title || body || category_id || created_at || published

Comment
id || post_id || commentator_id || body || approved || created_at

Commentator
id || name || email || website || gravatar_url

Category
id || name

Am renunțat la tabele și am improvizat o formatare. Sper să fie lizibil.

E evident ceea ce nu am făcut sper. Nu am lăsat commentatorii cu commenturile lor ceea ce ar fi dus la o relație cu 4 coloane redundante (nume, email, website, gravatar):

Comment
id || autor || email || website || gravatar_url || post_id || commentator_id || body || approved || created_at

Pe parcurs o să mai adaug rating la posturi și alte lucruri care mai îmi vin în minte. Ratingul o să încerce să fie ceva complex cu sus/jos, dar asta mai târziu.
Deci? Ce părere aveți? Ce să mai adaug? Am greșit ceva? Mă încadrez în forma normală 5? :-)

View Comments post separator
Powered by pyblee