[ckan-discuss] Test speed, sqlite and sqlalchemy
rufus.pollock at okfn.org
Tue Dec 28 10:35:50 GMT 2010
On 26 December 2010 22:57, David Raznick <kindly at gmail.com> wrote:
> I am new to this project, so Hello. I am very interested in the concept
> behind it and decided to see what my expertise could bring to it. I know
> database performance and sqlalchemy very well. I have even helped out a
> little with the performance of sqlalchemy.
Wow, David, this is fantastic! BTW: would you like to be added to our
> The first thing I noticed were the slow tests. I wanted to look the tests to
> learn how the project worked. I was also concerned about the old sqlalchemy
> (0.4). So I decided to see what I could do...
> I have managed to get the tests down to running in about 9 mins on my oldish
> This was done by:
> * a patch I submitted on bug 868, which stopped the continuing dropping and
> creating of tables as this has a large overhead. Just simply deleting
> everything in all the tables is much faster.
Great, I'd wondered about this myself and it's great you've done it --
and very nicely with the table iterators -- that's a trick I'll learn
> * turning off durability in postgres. You do not need durability in
> testing. I even tried putting the database in a ramfs but I decided the
> complication was not worth it, the speed up was not significant.
> (http://www.postgresql.org/docs/9.0/static/non-durability.html) works for
> 8.x too.
Another great tip.
> * upgrading to sqlalchemy 0.5.7 (this gives a good 15-20% speed up).
> * sorting out some really slow bits in the tests. There where almost 3
> mins of unessential sleeping/timeouts.
> The trickiest bit was the upgrade to 0.5. I have all the tests working
> apart from 2 and I understand why they are not working. They are not
> critical and do not effect the speed. This is for another discussion.
> The functional subdirectory in the tests is by far the slowest (as it should
> be). Taking up 6.5 mins of the 9.
> A speed per test sheet has been posted on ticket 868.
> There has been some discussion on ticket 867
> (http://knowledgeforge.net/ckan/trac/ticket/876) about using and in memory
> sqlite as a backend. In those I am clearly against it. I think the
> complication it adds to the code base is not worth it. I also do not think
> that it will speed things up as much as hoped ...
I definitely agree with you about complication. It would be 'nice' to
support sqlite but no-one is going to run against it in production and
our main reason for doing so was test speed. I'd be happy for us to
drop ticket #876 (perhaps using the efforts done so far).
@Seb (as the owner of ticket 876): what do you think here?
> I have done some profiling on the whole test suite. Only about 110 seconds
> of the 9 mins is spent waiting for the database. So even if you managed to
> get rid of all of that time (which you will not as in memory databases are
> not instantaneous) you may only get a speed up of a third.
> Around 50% of the time is spent in sqlalchemy. So the bottleneck is there.
> I think upgrading to 0.6 may be as fruitful. sqlalchemy is going under
> some big speed improvements
> (http://techspot.zzzeek.org/2010/12/12/a-tale-of-three-profiles/). There
Ok, so 0.5 -> 0.6 looks like about a 20% reduction while 0.5 -> 0.7
looks like a 50% reduction (i.e. doubling in performance)!
> may be some benefit from not going through the orm part for some
> performance critical areas of the code. However, we will need to analyse
> real user database traffic to determine this.
> As a side note, the really good news is that hardly any time is spent in
> ckan code itself.
> I hope you all had a good seasonal period.
> Please get back to me on any details on the above!
We'll look through in more detail but let me reiterate the massive
thank-you for your work and advice here -- really useful and really
More information about the ckan-discuss