Analysis of Java implementations of Fibonacci Heap

Moved to new blog:

This entry was posted in Algorithms, Computing and tagged , . Bookmark the permalink.

14 Responses to Analysis of Java implementations of Fibonacci Heap

  1. david kilmer says:

    Thanks for posting this, Nathan! I’ve been trying to learn about Fibonacci heaps for an A* implementation (in Delphi, alas), and I was studying the nutch class because the code was easy to understand. The allocation of the array in consolidate() had me befuddled, but now it all makes sense. Thanks again.

  2. d3sphil says:

    Thanks for the post!
    This was the key to helping me understand the Fib heap (implemented in Actionscript for an A* algo as well).

    That said, I haven’t tested your Fib heap code in GraphMaker. But I was using it as a reference for my implementation and I ran across a large bug for mine dealing with duplicates. My implementation died on duplicates because in the last step of consolidate() when looping through the root list I was comparing against the previously randomly set min and using a “< ". However, this will not work for a heap of all duplicates. My quick fix was to set min to null before the last loop and then add an extra check for min == null in the loop. However I think you can also do a "<=" to solve the issue. I'm not sure if this problem arises for your code, but looking just at the last part of conslidate() it seems like it might (you are using it though, so maybe not =D). Just thought I would mention this.

  3. Nathan Fiedler says:

    @d3sphil: There doesn’t seem to be any problem with the GraphMaker implementation. The root list will be intact, the loop will terminate normally, and min will be correct, regardless of the presence of duplicate keys. To ensure that is the case, I added three more unit tests that have a mix of all duplicates, duplicates with one smaller key, and duplicates with one larger key. All the tests pass. Thanks.

  4. Travis Wheeler says:

    Thanks for the post, and for making available your well-written code library. I’m making use of it in a piece of software I’m writing, and unless I’m missing something pretty glaring, I’ve found an error that’s pretty easy to fix. As I read it, in the “link” function, if the node “y” happens to be the current “min” (one of the roots of the forest), then after making “y” a child of “x”, the heap will view the new siblings of “y” as the roots of the forest … which means a lot of lost roots/nodes.

    The solution is easy (unless I’m missing something, which is certainly possible). Just add the following line at the beginning of the “link” function:
    if (y==min) min = y.right;

    (I’d have just e-mailed this to you, but I can’t easily find your address).

    Also, for what it’s worth, I tracked this down with the aid of these two simple functions, which allow the contents of a heap to be printed/parsed. (sorry for the dopey formatting)

    Add this function to the heap:
    public void printForest (StringBuffer sb) {
    if (min != null)

    … and this to the Node:
    public void printSubtree (StringBuffer sb) {
    Node w = this;
    do {
    if (w.parent != null)
    sb.append(” (“
    +w.parent.key + “)”);

    if (w.child != null)
    w = w.right;
    } while (w != this);

  5. Nathan Fiedler says:

    @Travis: if you were to call link() from outside of the consolidate() method, that could certainly be the case. However, the consolidate() method is already checking for this case (where ‘start’ is a synonym for ‘min’), so it should be working just fine. Thanks.

  6. Jimmy Lin says:

    Thanks for posting this and sharing your code with the community. I was poking around the Web and saw that you had previously contributed an implementation to JGraphT. Wondering which implementation was more efficient, I ran a simple benchmark: inserting 1m randomly-generated nodes and extracting 500k of them. All experiments done on Windows XP, Core 2 Duo at 2.60 GHz; JVM opts -Xmx1024m. Averaged over 10 trials:

    For the JGraphT implementation, 594ms for insertion, 2701ms for extraction.

    For the GraphMaker implementation, 925ms for insertion, 2298ms for extraction.

    Variance from run to run was very low, so stdev not worth reporting.

    Given that extraction is much more expensive, this is a worthwhile tradeoff. Care to comment?

    BTW, I noticed that you were at the Hadoop Summit… we might have bumped into each other there…

  7. Nathan Fiedler says:

    @Jimmy: Agreed, there’s an advantage there when you consider the cost of both operations. In general the cost of fibonacci heap is more in the memory overhead than anything. Each node has five references, a boolean, an int, and a double which adds up to about 54 bytes. But, that’s the trade off for speed. I’ll use the profiler in NetBeans to see if there are any obvious hot spots that I can address, and see if the performance can be improved. Thanks.

  8. Travis Wheeler says:

    I see now that you’re doing that test in consolidate … but ‘start’ is a synonym for ‘min’ only internal to the consolidate function. In that function, you change start to start.right, but that’s a local variable, and doesn’t change min to min.right.

    If you’d like an example to reproduce the problem of lost nodes that I described above, feel free to e-mail me (just google my name … that’s me)

  9. Nathan Fiedler says:

    Follow-up from conversation with Travis: there was indeed a mistake in the code in which it was possible for a set of nodes to be silently lost. I’ve added his test case to the unit tests and added a single-line change to the consolidate method: min = start; just before the for loop to find the new min value. Thank you Travis!

  10. Travis Wheeler says:

    you bet :)

  11. Nathan Fiedler says:

    Regarding profiling the fibonacci heap implementation, I’ve found that the link() method was very “slow” in that it was inadvertently calling compiler-generated accessor methods many times. These were introduced because the Node inner class has private fields, but all of the field access is done in the outer class. As a result, the compiler generates accessor methods, which introduces overhead, and when called millions of times, severely impacts performance. I’ve moved the link() method to the Node class and now consolidate() is seven times faster. See revision 20 in the GraphMaker repository.

  12. karussell says:

    Thanks for sharing! I’ve added this in my answer at stackoverflow:

  13. karussell says:

    Ah, and regarding the license: is it GPL or CDDL? And what would CDDL mean for commercial software – could I use it there?

  14. karussell says:

    To answer my own questions: CDDL allows you to use the software in commercial products but you will need to open source the code under CDDL (of course not the commercial software itself! difference to GPL!)

Comments are closed.