Week 3: The Tale of lsoda Driver Function

Hi readers,
Today, I’m feeling the calm after a storm—serene, but only after some serious turbulence. And yes, I have some great news to share.
As you know, this week I focused on implementing the  lsoda driver function. But what I didn’t expect was the wild debugging journey it would take me on. So...
Fasten your seatbelts—and lets dive right into it.

The Plan: Refactor and Implement 

I started the week by refactoring the lsoda() driver function block by block, and designing three supporting methods:
  • allocMemory()
  • lsodaFree()
  • lsodaPrepare()

For allocMemory(), after much thinking, I decided to use a constructor-like approach to initialize all necessary LSODACommon variables. This function is called via lsodaPrepare() at the beginning of any solve operation, with the required arguments.
For lsodaFree()—thanks to the JVM's garbage collector, I didn’t need to do anything explicit. Yay, Java!

Feeling optimistic, I wrote two integration tests: a linear ODE system and one directly from the libLSODA test suite.

I ran the tests... and bam. Infinite loop. No output. 😓

The First Bug: Return Does Not Mean Return

The got this as output in the infinite loop:
[lsoda] 500 steps taken before reaching tout.
This message comes from a softFailure() call in block e.—after which the algorithm is supposed to exit. But it wasn’t!

The issue? In the original C library, softFailure() and other return statements are macros, so a call like softFailure(); acts like a direct return. But in Java, we’ve implemented them as functions—so they don’t return control unless explicitly told to. So, fixed it by changing softFailure(); to return softFailure(); And did the same thing with other return method calls, hardFailure(), intdyReturn() and successReturn().

Bug #2: Where Did My Result Go?

While fixing that, I found another very subtle issue. The result array y[], which is passed as input to lsoda with initial values, wasn’t being updated. Why?

In the original algorithm, y is updated in-place after reducing its pointer by one unit using pointer based array manipulation to simulate the 1-based indexing in Fortran. But in our Java implementation, We had created a new double[n+1] array that handled indexing correctly—but we had forgot to reflect the changes back to the original y (however this is not possible because the size of y is less than the new array).

To fix this, I 

  • Created a private yOffset array in LSODAIntegrator
  • Used it to store the y array copy initially and simulate the 1-based indexing and perform all the operations afterwards
  • Retrieved results from it using its getter method, getY() in the tests

(We couldn’t just return the final y because lsoda() return type is integer indicating status—success, fail, etc.)

Still, the question remained: Why was softFailure() being hit in the first place?

Debugging Continues

This is where things got intense. The flow should not be going to softFailure() in these tests. So began days of debugging—using a combination of dry runs, program slicing, cause elimination and an a group of print statements.
(Yes, I’m still a believer in System.out.println() debugging. Don’t judge—it works. 😅)

Eventually, I traced the issue to the stoda() function (another 350 lines of code to find the diversion). And then after crawling through nearly every method in LSODAIntegrator, I found the root: it was in how the nordsieck array yh was being populated in stoda. We missed a plus sign there! 🤡 I fixed it, reran the tests, and the linear system test passed. Victory? Not yet.

Still Failing

The linear system test passed but the other test from libLSODA didn't passed yet. To isolate the bug, I added a new test with an exponential ODE system—and it failed too. So back I went, gathered all my print statement warriors, diving through ~2000 lines of code in the LSODA package again.

After much effort, I found the bug in cfode():

  • Wrong index (i instead of 1)
  • Incorrect variable usage (agamq swapped with ragq)

Fixed both, ran the test again... Exponential test passed!

Accuracy Problems: Round 4 begins

But something still felt off—the accuracy was poor. Results were correct only up to 3 decimal places, and in one case, just 1. It means, something is still off. So in a very calm and composed state, I started the search again for the fourth time. And after a couple of debugging sessions, I found the issue.
Again, back to stoda(). 🤡

Again, the yh nordsieck array was to blame. In the failure path of stoda, it’s supposed to restore the previous time step solution. But the loop restoring yh interchanged i and i1 index variables. I fixed that, ran all tests, and finally…

ALL TESTS PASSED. No errors. Accurate results. Yes, after days of grinding, using all my debugging knowledge and hours of beating my brain, I finally managed to get it done. Honestly, I did expect it to be tough as mentioned in my previous blog, but it went a step ahead. Still, very happy with the results! I added a few more tests afterwards, and they passed too.

Debugging Tips I learned (and You Might, Too)

  • Understand what’s supposed to happen before assuming what’s wrong.
  • Print statements work. (Not that I'm promoting them) Use them wisely, and surgically.
  • Track control flow carefully—especially in translated codes
  • When in doubt, write a minimal test case and reduce complexity.
  • And most important, be patient.

Meeting with Mentors

  • I had a meeting with my mentors on 19 June 2025, where Dr. Funahashi and Arthur reviewed my last PR, discussed a few technical queries and approved it.
  • Actually, the meeting was just 2 hours after I fixed all this. And I was really calm and relaxed at that time. I shared this progress with my mentors and they were happy too.
  • I haven't open a pull request for this yet. However, you can track all these in the lsoda branch of my SBSCL fork

What's Next

Now that the driver function is working, my next focus is:

  • Testing it even more rigorously

  • and work on the integration mechanism, for which I need to learn more about SBML, and closely read the SBMLInterpreter, AbstractDESolver, DESolver and previous solvers.

Till next time,
Bye! 👋

Comments

Popular posts from this blog

Week 7: Events, Exhaustion and Evaluations

My GSoC 2025 Journey Begins

Week 1: Coding Period Begins