Finding and fixing DebugInfo-related bugs

2018/06/29

The last few weeks, during which I haven't posted any updates, have been pretty laborious.

For one, I had finals at university, and all five of my exams fell in the same week. This was both good and bad. Good for GSoC because it didn't consume much of the month, but OTOH bad for my performance because I was taking one exam after another.

So far the results have come in for two classes and I passed them both, so that's good!

On the LLVM side of things, I was struggling with an InstCombine bug where debug intrinsics get lost. It took far more time than anticipated to figure out what was going on and how we should fix it. That time wasn't wasted though, since I gained valuable experience!

Let's talk about those now.

Finding bugs

Greg has created a meta bug to group debugify failures across all passes and have a centralized reference point.

I've already made a post describing a way to find such bugs. Greg's process seems to be a bit different though, since it starts from the source instead of relying on already created IR.

Zeroing in on the problem

First, one has to understand at least what the culprit pass does and how it's used to optimize code.

Then, in the relatively simpler case of a missing debug location (DL), it's a matter of running opt with the culprit pass under a debugger, stopping whenever the instruction with the missing DL is created, and reading through the backtrace to find when and where in the code the DL is dropped. Usually the fix is a pretty simple one, as in r335904 and D48769.

The case of a missing variable is more complex though. In the aforementioned InstCombine bug, for example, the fix is not clear at all, since the proper API is not in place yet. Vedant is tackling this: he first created a utility and is currently enhancing it.

Testing that it works

Another important thing after fixing those bugs is adding regression tests to make sure they won't appear again. To that end the -debugify pass is especially helpful. (NOTE: -debugify is the pass that applies Debug Info to every instruction in a module, while -check-debugify is the one that actually checks for DI preservation after a pass has run.)

Most of the above patches contain this new kind of test. Basically we try to add a RUN line like opt -debugify -culprit-pass to the existing tests, along with checks for the preservation of DI.

An important property of the -debugify pass is that it should be stable enough to be used in regression tests.
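
To try the RUN line idea by hand before touching a test file, a small wrapper like the one below can help. This is only a sketch: check_di is a made-up helper name rather than anything that ships with LLVM, it assumes opt is on the PATH, and it assumes the pass is selected with a flag of the form -passname, as in the RUN line above.

```shell
# Hypothetical helper (not part of LLVM): run a single pass sandwiched
# between -debugify and -check-debugify, so that any debug info the pass
# fails to preserve is reported by the checker.
# Usage: check_di PASS FILE.ll      e.g. check_di instcombine test.ll
check_di() {
    if [ "$#" -ne 2 ]; then
        echo "usage: check_di PASS FILE.ll" >&2
        return 2
    fi
    # -debugify attaches synthetic DI, the pass under test runs next,
    # and -check-debugify then verifies what survived.
    opt -debugify "-$1" -check-debugify -disable-output "$2"
}
```

The pass name is given without its leading dash, so the same helper can be pointed at whichever pass is under suspicion.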

Finding the test files to modify

Another cool trick Vedant taught me is using assertions to check whether a sample input IR passes through the code we just modified.

For example in r335904 and D48769, after I found where in the code the DL was dropped, using lldb and the sample code from Greg's report, I had to find an existing test that went through the same code path as Greg's sample code.

In the case of the LoopVectorize pass, there are a lot of tests to look through. The following assertion trick makes the process as simple as it could ever get.

First I add a failing assertion next to the code I modified:

NewAI->setDebugLoc(AI.getDebugLoc());
assert(false && "gotcha");

Then I modify the bash loop from here to just run the pass over all the tests.

for i in ~/code/llvm/test/Culprit/Pass/*.ll; do
    echo -e "$i:\n-------"
    opt -culprit-pass -disable-output "$i"
    echo -e "-------\n"
done > results 2>&1

Then, reading through the resulting file, you can clearly see which test files triggered the assertion. Each of those files contains a test that exercises the modified code and can be used to check DI preservation.

The assertion backtrace also shows which IR function triggered the assertion, so it's easy to find the function to add the DI preservation checks to, even if the file has many different tests.
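
Instead of scrolling through the results file by eye, the test files that hit the assertion can be listed with a few lines of awk. This is a rough sketch that assumes the results file has exactly the layout produced by the loop above (a line with the test path followed by a colon, then the opt output) and that the assertion message contains the marker string, "gotcha" in this case. The function name list_asserting_tests is made up for illustration.

```shell
# Hypothetical helper: print every test file whose section of the results
# file mentions the assertion marker.
# Usage: list_asserting_tests RESULTS_FILE [MARKER]   (MARKER defaults to "gotcha")
list_asserting_tests() {
    awk -v marker="${2:-gotcha}" '
        # A header line such as "/path/to/test.ll:" starts a new section;
        # remember the path without the trailing colon.
        /\.ll:$/ { current = substr($0, 1, length($0) - 1); next }
        # The first line in a section that contains the marker reports the
        # file; clearing "current" avoids printing it more than once.
        current != "" && index($0, marker) { print current; current = "" }
    ' "$1"
}
```

Running list_asserting_tests results prints one path per line, which gives a short list of candidate test files to extend.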