Goanna Studio 2.1
Goanna Studio 2.0 has been a great hit. We have received a lot of positive feedback, and we have acted on very nearly all suggestions and bug reports to produce a shiny new update, Goanna Studio 2.1. A great many false positives have been eliminated, accuracy has improved, some bugs have been fixed, and there is even a performance improvement or two.
The fixed false positives and changes to checks include:
- ATH-sizeof-by-sizeof - a false positive involving array sizeofs has been fixed.
- FPT-cmp-null - a false negative has been fixed when a function pointer is used directly in a condition instead of with a comparison operator.
- RED-unused-var-all now considers sizeof(x) to be a use of x.
- MEM-stack-global, MEM-stack-param and MEM-stack-param-ref now take into account re-assignments for globals or parameters
- SPC-uninit-struct now considers a struct with a field which is an array to be accessible without warning.
- ATH-cmp-unsigned-pos and ATH-cmp-unsigned-neg take into account comparisons with (unsigned)-1
- MEM-free-some doesn’t warn when you check the result of a malloc and exit (or return) if it is invalid
- RED-dead no longer warns about goto’s and breaks
- PTR-null-const-pos doesn’t warn about string literals
- EXP-null-stmt no longer warns about a non-empty else block or when there is an assignment or function call in the condition
- SEM-nonconst-call has been renamed to SEM-const-call
- SEM-global-write has been renamed to SEM-pure-global
- SEM-impure-call has been renamed to SEM-pure-call
- ATH-shift-bounds now warns when you shift by 64
- COP-assign-op-ret now warns about assignment operators that return a non-const reference to this
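To make a few of these entries concrete, here is a small sketch of the kind of code they concern. The function names are made up for illustration, and the comments restate the behaviour described in the list above rather than verified tool output.

```cpp
#include <cstddef>
#include <cstdlib>

// RED-unused-var-all: 'buffer' is only mentioned inside sizeof, which the
// list above says now counts as a use.
std::size_t bufferBytes() {
    int buffer[16];
    return sizeof(buffer);
}

// ATH-cmp-unsigned-pos / ATH-cmp-unsigned-neg: (unsigned)-1 is a common
// "no value" sentinel, so comparing against it is a meaningful test.
bool isSentinel(unsigned value) {
    return value == (unsigned)-1;
}

// MEM-free-some: the allocation result is checked and the function returns
// early when it is invalid, so free() is only reached with valid memory.
int sumOfZeroes(std::size_t count) {
    int* data = static_cast<int*>(std::calloc(count, sizeof(int)));
    if (data == NULL) {
        return -1;
    }
    int total = 0;
    for (std::size_t i = 0; i != count; ++i) {
        total += data[i];
    }
    std::free(data);
    return total;
}
```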
Goanna Studio for Visual Studio also has some improvements and bug fixes:
- Support for additional options and CL environment variable support.
- Fixed bug involving a non Win32 or Win64 build target and the verbose flag
- Support for environment variables with non-alphabetic characters
- ProjectDir and SolutionDir macros now expand with a trailing backslash
We’ve also worked hard to squeeze out even more performance where we can. Overall, the Goanna analysis engine now scales better with the number of checks in use, and interprocedural analysis is now slightly faster.
Other minor improvements and bug fixes are:
- Better handling of types that have been typedef’ed
- Better handling of implicit this parameter in member functions
- Better handling of simple conditions containing only variables
- Better identification of compound declarations
- Better handling of throw and catch statements
- Support for -std=gnu++0x and -std=c++0x command line arguments
- Improved support for c++0x standard (still incomplete)
- With Goanna Central Linux - ability to use specific rc files
- Goannacc now correctly identifies versions of GCC that are built for different targets
As usual, all current users get the upgrade for free. If you were a trial user in the past and need a trial extension visit: http://redlizards.com/trial-extension
Copy control crash course
As part of our efforts to expand the scope of Goanna’s C++ checks, we decided to look into copy control, since this backbone of class architecture can also cause plenty of problems. The most common bugs relating to copy control are memory leaks, which are hard to identify and track down, as they will generally not cause the program to crash. Therefore, they are a priority for us to find.
In addition to finding and warning about the most common flaws in copy control functions, we decided to take the opportunity to cover some of the rarer problems, too. Our ultimate aim was to give some kind of useful warning in any case of potential misuse of a class, since copy control is something that a lot of people can have trouble grasping.
While many of our copy control checks warn for convention violations rather than definite bugs, they all combine to ensure that classes follow widely-accepted best practices, to improve the overall readability and robustness of code. As a case study, I’ll demonstrate the construction of a simple class, showing the bugs and suggestions that Goanna points out along the way.
We’ll start with a class with an int pointer. Of course, if we want the pointer to point somewhere, we need to allocate some memory for it, so we’ll do that in a constructor.
1 class MyClass {
2 public:
3 MyClass(); //constructor
4 private:
5 int* xp;
6 };
7
8 //default constructor
9 MyClass::MyClass(){
10 xp = new int[10];
11 }
We all know what the problem with this is:
cop.cc:9: warning: Goanna[COP-dtor] Missing destructor for class `MyClass' whose function `MyClass::MyClass()' allocates memory
cop.cc:9: warning: Goanna[COP-member-uninit] Not all members initialized in this constructor
If there is no explicit destructor, the compiler will only call the synthesized destructor, which is a problem for our allocated memory. The synthesized destructor will release the pointer xp, but not the memory allocated to it - this is our responsibility.
The second warning is there because even though we have allocated memory to ‘xp’, we have failed to initialize the values in the array, which should really be done in the constructor.
Because we’re lazy, let’s define an empty destructor to make the warnings go away (because that’s what half of programming is all about). Here’s the updated class:
1 class MyClass {
2 public:
3 MyClass(); //default constructor
4 ~MyClass(){} //destructor
5 private:
6 int* xp;
7 };
8
9 //default constructor
10 MyClass::MyClass(){
11 xp = new int[10];
12 for (int i=0; i!=10; ++i){
13 xp[i] = 0;
14 }
15 }
Alas, Goanna still isn’t happy:
cop.cc:4: warning: Goanna[COP-dealloc-dtor] Class field `xp' has memory allocated in a constructor that is not freed in the destructor
Let’s release the memory in the destructor.
17 //destructor
18 MyClass::~MyClass(){
19 delete[] xp;
20 }
Now our class is free from bugs. So when we run Goanna, we should get the all clear. Right?
cop.cc:18: warning: Goanna[COP-assign-op] Missing assignment operator for class `MyClass' which uses dynamic memory allocation
cop.cc:18: warning: Goanna[COP-copy-ctor] Missing copy constructor for class `MyClass' which uses dynamic memory allocation
Wrong! It seems we’re just digging ourselves deeper. The destructor is not the only thing the compiler will synthesize. In fact, for any class, it will synthesize up to 3 functions if they are not explicitly defined:
- Destructor - there will always be one synthesized to release stack memory
- Copy constructor - a constructor taking in a reference of the class type
- Assignment operator - operator=() defined for the class
Even though we have defined a default constructor (with no parameters), the compiler will still synthesize a copy constructor if we have not provided one. The copy constructor is used when instances of a class are copied, for example, passed as parameters or used in containers. Surprisingly, it is also invoked when an instance of the class is initialized at its declaration. For example:
MyClass a; //default constructor
MyClass b(a); //copy constructor
MyClass c = a; //copy constructor
MyClass d; d=a; //default constructor; assignment operator
It can sometimes be confusing which operators or functions are being invoked, and for such reasons it is important to ensure that all functions are explicitly defined when appropriate. Basically, if a user-defined destructor is required, then an assignment operator and copy constructor will also be required. This is sometimes called the ‘Rule of 3’, suggesting that if one of these three is required, all probably are.
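One way to see which function is invoked is to record every call. The sketch below uses a hypothetical class, Tracer, that is not part of the case study; it appends a tag to a shared log from each special member function and then replays the four declarations from the example above.

```cpp
#include <string>

// Hypothetical helper class: records which special member function ran.
class Tracer {
public:
    Tracer() { log() += "default;"; }
    Tracer(const Tracer&) { log() += "copy;"; }
    Tracer& operator=(const Tracer&) { log() += "assign;"; return *this; }
    ~Tracer() {}

    // Shared log of every call made so far.
    static std::string& log() {
        static std::string calls;
        return calls;
    }
};

// Replays the declarations from the example and returns the recorded calls.
std::string traceExample() {
    Tracer::log().clear();
    Tracer a;        // default constructor
    Tracer b(a);     // copy constructor
    Tracer c = a;    // copy constructor, despite the '='
    Tracer d;        // default constructor
    d = a;           // assignment operator
    return Tracer::log();
}
```

Calling traceExample() yields "default;copy;copy;default;assign;", which matches the comments in the example: initialization with '=' at the point of declaration is a copy construction, not an assignment.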
So let’s add our copy constructor and assignment operator to the class, and to avoid any additional warnings, we’d better make sure we allocate the memory required for xp. And while we’re at it, we should get rid of the magic number used for the array size:
1 class MyClass {
2 public:
3 MyClass(); //default constructor
4 MyClass(const MyClass& other); //copy constructor
5 void operator=(const MyClass& other); //assignment operator
6 ~MyClass(); //destructor
7 private:
8 static const int ARR_INIT_SIZE = 10;
9 int* xp;
10 };
...
18 //copy constructor
19 MyClass::MyClass(const MyClass& other){
20 xp = new int[ARR_INIT_SIZE];
21 for (int i=0; i!=ARR_INIT_SIZE; ++i){
22 xp[i] = other.xp[i];
23 }
24 }
25
26 //assignment operator
27 void MyClass::operator=(const MyClass& other){
28 delete[] xp; //reallocate the memory in case the size has changed
29 xp = new int[ARR_INIT_SIZE];
30 for (int i=0; i!=ARR_INIT_SIZE; ++i){
31 xp[i] = other.xp[i];
32 }
33 }
The class is starting to get bulky, but at least we can be sure we’re helping to make it safe for any creating, copying and destroying that we might be doing. However, Goanna isn’t quite happy with the class yet:
cop.cc:27: warning: Goanna[COP-assign-op-ret] Assignment operator `MyClass::operator=' does not return a non-const reference to `this'
cop.cc:29: warning: Goanna[COP-assign-op-self] Assignment operator `MyClass::operator=' does not check for self-assignment before allocating memory to a class member
The first warning is there because it is the convention that an assignment operator will return a reference to the target of the assignment. Such conventions exist so that all objects can be treated like primitive types, and to give the programmer more freedom to write intuitive code. For example:
MyClass a,b,c;
(a = b) = c;
(a = c).f();
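As a self-contained illustration of why the return type matters, the sketch below uses a hypothetical class, Counter, with no dynamic memory. Because its assignment operator returns a non-const reference to the assigned object, both forms of chaining compile and behave as they would for an int.

```cpp
// Hypothetical value class used only to illustrate chaining.
class Counter {
public:
    explicit Counter(int v = 0) : value(v) {}

    // Returning a non-const reference to *this lets assignments be chained.
    Counter& operator=(const Counter& other) {
        value = other.value;
        return *this;
    }

    int get() const { return value; }

private:
    int value;
};

int chainedResult() {
    Counter a(1), b(2), c(3);
    (a = b) = c;            // a first receives b's value, then c's
    return (a = c).get();   // call a member on the result of the assignment
}
```

With a void return type, as in the operator above that triggers the warning, neither of the two chained statements would compile.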
The problem causing the second warning is that self-assignment is generally perfectly legal code. For example:
MyClass c;
c = c;
However, calling a class’s assignment operator on itself will cause problems if dynamic memory allocation takes place. In our assignment operator, we free the memory allocated to ‘xp’ before allocating a fresh store. If the class instance calls the assignment operator on itself, then the memory that is being copied from in the 4th line of the operator will also be fresh. This means that self-assignment will basically lead to the object being populated with uninitialized data, which is almost certainly not the intention of the programmer. To handle this, we simply need to ensure memory manipulation only takes place if ‘this’ and the parameter refer to different instances of the class.
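The effect can be made visible with a small experiment. The class below, Unchecked, is a hypothetical stand-in for MyClass with the same unguarded assignment operator. One deliberate difference: the replacement array is value-initialized with (), so that the lost data shows up as zeroes instead of indeterminate values that would be unsafe to read.

```cpp
// Hypothetical class reproducing the unguarded assignment operator.
class Unchecked {
public:
    Unchecked() : xp(new int[SIZE]()) {}
    Unchecked(const Unchecked& other) : xp(new int[SIZE]) {
        for (int i = 0; i != SIZE; ++i) xp[i] = other.xp[i];
    }

    // No self-assignment check: the old array is freed before it is read.
    Unchecked& operator=(const Unchecked& other) {
        delete[] xp;
        xp = new int[SIZE]();   // on self-assignment, other.xp points here too
        for (int i = 0; i != SIZE; ++i) xp[i] = other.xp[i];
        return *this;
    }

    ~Unchecked() { delete[] xp; }

    void set(int i, int v) { xp[i] = v; }
    int get(int i) const { return xp[i]; }

private:
    static const int SIZE = 10;
    int* xp;
};

int valueAfterSelfAssignment() {
    Unchecked c;
    c.set(0, 42);
    Unchecked& alias = c;
    c = alias;          // self-assignment through a reference
    return c.get(0);    // the stored 42 has been replaced by a fresh 0
}
```

The stored value does not survive the assignment, even though assigning an object to itself should leave it unchanged.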
Our assignment operator must be altered to fix these two problems:
25 //assignment operator
26 MyClass& MyClass::operator=(const MyClass& other){
27 if (this != &other){ //check for self-assignment
28 delete[] xp; //reallocate the memory in case the size has changed
29 xp = new int[ARR_INIT_SIZE];
30 for (int i=0; i!=ARR_INIT_SIZE; ++i){
31 xp[i] = other.xp[i];
32 }
33 }
34 return *this; //return a reference to 'this'
35 }
After these changes, Goanna will not give any more warnings, and the class will be very robust. Hopefully this has provided you with some insight into some of our copy control checks, and given you a quick revision of some of the important things to keep in mind when creating classes. We’re still trying to expand our range of C++ checks further, to include checks on the proper use of iterators, containers and exception handling, and many other constructs.
Here is our completed class:
1 class MyClass {
2 public:
3 MyClass(); //constructor
4 MyClass(const MyClass& other); //copy constructor
5 MyClass& operator=(const MyClass& other); //assignment operator
6 ~MyClass(); //destructor
7 private:
8 static const int ARR_INIT_SIZE = 10;
9 int* xp;
10 };
11
12 //default constructor
13 MyClass::MyClass(){
14 xp = new int[ARR_INIT_SIZE];
15 for (int i=0; i!=ARR_INIT_SIZE; ++i){
16 xp[i] = 0;
17 }
18 }
19
20 //copy constructor
21 MyClass::MyClass(const MyClass& other){
22 xp = new int[ARR_INIT_SIZE];
23 for (int i=0; i!=ARR_INIT_SIZE; ++i){
24 xp[i] = other.xp[i];
25 }
26 }
27
28 //assignment operator
29 MyClass& MyClass::operator=(const MyClass& other){
30 if (this != &other){ //check for self-assignment
31 delete[] xp; //reallocate the memory in case the size has changed
32 xp = new int[ARR_INIT_SIZE];
33 for (int i=0; i!=ARR_INIT_SIZE; ++i){
34 xp[i] = other.xp[i];
35 }
36 }
37 return *this; //return a reference to 'this'
38 }
39
40 //destructor
41 MyClass::~MyClass(){
42 delete[] xp;
43 }
Experiments with F#
A couple of customers have asked for a command-line tool to run Goanna over their Visual Studio projects, similar to the way the Linux command-line tool works. The difficult bit for such a tool is to translate the information in a project file to the appropriate arguments to the core Goanna executable, goannacc.exe on Windows. We already have code to do just that in the Visual Studio extension.
The Goanna VS extensions for VS2005/2008/2010 are written in C#, because there’s a wizard that generates a simple extension in C#. From that starting point, we (meaning I) built the current extensions. If there hadn’t been the wizard, I would have written the extensions in F#, because I prefer the functional style of programming. So when I decided to write the command-line tool, it was an opportunity to try out F# in earnest. I’d written a wee bit of F# before; here was a chance to try it out on production code.
When writing the command-line tool, my main concern was, how easy would it be to pull in the C# code that does the project-to-command-line translation. It was very easy: just add an F# project reference to the .DLL containing the code, open the namespace, and I was good to go. I had to make some C# classes explicitly public for visibility, but that was the only change I needed to make.
Programming in F# is very much like programming in OCaml, a language I’ve used off and on for maybe 15 years. Nice thing: instead of the clumsy “delegate” syntax of C#, you can just pass a function argument to another function. Not so nice: in VS2010, the editor does not seem to auto-format F# code, the way it does with C# code (the Ctrl-E F magic). And the editor’s Intellisense feature does not appear to suggest variable names that are in scope. Also: although there are surely good reasons for it, the F# list type is distinct from the C#/.Net System.Collections list type, which is slightly maddening. Finally: I have to build the tool for the various VS versions in slightly different ways, and the conditional compilation facility works for that — but why are there no boolean operations allowed, as you have in C#?
Here’s an example of code that uses .Net lists instead of F# lists:
let expandedProjs = new System.Collections.Generic.List() in
while projsIter.MoveNext() do
    expandedProjs.AddRange(ProjectUtil.expandProject(projsIter.Current :?> EnvDTE.Project))
done;
Ooof.
When the command-line tool starts, it fires up an instance of Visual Studio, no GUI. That way it can get information about solutions and projects from VS, like default include paths and configuration information, using code originally written for the extensions. Sometimes the calls to VS fail with COM retry errors, so those calls are done in a loop containing a try-with block. When the tool finishes, or the user hits Ctrl-C, it gracefully shuts down VS. I often run the tool from a Cygwin shell, and I haven’t yet found a way to trap Cygwin SIGKILL signals, so in that case VS is still running afterwards.
There’s still some work to do on the command-line tool, like deciding what kind of output it should produce, but it’s basically there. Let me know if you’d like to try it out before we make it generally available. The tool is tentatively called “GoRun”, and its syntax is:
GoRun sln-file [projName ...] ...
That is, you supply one or more solution files, and for each solution file, zero or more project names. If you don’t supply project names, GoRun invokes Goanna on all the projects in the solution, otherwise only those specified.
A complaint: Soon after VS2010 was released, the MSDN site was updated with all-new documentation for the .Net libraries. But the only language there’s documentation for is C# (OK, sometimes J#). In the type signatures, there are no hyperlinks for keywords (like public, final, etc.) and types. You can’t tell when a type is really a forall-quantified type variable. It wouldn’t be much harder to do these pages right.
One last comment: programming in F# doesn’t feel all that different than programming in C#, though the code is more concise. You definitely feel the presence of .Net every step of the way, and there’s statefulness lurking everywhere. Functional programming for the masses … sort of!
When is a for loop like a do .. while loop?
At Red Lizard Software, we care about providing the most accurate static analysis for your CPU cycle. Therefore, we spend a lot of our time thinking about the nature of false positives (when Goanna gives a warning about completely reasonable code) and how to avoid them.
One class of false positives we have noticed recently happens when you want to warn about an action that must occur on all execution paths. These properties might be expressed as “you must initialise all variables on all paths before accessing their values” for some definitions of initialise and access. A problem with these kinds of requirements appears when a variable is initialised within a looping construct and then accessed after the loop. The loop is usually designed to execute at least once (thus initialising the variable at least once), so the programmer knows that the access after the loop is perfectly valid. Goanna has historically not been very good at recognising this situation and will often warn anyway, because there is an execution path that might not initialise the variable: the path where the loop condition evaluates to false on the first test. This is probably a case where the programmer should have used a do .. while loop to convey the desired semantics of the loop, but given that do .. while loops are not as popular as for loops, Goanna needs to be able to deal with this scenario.
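A minimal sketch of the pattern, with made-up function names: the first version is the shape that tends to draw the warning, and the second expresses the same computation as a do .. while loop.

```cpp
// A for loop that always runs at least once, so 'last' is always set. An
// analysis that ignores the loop bounds still sees a path that skips the body.
int lastSquareFor() {
    int last;
    for (int i = 0; i != 3; ++i) {
        last = i * i;
    }
    return last;
}

// The same computation as a do .. while loop, whose shape states directly
// that the body executes before the condition is first tested.
int lastSquareDoWhile() {
    int last;
    int i = 0;
    do {
        last = i * i;
        ++i;
    } while (i != 3);
    return last;
}
```

Both functions return the same value, but only in the second is the initialisation of last unconditional as far as the loop structure alone is concerned.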
There are two steps to making Goanna more intelligent about loops. The first step is identifying when a for or while loop should be represented as a do .. while, and the second is presenting this information to Goanna’s internal analysis engine.
In order to determine that a loop will execute at least once, it may be simpler to ask the inverse question: when will a loop not execute at least once? A sub-question of this is: when will we not know whether a loop can execute at least once? This is actually much easier to answer, because it can be boiled down to a structural condition. If the condition of the loop contains global variable references or function calls, then it is almost impossible to determine whether the loop will execute at least once. So what is left? Loops whose conditions contain only literals and local variable references. Parameters are a trickier issue, since each call to the function is potentially different. With additional interprocedural analysis it may be possible to determine the boundaries of function parameters accurately, but at present these loops can be ignored as well. The only thing left to do is to determine the state of the variables used in the loop condition right before it is evaluated, and then evaluate the condition.
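The structural condition can be sketched as a simple filter. The types and names below (OperandKind, firstTestIsDecidable) are hypothetical and only illustrate the rule in the paragraph above; they say nothing about how Goanna represents loop conditions internally.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification of what a loop condition refers to.
enum OperandKind { Literal, LocalVariable, Parameter, GlobalVariable, FunctionCall };

// Returns true when every operand is a literal or a local variable, the only
// case in which the first test of the condition is evaluated at all.
// Parameters, globals and calls make the loop one that is left alone.
bool firstTestIsDecidable(const std::vector<OperandKind>& operands) {
    for (std::size_t i = 0; i != operands.size(); ++i) {
        if (operands[i] != Literal && operands[i] != LocalVariable) {
            return false;
        }
    }
    return true;
}
```

For a condition such as i != 3 with i a local variable, the operands are a local and a literal, so the filter accepts it; replacing 3 with a function parameter makes it reject the loop.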
The analysis engine of Goanna works on what is known as a control flow graph. This graph is created by looking at the source tree and determining which operations happen in which order. So the best way to present this modification of a for loop is through modifications to the control flow graph. Specifically, we would like to create a copy of the control flow graph of the loop’s condition and wire up the rest of the graph such that there is a direct path through this copy to the body of the for loop. The graph must also enter this new condition instead of the old condition in order for the modification to be complete.
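At source level, the graph change corresponds roughly to the rewrite below, often called loop inversion. This is an analogy under that assumption, not a description of the graph Goanna actually builds, and the function names are made up for illustration.

```cpp
// Original shape: the condition is tested before every iteration.
int sumOriginal(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Source-level analogue of the graph change: a copy of the condition guards
// the entry, and the loop proper becomes a do .. while.
int sumRewritten(int n) {
    int total = 0;
    int i = 0;
    if (i < n) {            // copied condition, evaluated once on entry
        do {
            total += i;
            ++i;
        } while (i < n);    // original condition, reached only after the body
    }
    return total;
}
```

When the copied condition can be evaluated to true from the known state of its variables, the guard disappears and the path that skips the body no longer exists.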
After implementing this change we have noticed a small drop in the number of certain types of false positives, specifically for the SPC-uninit-var-some check, with no impact on the runtime performance of the Goanna analysis engine. We hope to roll this improvement into the next release of the Goanna static analysis product line.
Goanna Command Style
Most users will use Goanna integrated into their development environment, either Visual Studio or Eclipse. However, we also have a command line version called Goanna Central. Since I am mostly working on the analysis engine, this is the version I use most often. Part of this entails finding open source projects and running Goanna over them. So, if you have an open source project, we might be watching you :)
Most open source projects provide configure scripts to generate makefiles. If that is the case, using Goanna is a matter of configuring the project with Goanna. There are two executables, goannacc and goannac++, that behave like gcc and g++. Configuring then just means executing:
goanna@KITTYHAWK:~$ ./configure CC=goannacc CXX=goannac++
After this you can make your project as you are used to, with the difference that you will get feedback from Goanna.
Sometimes open source projects do not provide a configure script. Last week I got my hands on an open source model checker - it is always a guilty pleasure to model check a model checker - and this project only included a makefile. Once all the necessary libraries were installed - the ones provided were incompatible with my machine - so that the project could be built with g++, all that remained was to edit the makefile. It is always exciting to edit a file that says right at the top: Automatically-generated file. Do not edit!. Using Goanna required finding all occurrences of, in this case, g++ and replacing them with goannac++, and then making the project.
The output looks like this:
Building file: ../src/Ned.cpp
Invoking: GCC C++ Compiler
goannac++ -DDEBUG -I../include -O0 -g3 -Wall -c -fmessage-length=0 -Wextra -MMD -MP -MF"src/Ned.d"
Goanna - analyzing file ../src/Ned.cpp
Number of functions: 3
../src/Ned.cpp:28: warning: Goanna[COP-assign-op] Missing assignment operator for class `Ned' which uses dynamic memory allocation
../src/Ned.cpp:28: warning: Goanna[COP-copy-ctor] Missing copy constructor for class `Ned' which uses dynamic memory allocation
Total runtime : 6.65 seconds
That was about all it took. BTW: Kittyhawk is the name of my machine, and it is aptly named.
Goanna Studio 2.0
It is out! We just released a major upgrade to Goanna Studio version 2.0. There has been a lot of work going into the new version and some of the new key features include:
- Full (whole program) interprocedural analysis to track effects across functions and files
- Incremental analysis to minimize time for reanalyzing files/projects
- Around 100 classes of checks, up almost 70% compared to the previous release
- Much improved precision and elimination of some existing false positives
- Improved Path Simulator to display error traces
- New project reporting mechanism and export facilities
For existing customers:
- We are also happy to announce that all existing customers have the possibility to upgrade to 2.0 free of charge!
- If you were a trial user in the past and need a trial extension visit: http://redlizards.com/trial-extension
Overall, the new version is another leap forward and enables you to detect more and deeper critical issues early in the development cycle.