Amir Mullagaliev

SPO600: Project Stage 1 - Basic GCC Pass

Table of Contents

  1. Introduction
  2. Steps To Create a Pass
  3. Results
  4. Conclusion

Introduction

This blog post is dedicated to SPO600's Project - Stage 1, where I work with GCC. If you haven't read the previous post, where I described how to build the GCC compiler on AArch64 and x86_64 servers and compared the results, I highly recommend doing so! It will help you understand what I do in this post, and why and how I do it.

Stage 1 helps students prepare their environment and GCC for the second stage, where the major "heavy lifting" will happen. During this stage, I am creating a basic GCC pass for the current development version of the GCC compiler which:

  • Iterates through the code being compiled.
  • Prints the name of every function being compiled.
  • Prints a count of the number of basic blocks in each function.
  • Prints a count of the number of gimple statements in each function.

Modern GCC has little documentation on how to create a pass, so I relied on the video lectures and documentation created by our professor.

Steps To Create a Pass

Honestly, these steps require patience and attention rather than extraordinary knowledge. By following them and the provided resources, anyone can reproduce what was done in this stage. That makes sense, since we were told that this stage is about preparing the environment.

What is a GCC Pass?

"A GCC pass is a modular component within the GNU Compiler Collection that performs a specific transformation or analysis task during the compilation process. Each pass operates on an intermediate representation of the code (such as GIMPLE or RTL), executing in a predetermined order within the compilation pipeline to transform source code into machine code. Passes can analyze code, optimize it, clean up after other passes, or implement target-specific transformations, with the pass manager coordinating their execution. Compiler developers can create custom passes to extend GCC's functionality, as in your project where you're implementing a pass to count basic blocks and GIMPLE statements within functions." - Claude AI (Sonnet 3.7)

Back to square one!

Step 1 - Write a Pass

Once I had built GCC, I had to look over the existing GCC passes to understand what a pass looks like and pick one of them as a template. To do so, I went to the GCC source tree, where those passes live.

Since I cloned GCC from the git repository, my source code is located at ~/git/gcc:

  • Move to the gcc sub-directory with cd gcc. You will end up in ~/git/gcc/gcc, where the actual compiler implementation is located.

  • Look for files matching tree-*.cc (tree-*.c in older GCC releases) for passes that work on the tree/GIMPLE representation:

ll tree*.cc

You will see a list like this of the passes implemented by GCC developers:

[Image: listing of the tree*.cc files]

Pick one of these templates as the starting point.

I used the one from the professor's first example, found at gcc/gcc/tree-nrv.cc.

Throughout this stage, I reproduced the professor's code to make sure I was on the right track.

The test pass can be found in my git repository; simply copy it if you want to keep things simple and follow along with this tutorial: click here to see the source code

This test pass simply writes the names of all compiled functions to the dump file.

My final source code for the pass can be found here: final pass' source code

This implementation prints the name of each function and counts its basic blocks and GIMPLE statements, then prints the totals after each function. NOTE: The outputs are presented in the Results section.
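
The counting idea can be illustrated with a toy model. This is only a sketch under my own assumptions, not the pass from the repository, and it does not use GCC's internal API: ToyBlock, ToyFunction, Counts, count_function and report are hypothetical names standing in for GCC's basic blocks and GIMPLE statements. In a real pass, the same two nested walks would go over GCC's own data structures (for example with the FOR_EACH_BB_FN macro and a gimple_stmt_iterator).

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-ins for GCC's internal structures (hypothetical, not GCC types).
struct ToyBlock {
    std::vector<std::string> statements;  // one entry per GIMPLE-like statement
};

struct ToyFunction {
    std::string name;
    std::vector<ToyBlock> blocks;
};

struct Counts {
    std::size_t blocks = 0;
    std::size_t statements = 0;
};

// Visit every block of the function and tally blocks and statements.
Counts count_function(const ToyFunction &fn) {
    Counts c;
    for (const ToyBlock &bb : fn.blocks) {
        ++c.blocks;
        c.statements += bb.statements.size();
    }
    return c;
}

// Build a per-function summary similar in spirit to the totals in the dump.
std::string report(const ToyFunction &fn) {
    const Counts c = count_function(fn);
    std::ostringstream out;
    out << "Function: " << fn.name << "\n"
        << "Total Basic Blocks: " << c.blocks << "\n"
        << "Total Gimple Statements: " << c.statements << "\n";
    return out.str();
}
```

Feeding it a ToyFunction with two blocks holding one and two statements gives 2 blocks and 3 statements, the same shape as the totals reported for foo in the Results section.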

Step 2 - Registering the Pass

It is very important to register the pass in passes.def in order for GCC to recognize the custom pass. Otherwise, it won't work.

This file is located at ~/git/gcc/gcc/passes.def. It lists the many passes that run during compilation, and their order matters. I decided to do the same as the professor and put my pass right after:

NEXT_PASS (pass_nrv);

My modified file looks like this:

...
NEXT_PASS (pass_nrv);
NEXT_PASS (tree_amullagaliev);
...
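
Why placement matters can be seen with a toy model of an ordered pass list. This is only an illustration under my own assumptions: PassList, insert_after and run_position are hypothetical names, and GCC itself processes passes.def when the compiler is built rather than keeping a runtime vector like this one.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an ordered list of pass names.
using PassList = std::vector<std::string>;

// Insert new_pass directly after anchor, mirroring where the NEXT_PASS line
// was added. Returns false (and changes nothing) if anchor is not in the list.
bool insert_after(PassList &passes, const std::string &anchor,
                  const std::string &new_pass) {
    auto it = std::find(passes.begin(), passes.end(), anchor);
    if (it == passes.end()) {
        return false;
    }
    passes.insert(it + 1, new_pass);
    return true;
}

// Position of a pass in the run order, or -1 if it is not registered.
int run_position(const PassList &passes, const std::string &name) {
    auto it = std::find(passes.begin(), passes.end(), name);
    return it == passes.end() ? -1 : static_cast<int>(it - passes.begin());
}
```

Starting from a list of three placeholder passes and inserting tree_amullagaliev after the middle one puts it at position 2, so it sees the code exactly as the preceding pass left it. A pass that is never inserted gets position -1, which matches the point above: an unregistered pass simply never runs.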

Step 3 - Add Object File

I added the object file for my pass to the OBJS section of Makefile.in. My source file is tree-amullagaliev.cc, so I added tree-amullagaliev.o to the OBJS list.

It should look something like this:

# ... existing content before the modification ...
tree-dfa.o \
tree-amullagaliev.o \
tree-diagnostic.o \
# ... existing content continues ...

Step 4 - Modify Header File

To make my pass recognizable in the previously modified passes.def, I had to declare it in the tree-pass.h header file, which can be found at ~/git/gcc/gcc/tree-pass.h.

Add this declaration in order to allow GCC to recognize the function:

extern gimple_opt_pass *make_tree_amullagaliev (gcc::context *ctxt);

Step 5 - Re-create the Makefile

As one of the final steps, I had to re-create the Makefile inside my build tree, which is ~/gcc-build-001. This is needed for the changes in Makefile.in to take effect: the build system would not automatically pick up a change made only to Makefile.in.

The easiest way is to delete the Makefile in the gcc sub-directory of the build tree, which is ~/gcc-build-001/gcc/Makefile. NOTE: This method avoids rebuilding everything!

IMPORTANT! Don't get me wrong: deleting the Makefile inside ~/gcc-build-001/gcc is required only when we change a single file, ~/git/gcc/gcc/Makefile.in. Future modifications of the pass itself don't require this step!

Here's how it looks in bash:

cd ~/gcc-build-001/gcc
rm Makefile
cd ..
time make -j$(nproc) |& tee build-xxx.log

Results

Once I had rebuilt GCC with the brand new pass, I was ready to test it!

First, I had to write some test code, which can be found here. I am leaving it below so you can reuse it; it closely resembles the professor's code:

#include <stdio.h>

int foo(int p1, int p2) {
    return p1*p2;
}

int main() {
    int a = 12;
    int b = 13;

    int c;

    c = foo(a, b);

    printf("%d\n", c);

    return 0;
}

The next thing that I had to do was to create a Makefile inside the test directory, which was also uploaded to the git repository:

BINARIES=hello
CCFLAGS=-g -O0 -fno-builtin -fdump-tree-amullagaliev

all: ${BINARIES}

hello: hello.c
    gcc ${CCFLAGS} -o hello hello.c

clean:
    rm ${BINARIES} *.o || true

Notice how the flags request a dump file for the pass I have just implemented: -fdump-tree-amullagaliev

Dump File Outputs

After completing all the steps above, I compiled the sample code using the make command:

make hello

Two files will appear: hello and hello.c.265t.amullagaliev.

Here are the dump files produced by the two passes, the test pass and the final pass:

Test-Pass Output:

;; Function foo (foo, funcdef_no=0, decl_uid=3929, cgraph_uid=1, symbol_order=0)

=== FUnction 1 Name 'printf' ===
=== FUnction 2 Name 'main' ===
=== FUnction 3 Name 'foo' ===


#### End amullagaliev diagnostics, start regular dump of current gimple ####


int foo (int p1, int p2)
{
  int D.3937;
  int _3;

  <bb 2> :
  _3 = p1_1(D) * p2_2(D);

  <bb 3> :
<L0>:
  return _3;

}



;; Function main (main, funcdef_no=1, decl_uid=3931, cgraph_uid=2, symbol_order=1)

=== FUnction 1 Name 'printf' ===
=== FUnction 2 Name 'main' ===
=== FUnction 3 Name 'foo' ===


#### End amullagaliev diagnostics, start regular dump of current gimple ####


int main ()
{
  int c;
  int b;
  int a;
  int D.3939;
  int _7;

  <bb 2> :
  a_1 = 12;
  b_2 = 13;
  c_5 = foo (a_1, b_2);
  printf ("%d\n", c_5);
  _7 = 0;

  <bb 3> :
<L0>:
  return _7;

}

As you can see, only function names were produced.

Final Pass Output

;; Function foo (foo, funcdef_no=0, decl_uid=2337, cgraph_uid=1, symbol_order=0)

===== Basic block count: 1 =====
----- Statement count: 1 -----
_3 = p1_1(D) * p2_2(D);
===== Basic block count: 2 =====
----- Statement count: 2 -----
<L0>:
----- Statement count: 3 -----
# VUSE <.MEM_4(D)>
return _3;
------------------------------------
Total Basic Blocks: 2
Total Gimple Statements: 3
------------------------------------

int foo (int p1, int p2)
{
  int D.2345;
  int _3;

  <bb 2> :
  _3 = p1_1(D) * p2_2(D);

  <bb 3> :
<L0>:
  return _3;

}



;; Function main (main, funcdef_no=1, decl_uid=2339, cgraph_uid=2, symbol_order=1)

===== Basic block count: 1 =====
----- Statement count: 1 -----
a_1 = 12;
----- Statement count: 2 -----
b_2 = 13;
----- Statement count: 3 -----
# .MEM_4 = VDEF <.MEM_3(D)>
c_5 = foo (a_1, b_2);
----- Statement count: 4 -----
# .MEM_6 = VDEF <.MEM_4>
printf ("%d\n", c_5);
----- Statement count: 5 -----
_7 = 0;
===== Basic block count: 2 =====
----- Statement count: 6 -----
<L0>:
----- Statement count: 7 -----
# VUSE <.MEM_6>
return _7;
------------------------------------
Total Basic Blocks: 2
Total Gimple Statements: 7
------------------------------------

int main ()
{
  int c;
  int b;
  int a;
  int D.2347;
  int _7;

  <bb 2> :
  a_1 = 12;
  b_2 = 13;
  c_5 = foo (a_1, b_2);
  printf ("%d\n", c_5);
  _7 = 0;

  <bb 3> :
<L0>:
  return _7;

}

This output meets all the requirements set by the professor, which are listed in the introduction.

These two dump files can be found on GitHub as well: click here.

Code Limitations

I could fill another whole blog post with the limitations of this code; however, this pass only serves as a counter and function name printer :)

As for its capabilities, it does everything listed in the introduction.

Conclusion

I hope someone found this post helpful, and was able to reproduce everything written here!

Honestly, I faced some minor challenges. The first was attention: I had to keep track of many files and make sure that I had added the pass registration to the right GCC source files. The second was the time it took to rebuild after re-creating the Makefile. For some reason, it took 15 minutes on x86 and just 2 minutes on AArch64. I think this happened because the servers were under high load from other students; everyone was trying to build. I wouldn't be surprised if someone was building GCC from scratch :D

In my opinion, this is one of the most interesting courses; I am doing something new and mind-blowing. I really appreciate the professor's efforts and explanations. Even though we don't have much documentation, he still manages to deliver the content by writing his own tutorials and explaining everything clearly in his videos.

See you in the next posts! I have a lot of things to do: Lab03, Lab05 and the rest of the project stages!
