Introduction
This blog post is dedicated to SPO600's Project - Stage 1, where I work specifically with GCC. If you haven't read the previous post, where I described the steps to build the GCC compiler on AArch64 and x86_64 servers and compared the results, I highly recommend doing so! It will help you understand what I do in this post, why, and how.
Stage one helps students prepare their environment and GCC for the second stage, where the major "heavy lifting" will happen. During this stage, I am creating a basic GCC pass for the current development version of the GCC compiler, which:
- Iterates through the code being compiled.
- Prints the name of every function being compiled.
- Prints a count of the number of basic blocks in each function.
- Prints a count of the number of gimple statements in each function.
Modern GCC has little documentation on how to create a pass, so I relied on the video lectures and documentation created and provided by our professor.
Steps To Create a Pass
Honestly, these steps require patience and attention rather than extraordinary knowledge. By following them and the provided resources, anyone can reproduce what was done in this stage. That makes sense, since we were told that this stage is about preparing the environment.
What is a GCC Pass?
"A GCC pass is a modular component within the GNU Compiler Collection that performs a specific transformation or analysis task during the compilation process. Each pass operates on an intermediate representation of the code (such as GIMPLE or RTL), executing in a predetermined order within the compilation pipeline to transform source code into machine code. Passes can analyze code, optimize it, clean up after other passes, or implement target-specific transformations, with the pass manager coordinating their execution. Compiler developers can create custom passes to extend GCC's functionality, as in your project where you're implementing a pass to count basic blocks and GIMPLE statements within functions." - Claude AI (Sonnet 3.7)
Back to square one!
Step 1 - Write a Pass
Once I had built the GCC compiler, I had to look over the existing GCC passes to understand what they look like and pick one of them as a template. To do so, I went to the GCC source tree, where those passes live.
Since I cloned GCC from the git repository, my source code is located at ~/git/gcc:
- Move to the gcc sub-directory with cd gcc. You will end up in ~/git/gcc/gcc, where the actual compiler implementation is located.
- Look for files matching tree-*.cc or tree-*.c to find passes that work on the tree/GIMPLE representation:
ll tree*.cc
This lists the source files of the passes implemented by the GCC developers.
Pick one of these templates as the starting point.
I used the same template as the professor's first example, which can be found at gcc/gcc/tree-nrv.cc.
Throughout this stage, I reproduced the professor's code to make sure I was following along.
The test pass can be found in my git repository. If you want to keep things simple, copy it and follow along with this tutorial: click here to see the source code
This test pass simply writes the names of all compiled functions to the dump file.
My final implementation of the pass can be found here: final pass source code
This implementation prints the name of each function and counts its basic blocks and GIMPLE statements, then prints the totals after every function. NOTE: The outputs are presented in the final step.
Step 2 - Registering the Pass
It is very important to register the pass in passes.def so that GCC recognizes the custom pass. Otherwise, it won't work.
This file is located at ~/git/gcc/gcc/passes.def. It lists the many passes that run during compilation, so the order is important. I decided to do the same as the professor and put my pass directly after:
NEXT_PASS (pass_nrv);
My modified file looks like this:
...
NEXT_PASS (pass_nrv);
NEXT_PASS (tree_amullagaliev);
...
Step 3 - Add Object File
I added the object file for my pass to the OBJS section of Makefile.in. My source file is tree-amullagaliev.cc, so I added tree-amullagaliev.o to the OBJS list.
It should look something like this:
...
tree-dfa.o \
tree-amullagaliev.o \
tree-diagnostic.o \
...
Step 4 - Modify Header File
To make my pass recognizable in the previously modified passes.def, I had to declare it in the tree-pass.h header file, which can be found at ~/git/gcc/gcc/tree-pass.h.
Add this declaration in order to allow GCC to recognize the function:
extern gimple_opt_pass *make_tree_amullagaliev(gcc::context *ctxt);
Step 5 - Re-create the Makefile
As one of the final steps, I had to re-create the Makefile inside my build tree, which is ~/gcc-build-001. This is needed for the changes in Makefile.in to be recognized: the build system doesn't automatically detect changes made only to Makefile.in.
The easiest way is to delete the Makefile in the gcc sub-directory of the build tree, which can be found at ~/gcc-build-001/gcc/Makefile. NOTE: This method avoids rebuilding everything!
IMPORTANT! Don't get me wrong: deleting the Makefile inside ~/gcc-build-001/gcc is required only after changing one file, ~/git/gcc/gcc/Makefile.in. Future modifications of the pass don't require this step!
Here's how it looks in bash:
cd ~/gcc-build-001/gcc
rm Makefile
cd ..
time make -j$(nproc) |& tee build-xxx.log
Results
Once I had rebuilt GCC with the brand new pass, I was ready to test it!
First of all, I had to write some test code, which can be found here. I am leaving it for you to reuse; it closely follows the professor's code:
#include <stdio.h>

int foo(int p1, int p2) {
    return p1*p2;
}

int main() {
    int a = 12;
    int b = 13;
    int c;
    c = foo(a, b);
    printf("%d\n", c);
    return 0;
}
The next thing I had to do was create a Makefile inside the test directory, which I also uploaded to the git repository:
BINARIES=hello
CCFLAGS=-g -O0 -fno-builtin -fdump-tree-amullagaliev
all: ${BINARIES}
hello: hello.c
	gcc ${CCFLAGS} -o hello hello.c
clean:
	rm ${BINARIES} *.o || true
Notice how the flags request a dump file for the pass I have just implemented: -fdump-tree-amullagaliev
Dump File Outputs
After completing all the steps above, I compiled the sample code using the make command:
make hello
Two files will appear: hello and hello.c.265t.amullagaliev.
Here are the dump files produced by the two passes, the test pass and the final pass:
Test-Pass Output:
;; Function foo (foo, funcdef_no=0, decl_uid=3929, cgraph_uid=1, symbol_order=0)
=== FUnction 1 Name 'printf' ===
=== FUnction 2 Name 'main' ===
=== FUnction 3 Name 'foo' ===
#### End amullagaliev diagnostics, start regular dump of current gimple ####
int foo (int p1, int p2)
{
int D.3937;
int _3;
<bb 2> :
_3 = p1_1(D) * p2_2(D);
<bb 3> :
<L0>:
return _3;
}
;; Function main (main, funcdef_no=1, decl_uid=3931, cgraph_uid=2, symbol_order=1)
=== FUnction 1 Name 'printf' ===
=== FUnction 2 Name 'main' ===
=== FUnction 3 Name 'foo' ===
#### End amullagaliev diagnostics, start regular dump of current gimple ####
int main ()
{
int c;
int b;
int a;
int D.3939;
int _7;
<bb 2> :
a_1 = 12;
b_2 = 13;
c_5 = foo (a_1, b_2);
printf ("%d\n", c_5);
_7 = 0;
<bb 3> :
<L0>:
return _7;
}
As you can see, only function names were produced.
Final Pass Output
;; Function foo (foo, funcdef_no=0, decl_uid=2337, cgraph_uid=1, symbol_order=0)
===== Basic block count: 1 =====
----- Statement count: 1 -----
_3 = p1_1(D) * p2_2(D);
===== Basic block count: 2 =====
----- Statement count: 2 -----
<L0>:
----- Statement count: 3 -----
# VUSE <.MEM_4(D)>
return _3;
------------------------------------
Total Basic Blocks: 2
Total Gimple Statements: 3
------------------------------------
int foo (int p1, int p2)
{
int D.2345;
int _3;
<bb 2> :
_3 = p1_1(D) * p2_2(D);
<bb 3> :
<L0>:
return _3;
}
;; Function main (main, funcdef_no=1, decl_uid=2339, cgraph_uid=2, symbol_order=1)
===== Basic block count: 1 =====
----- Statement count: 1 -----
a_1 = 12;
----- Statement count: 2 -----
b_2 = 13;
----- Statement count: 3 -----
# .MEM_4 = VDEF <.MEM_3(D)>
c_5 = foo (a_1, b_2);
----- Statement count: 4 -----
# .MEM_6 = VDEF <.MEM_4>
printf ("%d\n", c_5);
----- Statement count: 5 -----
_7 = 0;
===== Basic block count: 2 =====
----- Statement count: 6 -----
<L0>:
----- Statement count: 7 -----
# VUSE <.MEM_6>
return _7;
------------------------------------
Total Basic Blocks: 2
Total Gimple Statements: 7
------------------------------------
int main ()
{
int c;
int b;
int a;
int D.2347;
int _7;
<bb 2> :
a_1 = 12;
b_2 = 13;
c_5 = foo (a_1, b_2);
printf ("%d\n", c_5);
_7 = 0;
<bb 3> :
<L0>:
return _7;
}
This output meets all the requirements set by the professor, which are listed in the introduction.
These two dump files can be found on GitHub as well: click here.
Code Limitations
I could fill another whole blog post with the limitations of this code; however, this pass only serves as a counter and function name printer :)
As for its capabilities, it does everything listed in the introduction.
Conclusion
I hope someone finds this post helpful and is able to reproduce everything written here!
Honestly, I faced some minor challenges. The first was attention: I had to keep track of many files and always make sure that I had registered the pass in the GCC source files. The second was the time it took to rebuild after deleting the Makefile. For some reason, it took 15 minutes on x86 and just 2 minutes on AArch64. I think this was due to the high load other students put on the servers, since everyone was trying to build. I wouldn't be surprised if someone was building gcc from scratch :D
In my opinion, this is one of the most interesting courses: I am doing something new and mind-blowing. I really appreciate the professor's efforts and explanations. Even though we don't have much documentation, he still manages to deliver the content by writing his own tutorials and explaining everything clearly in his videos.
See you in the next blog posts! I have a lot of things to do: Lab03, Lab05 and the rest of the project stages!