Project Stage 2: Project Update & Comparing Cloned Functions
Hi there! I've returned with further project progress. I'll demonstrate my method for comparing cloned functions and deciding whether to prune them today.
A Helpful Time-Saving Tip
Instead of repeatedly typing the full path to our custom GCC or running the lengthy export command, I created a simple alias:
- I opened the bash configuration file:
nano ~/.bashrc
- Added this line at the end:
alias addgcc="export PATH=\"$HOME/gcc-test-001/bin:\$PATH\""
- Activated the changes:
source ~/.bashrc
- Now I can just type
addgcc
Whenever I need to switch to my custom compiler - Verify it worked with
which gcc
This small change saved me tons of time during development!
Implementing Function Comparison
Now, let's discuss how I implemented the comparison functionality.
Phase 1: Getting the Cloned Functions
In my previous article, I showed how I identify variant cloned functions. I used a map data structure where:
- The key is the base function name (like "foo")
- The value is a vector of clone names (like "foo.default", "foo.sve2")
My initial approach was to create a method that:
- Takes a pointer to this map
- Loops through the entries
- Creates two function pointers (initially null)
- Uses the
FOR_EACH_FUNCTION
macro to locate the actual function objects - Validates that both functions were found
- Prints a confirmation message
I designed this with the assumption that a function with a resolver will have exactly two clone variants.
Here's the code I wrote for this method...
// This function will compare the two variants for each base function.
// If the two functions are identical then it will print PRUNE, if not then NOPRUNE
void compare_the_cloned_functions(const std::map<std::string, std::vector<std::string>> &variantMap) {
// Lets loop and see if we have two variants
for (const auto &base : variantMap) {
// Get the key and values
const std::string &baseName = base.first;
const std::vector<std::string> &variantValues = base.second;
// We assume that each base function has two clones
if (variantValues.size()!=2) {
fprintf(dump_file, "The number of variants for %s are not %d", baseName.c_str(), 2);
continue;
}
// Instantiate function pointers
function *functionOne = nullptr;
function *functionTwo = nullptr;
// Same logic as before
cgraph_node *node;
// Loop through each function and return the function we need
FOR_EACH_FUNCTION(node) {
function *fun = node->get_fun();
// Validate it
if (!fun) {
continue;
}
std::string functionName(function_name(fun));
//
if (functionName == variantValues[0] || (functionName + ".default") == variantValues[0]) {
functionOne = fun;
}
else if (functionName == variantValues[1] || (functionName + ".default") == variantValues[1]) {
functionTwo = fun;
}
}
// Validate if both are valid
if (!functionOne || !functionTwo) {
fprintf(dump_file, "ERROR! Function 1 and function 2 could not be retrieved. Check the logic!\n");
}
else
{
fprintf(dump_file, "Successfully retrieved both function 1 and two, the addresses are as follows: %p, %p \n", (void*)functionOne, (void*)functionTwo);
}
}
}
After re-building my custom gcc build, I obtained the following output. Please see the output and image below.
See the output and image below.
[kbshah6@aarch64-002 test-clone]$ cat *-noprune*.kbshah6
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.sve2 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.sve2
--------> Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0xffff8dbb5a90, 0xffff8d829ea0
^^^^^^^^LOOK AT THIS LAST LINE!!!!^^^^^^^^^^^
My next course of action, if you follow my reasoning, would be to compare the two and determine whether they are similar using the function pointers. Now, this may seem easy, but I must admit that it took me a long time.
I used three checks to approach this logic, and because of Professor Chris Tyler, I knew which checks to finish.
Compare the basic block counts
Compare the GIMPLE statement counts
Compare the GIMPLE instructions
Step 1 - Compare basic block counts
We can presume that two functions are not the same if their basic blocks are different. I developed a straightforward method that returns the number of fundamental blocks in a function given as a parameter. Please refer to the reasoning below.
// Get the number of basic blocks
size_t get_number_of_basic_blocks(function* func) {
// Instantiate counter
size_t count = 0;
basic_block bb;
FOR_EACH_BB_FN(bb, func) {
count++;
}
return count;
}
Step 2 - Compare the GIMPLE statement counts
Same concept as the logic above, I simply compare the number of GIMPLE statements by again passing the function as the parameter.
// Get the number of statements inside the function
size_t get_number_of_gimple_statements(function* func) {
// Initialize basic block
basic_block bb;
// Hold number of GIMPLE values
std::vector < gimple * > allStatements;
FOR_EACH_BB_FN(bb, func) {
for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next( & gsi)) {
gimple * stmt = gsi_stmt(gsi);
allStatements.push_back(stmt);
}
}
if (allStatements.size() > 0) {
fprintf(dump_file, "Number of gimple statements is %zu\n", allStatements.size());
}
return allStatements.size();
}
Step 3 - Compare each GIMPLE instruction
If it passes each of the passes above, then the last one is to check each GIMPLE operation. The logic checks if it has the same number of operands (values used in the operation) and opcode (the type of operation). In this function, I pass in each function as a pointer.
// This function performs the gimple code check by checking the operands and number of statements
bool are_functions_identical_by_gimple_code_and_ops(function* f1, function* f2) {
basic_block bb1;
basic_block bb2;
// Create vector of gimple to store statements
std::vector < gimple * > stmts1, stmts2;
// Collect all statements from function 1
FOR_EACH_BB_FN(bb1, f1) {
for (gimple_stmt_iterator gsi = gsi_start_bb(bb1); !gsi_end_p(gsi); gsi_next( & gsi)) {
stmts1.push_back(gsi_stmt(gsi));
}
}
// Collect all statements from function 2
FOR_EACH_BB_FN(bb2, f2) {
for (gimple_stmt_iterator gsi = gsi_start_bb(bb2); !gsi_end_p(gsi); gsi_next( & gsi)) {
stmts2.push_back(gsi_stmt(gsi));
}
}
// Compare rhe statement counts
if (stmts1.size() != stmts2.size()) {
fprintf(dump_file, "Statement count mismatch: %zu vs %zu\n", stmts1.size(), stmts2.size());
return false;
}
// Compare GIMPLE codes and operand counts
for (size_t i = 0; i < stmts1.size(); ++i) {
gimple * s1 = stmts1[i];
gimple * s2 = stmts2[i];
enum gimple_code code1 = gimple_code(s1);
enum gimple_code code2 = gimple_code(s2);
unsigned ops1 = gimple_num_ops(s1);
unsigned ops2 = gimple_num_ops(s2);
if (code1 != code2) {
fprintf(dump_file, "GIMPLE code mismatch at statement %zu:\n", i);
fprintf(dump_file, "Function 1: ");
print_gimple_stmt(dump_file, s1, 0, TDF_NONE);
fprintf(dump_file, "\nFunction 2: ");
print_gimple_stmt(dump_file, s2, 0, TDF_NONE);
fprintf(dump_file, "\n");
return false;
}
if (ops1 != ops2) {
fprintf(dump_file, "Operand count mismatch at statement %zu:\n", i);
fprintf(dump_file, "Function 1: %u operands, Function 2: %u operands\n", ops1, ops2);
return false;
}
}
return true;
}
This concludes Phase 2 of this logic, and the only phase remaining is to output a statement indicating whether to PRUNE
or NOPRUNE
.
Phase 3 - Output PRUNE or NOPRUNE
The code for this was, hands down, the easiest part of the project (Thankfully). It’s a simple if statement that checks my identical flag.
// Set a flag
bool identical = true;
// Final flag set
if (basicBlockCountForFunctionTwo != basicBlockCountForFunctionOne || gimpleCountForFunctionTwo != gimpleCountForFunctionOne || !gimpleCodeAndOpsCheck) {
identical = false;
}
// Now just print the PRUNE / NOPRUNE MESSAGE
if (identical) {
fprintf(dump_file, "PRUNE: %s\n", baseName.c_str());
} else {
fprintf(dump_file, "NOPRUNE: %s\n", baseName.c_str());
}
Final Output - Testing on AArch64
I developed this pass and first added it into an AArch64 system. Therefore I am going to test it there first.
Testing Instructions
You've completed the development of your compiler pass and are now ready to test it on an AArch64 system first.
To test, go to the gcc-build-001 directory and run these commands:
time make -j$(nproc) |& tee third_build_log_analyzing_cloned_variants_final.log
make install
Next, unzip the test directory from /public/spo600-test-clone.tgz, then:
- Change to the test directory:
cd ~/spo600/examples/test-clone
- Run
make clean
- Run
make
To easily find your specific output files, use:
ls *kbshah6
This command will display just two files that end with your pass name.
[kbshah6@aarch64-002 test-clone]$ ls *kbshah6
clone-test-aarch64-noprune-clone-test-core.c.265t.kbshah6 clone-test-aarch64-prune-clone-test-core.c.265t.kbshah6
Output for NOPRUNE - AArch64
I simply executed the command: cat clone-test-aarch64-noprune-clone-test-core.c.265t.kbshah6
and I received the following output:
[kbshah6@aarch64-002 test-clone]$ cat clone-test-aarch64-noprune-clone-test-core.c.265t.kbshah6
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.sve2 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.sve2
--------> Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0xffff8a863a90, 0xffff8a4d7ea0
---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.sve2) Block Count 5
Function 2: (scale_samples) Block Count 32
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 130
---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.sve2) Gimple Count 24
Function 2: (scale_samples) Gimple Count 130
---------------------------------------------------------
Statement count mismatch: 24 vs 130
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples
Looking at the last line it correctly identifies that there is NOPRUNE. However, unfortunately this is not the case. You will see in the PRUNE example what I mean by that.
Output for PRUNE - AArch64
[kbshah6@aarch64-002 test-clone]$ cat clone-test-aarch64-prune-clone-test-core.c.265t.kbshah6
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.rng (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.rng
--------> Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0xffff9254ba90, 0xffff921bfea0
---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.rng) Block Count 5
Function 2: (scale_samples) Block Count 32
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 130
---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.rng) Gimple Count 24
Function 2: (scale_samples) Gimple Count 130
---------------------------------------------------------
Statement count mismatch: 24 vs 130
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples
My comparison functions obviously have a problem, as you can see. As you can see from the output, I am certain that my detecting cloned logic functions. However, my comparison must be flawed in some way. I'll talk to Professor Chris Tyler about this because I tried for a few hours but was unable to determine the cause. Although this is new to me, it makes logical sense to me.
Nevertheless, I would like to end this piece by stating that even though this phase was really challenging, I believe I gained a lot of knowledge. My knowledge of AFMV, fundamental block identification, gimple statement counts, etc., has improved. These are the foundational ideas for creating something significant. I do have a lot to learn from the regrettable output above, and I hope to resolve the problem.
Final Output - Testing on x86
I completed the same steps above for x86 and listed the output on the terminal below:
Output for NOPRUNE - x86
kbshah6@x86-001:~/spo600/examples/test-clone$ cat clone-test-x86-noprune-clone-test-core.c.265t.kbshah6
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.arch_x86_64_v3 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.arch_x86_64_v3
--------> Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0x7effa3c1b750, 0x7effa3cde750
---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.arch_x86_64_v3) Block Count 5
Function 2: (scale_samples) Block Count 31
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 165
---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.arch_x86_64_v3) Gimple Count 24
Function 2: (scale_samples) Gimple Count 165
---------------------------------------------------------
Statement count mismatch: 24 vs 165
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples
PRUNE Output for AArch64
kbshah6@x86-001:~/spo600/examples/test-clone$ cat clone-test-x86-prune-clone-test-core.c.265t.kbshah6
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.popcnt (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.popcnt
--------> Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0x7fc46541b750, 0x7fc4654de750
---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.popcnt) Block Count 5
Function 2: (scale_samples) Block Count 31
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 165
---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.popcnt) Gimple Count 24
Function 2: (scale_samples) Gimple Count 165
---------------------------------------------------------
Statement count mismatch: 24 vs 165
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples
Once again, there is something wrong with my logic… [ :( ]. I did try really hard to debug this but couldn’t fix it, unfortunately.
Next Steps
My next steps include discussing my solution with Professor Chris Tyler and fixing my solution with some assistance before I engage in the last phase.
Comments
Post a Comment