Hi there! I've returned with further project progress. I'll demonstrate my method for comparing cloned functions and deciding whether to prune them today.

A Helpful Time-Saving Tip
Instead of repeatedly typing the full path to our custom GCC or running the lengthy export command, I created a simple alias:

I opened the bash configuration file: nano ~/.bashrc
Added this line at the end: alias addgcc="export PATH=\"$HOME/gcc-test-001/bin:\$PATH\""
Activated the changes: source ~/.bashrc
Now I can just type addgcc Whenever I need to switch to my custom compiler
Verify it worked with which gcc

This small change saved me tons of time during development!

Implementing Function Comparison

Now, let's discuss how I implemented the comparison functionality.
Phase 1: Getting the Cloned Functions

In my previous article, I showed how I identify variant cloned functions. I used a map data structure where:

The key is the base function name (like "foo")
The value is a vector of clone names (like "foo.default", "foo.sve2")

My initial approach was to create a method that:

Takes a pointer to this map
Loops through the entries
Creates two function pointers (initially null)
Uses the FOR_EACH_FUNCTION macro to locate the actual function objects
Validates that both functions were found
Prints a confirmation message

I designed this with the assumption that a function with a resolver will have exactly two clone variants.

Here's the code I wrote for this method...

// This function will compare the two variants for each base function.
            // If the two functions are identical then it will print PRUNE, if not then NOPRUNE
            void compare_the_cloned_functions(const std::map<std::string, std::vector<std::string>> &variantMap) {
                // Lets loop and see if we have two variants
                for (const auto &base : variantMap) {

                    // Get the key and values
                    const std::string &baseName = base.first;
                    const std::vector<std::string> &variantValues = base.second;

                    // We assume that each base function has two clones
                    if (variantValues.size()!=2) {
                        fprintf(dump_file, "The number of variants for %s are not %d", baseName.c_str(), 2);
                        continue;
                    }

                    // Instantiate function pointers
                    function *functionOne = nullptr;
                    function *functionTwo = nullptr;

                    // Same logic as before
                    cgraph_node *node;
                    // Loop through each function and return the function we need
                    FOR_EACH_FUNCTION(node) {
                        function *fun = node->get_fun();

                        // Validate it
                        if (!fun) {
                            continue;
                        }
                        std::string functionName(function_name(fun));
                        // 
                        if (functionName == variantValues[0] || (functionName + ".default") == variantValues[0]) {
                            functionOne = fun;
                        }
                        else if (functionName == variantValues[1] || (functionName + ".default") == variantValues[1]) {
                            functionTwo = fun;
                        }
                    }

                    // Validate if both are valid
                    if (!functionOne || !functionTwo) {
                        fprintf(dump_file, "ERROR! Function 1 and function 2 could not be retrieved. Check the logic!\n");
                    }
                    else 
                    {
                        fprintf(dump_file, "Successfully retrieved both function 1 and two, the addresses are as follows: %p,  %p \n", (void*)functionOne, (void*)functionTwo);
                    }
                }
            }

After re-building my custom gcc build, I obtained the following output. Please see the output and image below.

See the output and image below.

[kbshah6@aarch64-002 test-clone]$ cat *-noprune*.kbshah6

;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)

--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.sve2 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
 -------->  Variant: scale_samples.sve2
 -------->  Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0xffff8dbb5a90,  0xffff8d829ea0 

^^^^^^^^LOOK AT THIS LAST LINE!!!!^^^^^^^^^^^

Phase 2 - Comparing the two functions

My next course of action, if you follow my reasoning, would be to compare the two and determine whether they are similar using the function pointers. Now, this may seem easy, but I must admit that it took me a long time.

I used three checks to approach this logic, and because of Professor Chris Tyler, I knew which checks to finish.

Compare the basic block counts
Compare the GIMPLE statement counts
Compare the GIMPLE instructions

Step 1 - Compare basic block counts

We can presume that two functions are not the same if their basic blocks are different. I developed a straightforward method that returns the number of fundamental blocks in a function given as a parameter. Please refer to the reasoning below.

// Get the number of basic blocks
size_t get_number_of_basic_blocks(function* func) {
  // Instantiate counter
  size_t count = 0;

  basic_block bb;
  FOR_EACH_BB_FN(bb, func) {
    count++;
  }

  return count;
}

Step 2 - Compare the GIMPLE statement counts

Same concept as the logic above, I simply compare the number of GIMPLE statements by again passing the function as the parameter.

// Get the number of statements inside the function
size_t get_number_of_gimple_statements(function* func) {
  // Initialize basic block
  basic_block bb;

  // Hold number of GIMPLE values
  std::vector < gimple * > allStatements;

  FOR_EACH_BB_FN(bb, func) {
    for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next( & gsi)) {
      gimple * stmt = gsi_stmt(gsi);
      allStatements.push_back(stmt);
    }
  }

  if (allStatements.size() > 0) {
    fprintf(dump_file, "Number of gimple statements is %zu\n", allStatements.size());
  }

  return allStatements.size();
}

Step 3 - Compare each GIMPLE instruction

If it passes each of the passes above, then the last one is to check each GIMPLE operation. The logic checks if it has the same number of operands (values used in the operation) and opcode (the type of operation). In this function, I pass in each function as a pointer.

// This function performs the gimple code check by checking the operands and number of statements
bool are_functions_identical_by_gimple_code_and_ops(function* f1, function* f2) {
  basic_block bb1;
  basic_block bb2;

  // Create vector of gimple to store statements
  std::vector < gimple * > stmts1, stmts2;

  // Collect all statements from function 1
  FOR_EACH_BB_FN(bb1, f1) {
    for (gimple_stmt_iterator gsi = gsi_start_bb(bb1); !gsi_end_p(gsi); gsi_next( & gsi)) {
      stmts1.push_back(gsi_stmt(gsi));
    }
  }

  // Collect all statements from function 2
  FOR_EACH_BB_FN(bb2, f2) {
    for (gimple_stmt_iterator gsi = gsi_start_bb(bb2); !gsi_end_p(gsi); gsi_next( & gsi)) {
      stmts2.push_back(gsi_stmt(gsi));
    }
  }

  // Compare rhe statement counts
  if (stmts1.size() != stmts2.size()) {
    fprintf(dump_file, "Statement count mismatch: %zu vs %zu\n", stmts1.size(), stmts2.size());
    return false;
  }

  // Compare GIMPLE codes and operand counts
  for (size_t i = 0; i < stmts1.size(); ++i) {
    gimple * s1 = stmts1[i];
    gimple * s2 = stmts2[i];

    enum gimple_code code1 = gimple_code(s1);
    enum gimple_code code2 = gimple_code(s2);
    unsigned ops1 = gimple_num_ops(s1);
    unsigned ops2 = gimple_num_ops(s2);

    if (code1 != code2) {
      fprintf(dump_file, "GIMPLE code mismatch at statement %zu:\n", i);
      fprintf(dump_file, "Function 1: ");
      print_gimple_stmt(dump_file, s1, 0, TDF_NONE);
      fprintf(dump_file, "\nFunction 2: ");
      print_gimple_stmt(dump_file, s2, 0, TDF_NONE);
      fprintf(dump_file, "\n");
      return false;
    }

    if (ops1 != ops2) {
      fprintf(dump_file, "Operand count mismatch at statement %zu:\n", i);
      fprintf(dump_file, "Function 1: %u operands, Function 2: %u operands\n", ops1, ops2);
      return false;
    }
  }
  return true;
}

This concludes Phase 2 of this logic, and the only phase remaining is to output a statement indicating whether to PRUNE or NOPRUNE.

Phase 3 - Output PRUNE or NOPRUNE

The code for this was, hands down, the easiest part of the project (Thankfully). It’s a simple if statement that checks my identical flag.

// Set a flag
bool identical = true;

// Final flag set
if (basicBlockCountForFunctionTwo != basicBlockCountForFunctionOne || gimpleCountForFunctionTwo != gimpleCountForFunctionOne || !gimpleCodeAndOpsCheck) {
  identical = false;
}

// Now just print the PRUNE / NOPRUNE MESSAGE
if (identical) {
  fprintf(dump_file, "PRUNE: %s\n", baseName.c_str());
} else {
  fprintf(dump_file, "NOPRUNE: %s\n", baseName.c_str());
}

Final Output - Testing on AArch64

I developed this pass and first added it into an AArch64 system. Therefore I am going to test it there first.

Testing Instructions

You've completed the development of your compiler pass and are now ready to test it on an AArch64 system first.

To test, go to the gcc-build-001 directory and run these commands:

time make -j$(nproc) |& tee third_build_log_analyzing_cloned_variants_final.log
make install

Next, unzip the test directory from /public/spo600-test-clone.tgz, then:

Change to the test directory: cd ~/spo600/examples/test-clone
Run make clean
Run make

To easily find your specific output files, use:

ls *kbshah6

This command will display just two files that end with your pass name.

[kbshah6@aarch64-002 test-clone]$ ls *kbshah6
clone-test-aarch64-noprune-clone-test-core.c.265t.kbshah6  clone-test-aarch64-prune-clone-test-core.c.265t.kbshah6

Output for NOPRUNE - AArch64

I simply executed the command: cat clone-test-aarch64-noprune-clone-test-core.c.265t.kbshah6 and I received the following output:

[kbshah6@aarch64-002 test-clone]$ cat clone-test-aarch64-noprune-clone-test-core.c.265t.kbshah6

;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)

--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.sve2 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------

------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
 -------->  Variant: scale_samples.sve2
 -------->  Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0xffff8a863a90,  0xffff8a4d7ea0 

---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.sve2) Block Count 5
Function 2: (scale_samples) Block Count 32
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 130

---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.sve2) Gimple Count 24
Function 2: (scale_samples) Gimple Count 130
---------------------------------------------------------
Statement count mismatch: 24 vs 130
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples

Looking at the last line it correctly identifies that there is NOPRUNE. However, unfortunately this is not the case. You will see in the PRUNE example what I mean by that.

Output for PRUNE - AArch64

[kbshah6@aarch64-002 test-clone]$ cat clone-test-aarch64-prune-clone-test-core.c.265t.kbshah6

;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)

--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.rng (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------

------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
 -------->  Variant: scale_samples.rng
 -------->  Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0xffff9254ba90,  0xffff921bfea0 

---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.rng) Block Count 5
Function 2: (scale_samples) Block Count 32
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 130

---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.rng) Gimple Count 24
Function 2: (scale_samples) Gimple Count 130
---------------------------------------------------------
Statement count mismatch: 24 vs 130
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples

My comparison functions obviously have a problem, as you can see. As you can see from the output, I am certain that my detecting cloned logic functions. However, my comparison must be flawed in some way. I'll talk to Professor Chris Tyler about this because I tried for a few hours but was unable to determine the cause. Although this is new to me, it makes logical sense to me.

Nevertheless, I would like to end this piece by stating that even though this phase was really challenging, I believe I gained a lot of knowledge. My knowledge of AFMV, fundamental block identification, gimple statement counts, etc., has improved. These are the foundational ideas for creating something significant. I do have a lot to learn from the regrettable output above, and I hope to resolve the problem.

Final Output - Testing on x86

I completed the same steps above for x86 and listed the output on the terminal below:

Output for NOPRUNE - x86

kbshah6@x86-001:~/spo600/examples/test-clone$ cat clone-test-x86-noprune-clone-test-core.c.265t.kbshah6

;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)

--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.arch_x86_64_v3 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------

------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
 -------->  Variant: scale_samples.arch_x86_64_v3
 -------->  Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0x7effa3c1b750,  0x7effa3cde750 

---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.arch_x86_64_v3) Block Count 5
Function 2: (scale_samples) Block Count 31
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 165

---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.arch_x86_64_v3) Gimple Count 24
Function 2: (scale_samples) Gimple Count 165
---------------------------------------------------------
Statement count mismatch: 24 vs 165
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples

PRUNE Output for AArch64

kbshah6@x86-001:~/spo600/examples/test-clone$ cat clone-test-x86-prune-clone-test-core.c.265t.kbshah6

;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)

--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.popcnt (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------

------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
 -------->  Variant: scale_samples.popcnt
 -------->  Variant: scale_samples.default
-----------------------------------------------------------
Successfully retrieved both function 1 and two, the addresses are as follows: 0x7fc46541b750,  0x7fc4654de750 

---------------------------------------------------------
----------------- Basic Block Counts ---------------------
Function 1: (scale_samples.popcnt) Block Count 5
Function 2: (scale_samples) Block Count 31
---------------------------------------------------------
Number of gimple statements is 24
Number of gimple statements is 165

---------------------------------------------------------
----------------- GIMPLE Counts ---------------------
Function 1: (scale_samples.popcnt) Gimple Count 24
Function 2: (scale_samples) Gimple Count 165
---------------------------------------------------------
Statement count mismatch: 24 vs 165
0 for gimpleCodeAndOpsCheck. (1 means pass, 0 means fail)
NOPRUNE: scale_samples

Once again, there is something wrong with my logic… [ :( ]. I did try really hard to debug this but couldn’t fix it, unfortunately.

Next Steps

My next steps include discussing my solution with Professor Chris Tyler and fixing my solution with some assistance before I engage in the last phase.

Credit & References

For this project stage, I would want to give Professor Chris Tyler credit for his excellent explanations during the previous few weeks' lecture sessions. For more information on this project and GCC passes in general, I strongly advise everyone to click the following link. Although it's difficult, I promise you will gain knowledge from it. This is, without a doubt, one of Seneca College's most distinctive courses.

Search This Blog

SPO600

Project Stage 2: Project Update & Comparing Cloned Functions