Sunday, May 8, 2011

Find Local Maximum of a Function Using Bisection Method

Question: implement the bisection method to find a function's local maximum.

Solution: bisection is one of the root-finding methods that are used to find real roots of a continuous function. More information about the method and mathematical analysis can be found here. For this question, we'll modify the bisection method to find the local maximum of a function instead of its roots.

The maximum of a function is a point where the value of the function is max. Thus, the local maximum is the maximum of a specified interval of the function. It is local because outside of the specified interval, we can't prove that it is the maximum of the function. Here is the code in C:

double bisect(double x_lower, double x_upper, double epsilon, double (*dx)(double), int maxIteration)
{
   double result = 0.0; //the root
   double f_result = 0.0; //value of f(x)
   int iteration = 0;

   while (fabs(x_upper - x_lower) > epsilon && iteration < maxIteration)
   {
      iteration++;

      //compute the new root
      result = (x_upper + x_lower) / 2;

      //derivative of the new root
      f_result = dx(result);
      
      //see if derivative has changed sign
      if (dx(x_lower) * f_result < 0)
         x_upper = result;
      else if (dx(x_upper) * f_result < 0)
         x_lower = result;
      else
      {
         return result;
      }
   }
   printf("No root found, returning -1 ! \n");
   return -1;
}

Explanation: our function accepts five arguments and return the maximum if it finds one.

  1. The first two arguments, x_lower and x_upper, are the lower and upper bound of the interval whose maximum is our interest.
  2. The third argument, epsilon is the accuracy or the acceptable range of error that the answer must meet. Why do we have this? Because in real life, it is extremely difficult to find the exact solution. Some reasons include the limit on computer's ability to store floating point numbers and round off error. Moreover, most real world applications only need the answer to be precise enough instead of being exact. That's why we accept the concept of tolerable error. To keep the error to minimum, I suggest passing in machine epsilon as the acceptable error. Machine epsilon is the smallest difference between two numbers that the computer, on which the program is run, can produce. Thus, if the error is less than machine epsilon, then the answer is the most precise that particular computer can produce.
  3. The fourth argument, *dx, is the pointer to the function's first derivative.
  4. the last argument, maxIteration, is the max number of times that we want our program to run. If the bisection method doesn't converge (it does in some cases), then our program will run forever. The maxIteration is there to prevent that case. After a certain number of calculations and the method has not returned the result, we better find another way to solve the problem right? :)

The basis to find the local maximum is that the derivative of lower and upper bounds have opposite signs (positive vs. negative). Furthermore, after passing through the maximum the derivative changes sign. Therefore, we can run the function until the derivative changes sign. It means that when neither dx(x_upper) * dx(x) < 0 nor dx(x_lower) * dx(x) < 0, then x is the value in interval (x_lower, x_upper) where f(x) is max.

Mathematically, the local maximum is the point where the derivative is 0 (dx = 0). Think about slope of a graph. However, since we are dealing with computer and floating point numbers, it's not a good idea to use equality comparison as the way to find maximum. That's why we exploit the fact that the derivative changes sign after passing through the maximum.

Thanks for reading and until next time.

Tuesday, May 3, 2011

Determine the Height of Binary Trees

Question: given a root of a binary tree, write an algorithm to find the tree's height.

Solution: the height of a binary tree is the length of the path from the root to the deepest leaf node of the tree. For example, the following tree has height of 3:

To find the height, we need to count the number of nodes on the path from root to the deepest leaf. This can be done recursively. Here is the code in C++:

//basic tree node definition
struct Node
{
  int data;
  struct Node* left;
  struct Node* right;
};

int getBinaryTreeHeight(struct Node* root)
{
   if (root == NULL)
      return 0;

   int leftHeight = getBinaryTreeHeight(root->left);
   int rightHeight = getBinaryTreeHeight(root->right);

   return 1 + (leftHeight <= rightHeight ? rightHeight : leftHeight);
}

Explanation: assuming that we have a tree node structure just like that in the example code, our method takes in a pointer to the root and return the height of that root's tree. Recursively, a tree's height is the height of its subtrees. So if we know the height of the subtrees, we know the height of the tree. For each node under the current root, we do the following:

  1. If the current node is null, we simply return 0 because a rooted tree with no root has height 0.
  2. Next we find the height of the left subtree and the right subtree.
  3. Lastly, either left or right subtree's height plus one is returned as the height of the current root's tree. Why do we add one into the return result? Well because the current node is counted as one height unit if it is not null.

By the end of the recursive calls, the returned sum is the height of the tree whose root is the argument to the function.

That's the end of this post. Thanks for reading :)