Monday, March 14, 2011

Find all subsets of a given set

If we're given a set of integers such that S = {1, 2, 3}, how can we find all the subsets of that set? For example, given S, the subsets are {}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}.

To solve this problem there are two techniques that we need to understand:

  1. Determine the number of subsets using bit strings: we know that the number of subsets equals to 2 to the power of the number of elements in the superset: #subsets = 2 ^ #elements. For instance, if our superset is S = {1, 2, 3}, then it has 2^3 = 8 subsets. If our superset has four elements then it has 2^4 = 16 subsets. Moreover, by shifting the bit representation of the number 1 by n, we also get 2^n. Thus, if we shift the bit string of 1 by the number of elements in the superset, we'll get the number of subsets for that superset. For example, if we have S = {1, 2, 3}, then there are 1 << 3 = 2^3 subsets in S.
  2. Keeping track of data using bit manipulation: for this problem, we will use a bit string to keep track of subsets and their elements. Take S = {1, 2, 3} as an example. If the bit string is 100 then the subset we're working on has only one element which is 1. If the bit string is 110 then the subset we're working on has two elements which are 1 and 2. We will use & and << bit operators to figure out which bit is 1 in the bit string.

If you are not familiar with bit manipulation please read "Bit manipulation tips & tricks" part 1 and part 2. Here is the code for the algorithm written in Java:

private static void findSubsets(int array[])
{
  int numOfSubsets = 1 << array.length; 

  for(int i = 0; i < numOfSubsets; i++)
 {
    int pos = array.length - 1;
   int bitmask = i;

   System.out.print("{");
   while(bitmask > 0)
   {
    if((bitmask & 1) == 1)
     System.out.print(array[pos]+",");
    bitmask >>= 1;
    pos--;
   }
   System.out.print("}");
 }
}

Code explanation:

  1. First, we find out the number of subsets that the superset has by shifting the bit representation of 1 by the number of elements in the superset.
  2. Next, we just loop through each subset and generate its elements accordingly.
  3. Inside the loop, we use pos and bitmask to keep track of the element. Specifically, bitmask is the bit string that represents elements in the current subset. And, we use post to retrieve the correct element from the superset.
  4. The while loop will add the correct element to the subset. Note that bitmask & 1 equals to 1 only when bitmask has a '1' bit at the last position. For example, bitmask = "001" or "011" will make bitmask & 1 equal to 1. That's when we'll add an element into the subset. Why does it work? Well, for each iteration of the while loop, we'll shift bitmask by one bit position to the right (bitmask >>= 1) and we decrement pos by 1 (pos--). Thus, whenever there is a '1' bit at the last bit position, we know exactly which element to add into the subset. If you are confused, don't worry because we'll do an example!
  5. After finishing one subset, we go to a new line and continue processing the rest.

Example: supposed that we have S = {1, 2} as our superset, then numOfSubsets = 0001 << 2 = 0100 = 4 (subsets). The rest of the program runs like this:

  1. i = 0, pos = 1, and bitmask = 0. First while loop iteration, bitmask & 1 = 0, bitmask >>= 1 = 0, and pos = 0. While loop exits because bitmask = 0.
  2. i = 1, pos = 1, and bitmask = 1. First while loop iteration, bitmask & 1 = 1, so print out array[pos] = 2. bitmask >>= 1 = 0 and pos = 0. While loop exits because bitmask = 0.
  3. i = 2, pos = 1, and bitmask = 2. First while loop, bitmask & 1 = 0, so print nothing. bitmask >>= 1 = 1 and pos = 0. Second while loop, bitmask & 1 = 1, so print out array[pos] = 1. bitmask >>= 1 = 0 and pos = -1. While loop exits because bitmask = 0.
  4. i = 3, pos = 1, and bitmask = 3. First while loop, bitmask & 1 = 1, so print out array[pos] = 2. bitmask >>= 1 = 1 and pos = 0. Second while loop, bitmask & 1 = 1, so print array[pos] = 1. bitmask >>= 1 = 0, and pos = -1. While loop exits because bitmask = 0.

The result is 4 subsets: {}, {2}, {1} and {2, 1}

This algorithm can be challenging to understand if you don't have any knowledge about bit manipulation. So, if you have a hard time making sense of it, try to read about bit manipulation first! That will help a lot! Well, thanks for reading and until next time.

17 comments:

  1. Great! Thank you so much, it really helped me out

    ReplyDelete
  2. good algorithm for entry-level java programmer as algorithm interview question

    ReplyDelete
  3. thanks .i help to recall the idea

    ReplyDelete
  4. Wow that's really cool, thank you :-)

    ReplyDelete
  5. What will be the time complexity of this algorithm?

    ReplyDelete
  6. super cool.
    Simple combination nC0+nC1+...+nCn = 2^n
    And visualize all the nos between 0 - 2^n as binaries

    ReplyDelete
  7. that's a nice explanation...Thanks a lot..!!

    ReplyDelete
  8. How can you solve the problem if number of sets is greater than sizeof(int)

    ReplyDelete
  9. Doesn't that leave a stray comma at the end of the subset?

    while(bitmask > 0)
    {
    if((bitmask & 1) == 1)
    System.out.print(array[pos]);
    bitmask >>= 1;
    if(bitmask) {
    System.out.print(",");
    }
    pos--;
    }
    System.out.print("}");

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. This comment has been removed by the author.

      Delete
  10. Thanks!!
    great explanation ...keep this blog going

    ReplyDelete
  11. Thanks. Simple logic whether you want an element in your subset or not, if bit is set yes, otherwise not. 2^n iterations give all subsets.

    ReplyDelete