Generating String Combinations of Characters in Java

In my post here, I shared an algorithm to generate all possible combinations of a given set of characters for a specified range of lengths. In this post, I shall share a code sample written in Java that implements the algorithm shared before. It does not use any goto keyword and hence does not create sphagetti code.

The code uses the concurrency API of Java and is fully parallelized. It maxes out the CPU cores of a multi-core CPU and achieves a few million iterations per second. The code sample uses a custom utility function ‘power’ for calculating powers of integers. It is made static for more execution speed.
The code is commented in places to help readers understand it.

As with the algorithm, I take no responsibility for what is done with this code.

package stringgenerator;

import java.util.concurrent.*;
import java.util.concurrent.atomic.*;

public class StringGenerator implements Runnable
{
  private static char[] dict = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'};  //allowed characters
  
  private static ExecutorService ex;
  
  private static AtomicLong attempts = new AtomicLong(0); //count the number of combinations generated
  
  private static Object locker = new Object();
  
  private static long maxAttempts = 0; //total possible combinations of characters of specified lengths
  
  private static int maxLen = 0;
  
  private static long getMaxAttempts(int radix, int maxLen)
  {
    maxAttempts = (power(radix, maxLen) - 1)*radix /(radix-1);
    
    return maxAttempts;
  }
  
  //utility function to calculate powers of integers
  private static long power(int base, int exp)
  {
    if(exp < 0)
    {
      return 1/power(base, -exp);
    }
    else if(exp == 0)
    {
      return 1;
    }
    else{
      long pow = 1;
      for(int p = 1; p <= exp; p++)
      {
        pow*=base;
      }
      
      return pow;
    }
  }
  
  //the main work horse of this program, breaking all possible numbers into combinations
  private static void resolveRadix(int len, long num)
  {
    long[] retVal = new long[len];
    
    int pow = 0;
    int i = -1;

    int radix = len;
    
    for(i = 0; power(radix ,i ) <= num; i++ );
    
    for(int j = 0; j < len - i; j++)
    {
      retVal[j] = 0;
    }
    
    for(int k = len - i; k < len; k++)
    {
      long t = power(radix, i - 1);
      if(i > 1)
      {
      
        retVal[k] = (long)((num - (num % t)) / t);
      
      }
      else
      {
        retVal[k] = (num % radix);
      }
      num -= (retVal[k] * t);
      
      i--;
    }
    
    char[] combo = new char[len];
    
    for(int j = 0; j < len; j++)
    {
      combo[j] = dict[(int)retVal[j]];
    }
    
    String str = new String(combo);
    
    final long attemptsVal = attempts.incrementAndGet();
    
    attempts.set(attemptsVal);
  }
  
  private void generate()
  {
    for(int length = 1; length <= maxLen; length++)
    {
      int radix = dict.length;
      
      long max = (long)power(radix, length);

      for(int i=0;i<max;i++)
      {
        resolveRadix(radix, i);
      }
    }
    
  }
  
  public void run()
  {
    generate();
    
    synchronized(locker)
    {
      locker.notify();
    }
      
    ex.shutdownNow();
    
  }
  
  public static void main(String[] args)
  {
    
    ex = Executors.newSingleThreadExecutor();
    
    maxLen = 6;
    
    long atts = getMaxAttempts(dict.length, maxLen);
    
    StringGenerator threads = new StringGenerator();
    
    long start = System.currentTimeMillis();  //execution start time
    
    ex.execute(threads); //execute code parallelly
    
    try{
        
      synchronized(locker)
      {
        locker.wait();
      }
    }
    catch(InterruptedException ie)
    {
      ie.printStackTrace();
    }
    
    long attemptsVal = attempts.get(); //final total number of attempts
    
    long end = System.currentTimeMillis(); //execution end time
    
    double speed = (double)((1000 * attemptsVal)/(end - start)); //rate of generating combos
    
    double seconds = (double)(end - start) / 1000;
    
    System.out.println("per sec = "+speed);
    
    System.out.println("sec = "+seconds);
    
    System.out.println("words = "+attemptsVal);
    
    System.out.println("max words = "+atts);
    
  }
}

Execution after compilation is shown below:

Algorithm Generating String Combinations of Characters

In this post, I shall reveal another algorithm to generate all possible combinations of a given set of characters,of any predetermined range of lengths. For example, if we take a,b,c as possible characters in our combinations, and 1 to 3 as the range of lengths, then we have the following output:

a
b
c
aa
ab
ac
ba
bb
bc
ca
cb
cc
aaa
aab
aac
aba
abb
abc
....
ccc

This algorithm merely generates the combinations as strings of characters. It is upto the user of this algorithm to decide what to do with the generated string. I take no responsibility for what a user might actually do with this algorithm.

It is a non-recursive algorithm and is highly efficient in terms of usage of memory. Being non-recursive, it avoids the pitfall of call stack overflow, even with a large input space.

As with my other algorithm similar to this one here, one needs to implement parallelized code to max out CPUs and achieve a very high number of iterations per second.

The only limitation is that the combinations generated have a large amount of repetition of characters, which is exacerbated when the specified range of lengths is high.

The algorithm is as follows:

1. take length of string from minLen to maxLen, 
   array of characters being considered as dictionary
2. take current length of string as len
3. take radix as length of character dictionary
4. take max as radix to the power of len
5. from num = 0 to num = max - 1, resolve radix len, num
6. resolve radix as follows
   a. take retVal as array
   b. find p = highest power of radix <= num
   c. from j = 0 to len - p - 1, retVal[j] = 0
   d. from k = len - p to len - 1,
      i. t = radix to the power p - 1
      ii. if p > 1, retVal[k] = (num - (num % t)) / t;
          else retVal[k] = (num % radix)
      iii. num -= (retVal[k] * t)
      iv. p--
   e. build string from retVal using 
      str = empty string
      foreach(retVal as r)
          concat (dictionary[r]) with str
   f. do something with str
7. if len = maxLen 
      quit
   else
      increment len by 1 and goto step 2

The cornerstone of this algorithm is the resolve radix subroutine which does the main work in it. It works similar to conversion of decimal numbers to binary numbers. However, the difference is that in each of the bits, instead of just 0 and 1, we can put numbers ranging from 0 to length of the set of possible characters minus one.

In our case, the character set is a,b,c. So the numbers fitting in the bits are 0,1 and 2,with 0 mapping to ‘a’, 1 mapping to ‘b’ and 2 mapping to ‘c’. The total size generated is equal to the current length. So in our example above, there will be three main iterations, with the total bit size of the ‘number’ ranging from 1 to 3.

So the bit combinations range from

0
1
2
00
01
02
10
11
12
20
21
22
000
001
002
….
222

Using our mapping rules above, we get a from 0, b from 1, c from 2, aa from 00, ab from 01, ac from 02, etc.

Let this be a Valentine’s Day gift for everyone!