About

Report Name

 

      On General Binary Search SQLi Algorithm Theory   

 

Report Author

 

     Bedirhan Urgun, urgunb at hotmail dot com, www.webguvenligi.org

 

Report Date

 

      February 2008

General Binary Search Algorithm

The Theory

 

What interests us is the rate of growth / order of growth of the running time. And the running time of an algorithm on a particular input is the number of primitive operations or "steps" executed. It is convenient to define the notion of step so that it is as machine-independent as possible. In our case "step" is # of requests.

In RAM (Random Access Machine) model a step is after another, in other words there is no concurrency.

Theta(g(n)) = {f(n): for constants c0, c1, n0;  0<=c1.g(n) <= f(n) <= c2.g(n) for all n>=n0}

Analysis

 

Let db record's length be fetched be n and the function of # of requests produced by the bisection algorithm be f(n).

 

For a single character, # of requests can be bounded between;

 

1 <= {# of requests to find a single character} <= 7

 

The best case, which is the first try being a hit, is 1 and the worst case is 7, which is the last try being a hit or none for that matter. For a whole data (db record) of length n this becomes;

 

1.n <= f(n) <= 7.n,

0 <= 1.n <= f(n) <= 7.n for n >= 1, where c0 = 1, c1 = 7, n0 = 1

as such Theta(n) = f(n), which basically says the order of growth of the tools' running time (utilizing bisection algorithm) is linear in terms of the length of the data to be fecthed. 

 

Algorithms (Pseudocode)

 

1.

int search(c0, c1){

     p = (c1 - c0)/2 + c0

     if(c0 == p)

         return c0

     r = query(p) // tests if (x < p) condition where x is the current character's ascii value

     if(r)

         return search(c0,p)

     else

         return search(p,c1)

}

2.

 

int search(c0, c1){

     p = (c1 - c0)/2 + c0

     r = query1(p) // tests if (x < p) condition where x is the current character's ascii value

     if(r)

         return search(c0,p)

     else{

         r = query2(p) // tests if (x > p) condition where x is the current character's ascii value

         if(r)

             return search(p,c1)

         else

             return p;

     }

}

 3.

 

 ...

 

Optimizing the Binary Search Algorithm in Blind SQL Injection

 

Theta(n) also shows that the algorithm used is asymptotically optimal. The complexity for even the "mappable blind sql injection", which resembles B-trees, is Theta(n).

 

 

To reduce the searching operation, a binary search tree should be height-balanced (self-balancing or AVL), which is keeping number of levels after the root element at a bare minimum. A tool (blind sql injector) has to create a balanced search tree, however, for performance reasons the tree may not strictly be height-balanced. It may be more efficient to keep a custom-balanced binary search tree. For readable ASCII domain comprises 32 to 126 decimal. If we can use uppercase function and deprioritize {, |, } and ~ characters, then we can proiritize the 65-90 (A-Z) range when creating our binary search tree.

 

Other characters (32-64 and 91-126) will be attached as the left child of 65 and right child of 90, respectively. We can even remove 97-122 (lower case characters) from the 91-126 domain.  So, when for the majority of the data being alphabet characters, we can reduce the # of requests. For the algorithm number 1 above, most of the alphabet characters, except 90 and 65, # requests will be 5, instead of 7. On the other hand, implementing this optimization will be harder.

There may of course other custom-balanced binary search trees exist in the tool for numerical heavy data and classes alike. Moreover, the tool can switch between custom-balanced binary search trees according to miss rates increasing for the selected binary search tree in use. 

Side Note

How do tools understand the Enf Of Input?

    1.1 extra character requests (sqlmap)
    1.2 checking against NULL (bsql)
    1.3 calculating LENGTH  and acting accordingly (sqlix, absinthe)


References:

[1] Introduction to Algorithms; Cormen, Leiserson and Rivest

[2] http://webpages.ull.es/users/jriera/Docencia/AVL/AVL%20tree%20applet.htm