Using Bsearch For Fast Table Lookup

The bsearch() method available in 3ds Max 2009 and higher allows for very fast table lookup operations in sorted arrays.

The problem

Assume that you have an array of arrays where each sub-array contains the same number of elements with the same layout. In many cases, you will collect this information just once and then retrieve the data stored in a specific sub-array ("record") by providing a single search value ("key") that represents the whole record without having to go through all objects in the scene again.

Real-World Examples

An example is a list of all Editable Poly objects in the scene including their name, vertex, and poly count as well as the number of triangles, quad polygons, and n-gons in their geometry. As the collection of the polygon sides data is relatively slow (it requires looping through all polygons and determining their degree), you can collect this data just once and then be able to quickly ask "How many tris and quads does the object with the name "Box25" contain?"

Using a user-made MAXScript function to solve this problem will be relatively slow. It will require a linear search through the array (a loop) where the search key (in the above example, it is the object's name) is compared with the name stored in each sub-array of the table. If we assume that each entry is unique, then we can stop searching when we find a matching key. With this assumption the linear search will be very fast if the object in question happens to be the first record in the array, but much slower if it happens to be the last record. The average search time will be somewhere in between.

LINEAR SEARCH EXAMPLE:

Below is one of the possible linear search functions used later on this page to compete against bsearch(). In short, the search key and the table to search is passed and it loops through them until it finds a match. At that point, it exits the loop and returns the current value.

   fn LookupMAXScriptLoop3 itemKey itemTable = (
    for i in itemTable do if i[1] == itemKey do exit with i
   ) 

A possible acceleration of the MAXScript function search involves the creation of a second array containing only the keys from the original table and applying the findItem() method to it to get the index of the record. While this is technically also a form of linear search, findItem() is implemented in C++ and is much faster than a MAXScript For loop.

FINDITEM SEARCH EXAMPLE:

Here, we pass the key, the table to search, and a table of all keys in the same order as the search table. FindItem() returns 0 if the key does not exist in the list or a positive index if it does. Using that index, we can return the record with the given key:

   fn LookupMAXScriptFindItem itemKey itemTable keyTable =(
   local theIndex = findItem keyTable itemKey
   if theIndex > 0 then itemTable[theIndex] else undefined
   )

Using bsearch() To Get There FAST

The bsearch() function solves the problem by working on a table that is already pre-sorted by the search key using qsort. It applies the same comparison function used for sorting to searching. As you see below, bsearch() scales great as the size of the table increases and is independent of the position of the record inside the table. It needs the same time to find the first, middle, or last record.

The following test script can be used to test the bsearch() method against various variations of MAXScript functions.

TEST SCRIPT:

   (
   -- Same compare function used by bsearch and qsort:
   fn LookupTableComparator a b = (
   if a[1] > b[1] then 1
   else if a[1] < b[1] then -1
   else 0
   )

   -- Finds an item in a lookup table; -- A lookup table is a sorted array of items, -- where each item is an array with its key as the same entry as the sort key. -- In our case the LookupTableComparator uses the first element as the key --So we have to make the search key an array too --in order to make it compatible with the qsort function!
   fn LookupTableLookup itemKey itemTable = (
   local lookupKey = #(itemKey)
   bsearch lookupKey itemTable LookupTableComparator
   )

   --MAXScript Function1 -collects all matches and returns the first, if any
   fn LookupMAXScriptLoop1 itemKey itemTable = (
    result = for i in itemTable where i[1] == itemKey collect i
   if result.count > 0 then result[1] else undefined
   )

   --MAXScript Function2 -loops until a match is found and returns the result, if any
   fn LookupMAXScriptLoop2 itemKey itemTable = (
   for i in itemTable do if i[1] == itemKey do return i
   undefined
   )

   --MAXScript Function3- loops until a match is found and exists with the result, if any
   fn LookupMAXScriptLoop3 itemKey itemTable = (
   for i in itemTable do if i[1] == itemKey do exit with i
   )

   --MAXScript Function loops using a while - managing the i var. is slower!
   fn LookupMAXScriptLoop4 itemKey itemTable = (
   local i = 0
   result = undefined
   while i < itemTable.count do (
   i+=1
   if itemTable[i][1] == itemKey do (
   result = itemTable[i]
   i = itemTable.count + 1
   )
   )
   result
   )
   fn LookupMAXScriptFindItem itemKey itemTable keyTable = (
     local theIndex = findItem keyTable itemKey
     if theIndex > 0 then itemTable[theIndex] else undefined
   )

   --TEST VALUES:
   local searchKey ="5498"--the key to search for
   local iterations = 10--number of search calls
   local tableElements = 10000--number of elements in the table

   --Create an array to search in. --Its elements will have the form #("NNN", "TestNNN", NNN). 
   --But the table could contain ANYTHING, this is just an example
   searchTable =for i = 1 to tableElements collect
   #( (i as string), "Test"+(i as string), i)

   --For BSearch to work, the search table has to be sorted array!
   st = timestamp()
   qsort searchTable LookupTableComparator
   format "QSort Time: %ms\n" (timestamp()-st)

   --Print the current first, middle and last element for best, mean and worst case searches
   format "First Element : %\n" searchTable[1]
   format "Mid. Element : %\n" searchTable[tableElements/2] 
   format "Last Element : %\n" searchTable[tableElements]

   local returnValue--variable to show the result

   local st =timestamp()
   for i = 1 to iterations do returnValue = LookupMAXScriptLoop1 searchKey searchTable
   local totalTime =(timestamp()-st)
   format "MXS Loop 1 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
   format "%\n" returnValue

   st =timestamp()
   for i = 1 to iterations do returnValue = LookupMAXScriptLoop2 searchKey searchTable
   local totalTime =(timestamp()-st)
   format "MXS Loop 2 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
   format "%\n" returnValue

   st =timestamp()
   for i = 1 to iterations do returnValue = LookupMAXScriptLoop3 searchKey searchTable
   local totalTime =(timestamp()-st)
   format "MXS Loop 3 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
   format "%\n" returnValue

   st =timestamp()
   for i = 1 to iterations do returnValue = LookupMAXScriptLoop4 searchKey searchTable
   local totalTime =(timestamp()-st)
   format "MXS Loop 4 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
   format "%\n"returnValue 

   local keyTable = (for i in searchTable collect i[1]) --collect all keys
   local st =timestamp()
   for i = 1 to iterations do
   returnValue = LookupMAXScriptFindItem searchKey searchTable keyTable
   local totalTime =(timestamp()-st)
   format "FindItem - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
   format "%\n"returnValue 

   st =timestamp()
   for i = 1 to iterations do returnValue = LookupTableLookup searchKey searchTable
   local totalTime =(timestamp()-st)
   format "BSEARCH - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
   format "%\n"returnValue 
   )

OUTPUT:

   QSort Time: 375ms
   First Element : #("1", "Test1", 1)
   Mid. Element : #("5498", "Test5498", 5498)
   Last Element : #("9999", "Test9999", 9999)
   MXS Loop 1 - Total: 18812ms Per Search: 18.812ms
   #("5498", "Test5498", 5498)
   MXS Loop 2 - Total: 9750ms Per Search: 9.75ms
   #("5498", "Test5498", 5498)
   MXS Loop 3 - Total: 9594ms Per Search: 9.594ms
   #("5498", "Test5498", 5498)
   MXS Loop 4 - Total: 18609ms Per Search: 18.609ms
   #("5498", "Test5498", 5498)
   FindItem - Total: 1812ms Per Search: 1.812ms
   #("5498", "Test5498", 5498)
   BSEARCH - Total: 0ms Per Search: 0.0ms
   #("5498", "Test5498", 5498)
   OK

In a table of 10,000 records sorted using qsort, key "5498" is exactly in the middle of the sorted table.

The MXS Loop 1 performs as bad as in the worst case scenario of "9999" because it has to always loop through all elements.

The MXS Loop 2 and 3 functions finish in half the time because they break in the middle of the loop. If we were searching for "9999", their results will be close to those of MXS Loop 1. But, if we were searching for key "1", they will be very fast because it is the first thing they hit when testing.

The FindItem approach proves to be an order of magnitude faster.

Finally, the BSEARCH method is so fast that we cannot even stop its time (less than 1 millisecond for 1000 calls).

The following table shows the per search times with different table sizes and the best, mean, and worst cases for all methods. A large number of iterations (between 1000 and 100000 depending on the number of records) were used to ensure high precision.

Elements Key Loop 1 Loop 2 Loop 3 Loop 4 FindItem Bsearch
10 "1" 0.01734ms 0.06266ms 0.04093ms 0.00813ms 0.00375ms 0.01266ms
"4" 0.01719ms 0.06781ms 0.04562ms 0.01844ms 0.00421ms 0.00813ms
"9" 0.01657ms 0.07438ms 0.05156ms 0.03015ms 0.00469ms 0.01062ms
100 "1" 0.13031ms 0.06437ms 0.04078ms 0.00797ms 0.00406ms 0.02032ms
"53" 0.12922ms 0.12297ms 0.10454ms 0.13016ms 0.00829ms 0.00812ms
"99" 0.12984ms 0.18688ms 0.16859ms 0.25812ms 0.01329ms 0.01546ms
1000 "1" 1.3516ms 0.0609ms 0.0391ms 0.0078ms 0.0031ms 0.0328ms
"548" 1.4187ms 0.7156ms 0.7000ms 1.3000ms 0.0547ms 0.0078ms
"999" 1.3875ms 1.4250ms 1.4156ms 2.7469ms 0.1063ms 0.0187ms
10000 "1" 17.344ms 0.0620ms 0.0470ms 0.0160ms 0.0160ms 0.0470ms
"5480" 17.250ms 8.8590ms 8.7970ms 20.187ms 1.6090ms 0.0310ms
"9999" 18.641ms 18.531ms 18.625ms 40.375ms 3.6870ms 0.0310ms
100000 "1" 191.56ms 0.0630ms 0.0470ms 0.2650ms 0.0150ms 0.5940ms
"54998" 194.38ms 97.180ms 96.570ms 3172.0ms 19.530ms 0.0150ms
"99999" 187.34ms 187.03ms 188.59ms 5659.22ms 38.750ms 0.2660ms

As you can see from the table, the two best performers are the findItem and bsearch methods. With very few records in the array, findItem performs a bit faster.

As the number of records in the array increases, the bsearch approach wins in both the average (searching for the middle record) and worst case scenarios (searching for the last record), while the linear search of findItem only wins in the best case scenario. Thus, if you have a very large list of records, for example 100,000, finding a random record by its key is 1300 times faster using bsearch() than by using findItem() and over 6400 times faster than the fastest of the MAXScript Linear Search (Loop) functions.

The above times do not take into account the time necessary to sort the table using qsort() for bsearch() to be able to use, nor the time necessary to collect a keys array for the findItem() method. We assume that these are performed in a separate step once and then a very large number of random searches have to be performed very fast. The time needed to sort 100,000 records using qsort is around 30 seconds, which equals to about 800 calls to the findItem function. Thus, if you know that you will be calling the search function thousands of times, spending 30 seconds to pre-sort the list for bsearch() use is worth it and will pay off due to the much shorter search times later on.

Previous Tip

Optimizing Particle Flow Script Operator For Loops For Speed