Using Bsearch For Fast Table Lookup
The bsearch() method available in 3ds Max 2009 and higher allows for very fast table lookup operations in sorted arrays.
The problem
Let's assume you have an array of arrays where each sub-array contains the same number
of elements with the same layout. In many cases, you would like to collect this information
just once and then be able to retrieve the data stored in a specific sub-array ("record")
by providing a single search value ("key") that represents the whole record without
having to go through all objects in the scene again.
Real-World Examples
An example would be a list of all Editable Poly objects in the scene including their
name, vertex and poly count and, say, the number of triangles, quad polygons and n-gons
in their geometry. Since the collection of the polygon sides data is relatively slow
(it would require looping through all polygons and asking for their degree), you would
prefer to collect this data just once and then be able to quickly ask "How many tris
and quads does object with name "Box25" contain?"
Using MAXScript Functions for Linear Search
Using a user-made MAXScript function to solve this problem would be relatively slow.
It would require a LINEAR search through the array - in other words, a loop - where
the search key (in the above example, the object's name) is compared with the name
stored in every sub-array of the table. If we would assume that each entry is unique,
then we could stop searching the moment we find a matching key. Obviously, with this
assumption the linear search would be very fast if the object in question happens
to be the first record in the array, but much slower if it happens to be the last
record. The average search time would be somewhere in between.
LINEAR SEARCH EXAMPLE:
|
Below is one of the possible linear search functions used later on this page to compete
against bsearch(). In short, it is passed the search key and the table to search in
and loops through it until it finds a match. At that point, it exits the loop and
returns the current value.
|
fn LookupMAXScriptLoop3 itemKey itemTable = (
for i in itemTable do if i[1] == itemKey do exit with i
)
|
Using Find Item To Speed Up MAXScript Search
A possible acceleration of the MAXScript function search would involve the creation
of a second array containing only the keys from the original table and applying the
findItem() method to it to get the index of the record. While this is technically
also a form of linear search, findItem() is implemented in C++ and is much faster
than a MAXScript FOR loop.
FINDITEM SEARCH EXAMPLE:
|
Here we pass the key, the table to search in and a table of all keys in the same order
as the search table. FindItem() returns 0 if the key does not exist in the list or
a positive index if it does. Using that index, we can return the record with the given
key:
|
fn LookupMAXScriptFindItem itemKey itemTable keyTable =(
local theIndex = findItem keyTable itemKey
if theIndex > 0 then itemTable[theIndex] else undefined
)
|
Using bsearch() To Get There FAST!
The bsearch() function solves the problem by working on a table that is already pre-sorted
by the search key using qsort. It applies the same comparison function used for sorting
to the searching. As you will see below, bsearch() scales great as the size of the
table increases and is independent of the position of the record inside the table
- it needs the SAME AMOUNT OF TIME to find the first, the middle or the last record!
The following test script can be used to test the bsearch() method against various
variations of MAXScript functions.
TEST SCRIPT:
|
(
-- Same compare function used by bsearch and qsort:
fn LookupTableComparator a b = (
if a[1] > b[1] then 1
else if a[1] < b[1] then -1
else 0
)
-- Finds an item in a lookup table; -- A lookup table is a sorted array of items, -- where each item is an array with its key as the same entry as the sort key. -- In our case the LookupTableComparator uses the first element as the key --So we have to make the search key an array too --in order to make it compatible with the qsort function!
fn LookupTableLookup itemKey itemTable = (
local lookupKey = #(itemKey)
bsearch lookupKey itemTable LookupTableComparator
)
--MAXScript Function1 -collects all matches and returns the first, if any
fn LookupMAXScriptLoop1 itemKey itemTable = (
result = for i in itemTable where i[1] == itemKey collect i
if result.count > 0 then result[1] else undefined
)
--MAXScript Function2 -loops until a match is found and returns the result, if any
fn LookupMAXScriptLoop2 itemKey itemTable = (
for i in itemTable do if i[1] == itemKey do return i
undefined
)
--MAXScript Function3- loops until a match is found and exists with the result, if any
fn LookupMAXScriptLoop3 itemKey itemTable = (
for i in itemTable do if i[1] == itemKey do exit with i
)
--MAXScript Function loops using a while - managing the i var. is slower!
fn LookupMAXScriptLoop4 itemKey itemTable = (
local i = 0
result = undefined
while i < itemTable.count do (
i+=1
if itemTable[i][1] == itemKey do (
result = itemTable[i]
i = itemTable.count + 1
)
)
result
)
fn LookupMAXScriptFindItem itemKey itemTable keyTable = (
local theIndex = findItem keyTable itemKey
if theIndex > 0 then itemTable[theIndex] else undefined
)
--TEST VALUES:
local searchKey ="5498"--the key to search for
local iterations = 10--number of search calls
local tableElements = 10000--number of elements in the table
--Create an array to search in. --Its elements will have the form #("NNN", "TestNNN", NNN).
--But the table could contain ANYTHING, this is just an example
searchTable =for i = 1 to tableElements collect
#( (i as string), "Test"+(i as string), i)
--For BSearch to work, the search table has to be sorted array!
st = timestamp()
qsort searchTable LookupTableComparator
format "QSort Time: %ms\n" (timestamp()-st)
--Print the current first, middle and last element for best, mean and worst case searches
format "First Element : %\n" searchTable[1]
format "Mid. Element : %\n" searchTable[tableElements/2]
format "Last Element : %\n" searchTable[tableElements]
local returnValue--variable to show the result
local st =timestamp()
for i = 1 to iterations do returnValue = LookupMAXScriptLoop1 searchKey searchTable
local totalTime =(timestamp()-st)
format "MXS Loop 1 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
format "%\n" returnValue
st =timestamp()
for i = 1 to iterations do returnValue = LookupMAXScriptLoop2 searchKey searchTable
local totalTime =(timestamp()-st)
format "MXS Loop 2 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
format "%\n" returnValue
st =timestamp()
for i = 1 to iterations do returnValue = LookupMAXScriptLoop3 searchKey searchTable
local totalTime =(timestamp()-st)
format "MXS Loop 3 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
format "%\n" returnValue
st =timestamp()
for i = 1 to iterations do returnValue = LookupMAXScriptLoop4 searchKey searchTable
local totalTime =(timestamp()-st)
format "MXS Loop 4 - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
format "%\n"returnValue
local keyTable = (for i in searchTable collect i[1]) --collect all keys
local st =timestamp()
for i = 1 to iterations do
returnValue = LookupMAXScriptFindItem searchKey searchTable keyTable
local totalTime =(timestamp()-st)
format "FindItem - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
format "%\n"returnValue
st =timestamp()
for i = 1 to iterations do returnValue = LookupTableLookup searchKey searchTable
local totalTime =(timestamp()-st)
format "BSEARCH - Total: %ms Per Search: %ms\n" totalTime (1.0*totalTime/iterations)
format "%\n"returnValue
)
|
OUTPUT:
|
QSort Time: 375ms
First Element : #("1", "Test1", 1)
Mid. Element : #("5498", "Test5498", 5498)
Last Element : #("9999", "Test9999", 9999)
MXS Loop 1 - Total: 18812ms Per Search: 18.812ms
#("5498", "Test5498", 5498)
MXS Loop 2 - Total: 9750ms Per Search: 9.75ms
#("5498", "Test5498", 5498)
MXS Loop 3 - Total: 9594ms Per Search: 9.594ms
#("5498", "Test5498", 5498)
MXS Loop 4 - Total: 18609ms Per Search: 18.609ms
#("5498", "Test5498", 5498)
MXS FindItem - Total: 1812ms Per Search: 1.812ms
#("5498", "Test5498", 5498)
BSEARCH - Total: 0ms Per Search: 0.0ms
#("5498", "Test5498", 5498)
OK
|
In a table of 10,000 records sorted using qsort, key "5498" is exactly in the middle
of the sorted table.
As we can see, the MXS Loop 1 performs as bad as in the worst case scenario of "9999"
because it has to always loop through all elements.
The MXS Loop 2 and 3 functions finish in half the time because they break in the middle
of the loop. If we were searching for "9999", their results would be close to those
of MXS Loop 1. But if we were searching for key "1", they would be very fast because
it would be the first thing they would hit when testing.
The FindItem approach proves to be an order of magnitude faster.
Finally, the BSEARCH method is so fast we cannot even stop its time (less than 1 millisecond
for 1000 calls!)
The following table shows the per search times with different table sizes and the
best, mean and worst cases for all methods. A large number of iterations (between
1000 and 100000 depending on the number of records) were used to ensure high precision.
Elements
|
Key
|
Loop 1
|
Loop 2
|
Loop 3
|
Loop 4
|
FindItem
|
Bsearch
|
10
|
"1"
|
0.01734ms
|
0.06266ms
|
0.04093ms
|
0.00813ms
|
0.00375ms
|
0.01266ms
|
|
"4"
|
0.01719ms
|
0.06781ms
|
0.04562ms
|
0.01844ms
|
0.00421ms
|
0.00813ms
|
|
"9"
|
0.01657ms
|
0.07438ms
|
0.05156ms
|
0.03015ms
|
0.00469ms
|
0.01062ms
|
100
|
"1"
|
0.13031ms
|
0.06437ms
|
0.04078ms
|
0.00797ms
|
0.00406ms
|
0.02032ms
|
|
"53"
|
0.12922ms
|
0.12297ms
|
0.10454ms
|
0.13016ms
|
0.00829ms
|
0.00812ms
|
|
"99"
|
0.12984ms
|
0.18688ms
|
0.16859ms
|
0.25812ms
|
0.01329ms
|
0.01546ms
|
1000
|
"1"
|
1.3516ms
|
0.0609ms
|
0.0391ms
|
0.0078ms
|
0.0031ms
|
0.0328ms
|
|
"548"
|
1.4187ms
|
0.7156ms
|
0.7000ms
|
1.3000ms
|
0.0547ms
|
0.0078ms
|
|
"999"
|
1.3875ms
|
1.4250ms
|
1.4156ms
|
2.7469ms
|
0.1063ms
|
0.0187ms
|
10000
|
"1"
|
17.344ms
|
0.0620ms
|
0.0470ms
|
0.0160ms
|
0.0160ms
|
0.0470ms
|
|
"5480"
|
17.250ms
|
8.8590ms
|
8.7970ms
|
20.187ms
|
1.6090ms
|
0.0310ms
|
|
"9999"
|
18.641ms
|
18.531ms
|
18.625ms
|
40.375ms
|
3.6870ms
|
0.0310ms
|
100000
|
"1"
|
191.56ms
|
0.0630ms
|
0.0470ms
|
0.2650ms
|
0.0150ms
|
0.5940ms
|
|
"54998"
|
194.38ms
|
97.180ms
|
96.570ms
|
3172.0ms
|
19.530ms
|
0.0150ms
|
|
"99999"
|
187.34ms
|
187.03ms
|
188.59ms
|
5659.22ms
|
38.750ms
|
0.2660ms
|
As you can see from the table, the two best performers are the findItem and bsearch
methods. With very few records in the array, findItem perform a bit faster.
As the number of records in the array increases though, the bsearch approach wins
in both the average (searching for the middle record) and worst case scenarios (searching
for the last record), while the linear search of findItem only wins the best case
scenario. Thus, if you have a very large list of records, for example 100,000, finding
a random record by its key is 1300 times faster using bsearch() than by using findItem()
and over 6400 times faster than the fastest of the MAXScript Linear Search (Loop)
functions.
The above times do not take into account the time necessary to sort the table using
qsort() for bsearch() to be able touse, nor the time necessary to collect a keys array
for the findItem() method. We assume that these are performed in a separate step once
and then a very large number of random searches have to be performed very fast. The
time needed to sort 100,000 records using qsort is around 30 seconds, which equals
to about 800 calls to the findItem function. Thus, if you know that you will be calling
the search function thousands of times, spending 30 seconds to pre-sort the list for
bsearch() use is worth it and would pay off due to the much shorter search times later
on.
Previous Tip
Optimizing Particle Flow Script Operator For Loops For Speed