Working with Unity3D Engine – Decal Editor

I’m going to begin this blog entry by defining what a decal is and how/when it’s used in the development of graphics applications.

Decals are essentially textures of varying resolution. Unlike normal textures, this special type of texture can be applied to any 2D/3D object, at any orientation and at any size. Decals can be used to quickly add a lot of detail to a graphics application.

Decals come in two varieties, either static or dynamic. Static decals take longer to compute but produce a much more accurate result. This form of decal is normally pre-baked during the creation of a level. Dynamic decals, on the other hand, can be created at run-time. They normally sacrifice accuracy in favour of faster computation.

Over the years, numerous decal placement methods have arisen. Almost all advanced game engines support decals in one form or another. The three most basic methods that game developers adopt are as follows.

Quad Mesh – Possibly the simplest form of decal placement in computer graphics. This method places a square 3D mesh made from two triangles at the position of the target surface, oriented along its normal.

Quad meshes are incredibly easy to code and place. Your average game programmer can have a quad-mesh-based system up and running within minutes. Quad meshes do come with a lot of problems, however, especially when placed on curved surfaces. As you can imagine, if you place a quad mesh on a curved surface such as a sphere, the decal edges become visible and seem to stick out, ruining the immersion.

Subdivided Plane Mesh – A direct improvement over the quad mesh decal type. Instead of representing the texture as a single flat surface, the surface is subdivided into smaller surfaces while interpolating the original texture UV coordinates in between.

This allows the plane to morph or bend over the target surface, increasing in accuracy the more it is subdivided. It is a very fast placement method compared to the others; however, it suffers at run time, as the more the plane is subdivided, the greater the rendering overhead.

[Screenshots: subdivided mesh decal prototypes]

Mesh Intersection – Intersection-based decals, more commonly known simply as “Projected Mesh” decals, provide the most accurate decals regardless of the surface they are placed upon.

The projection volume is commonly represented with an oriented bounding box, encapsulated by a tight-fitting axis-aligned bounding box. Simply put, any triangle inside the oriented box is intersected with it. The intersection points, along with any other points within the bounds, are then triangulated to form a surface. This surface, along with new UV coordinates, normals and tangents, is then used with a texture to create a decal.

I will not go into great detail about the intersection algorithms used. I wrote a thorough tutorial, which can be found in the “Tutorials and References” section of this blog; it explains and provides source code for the various intersection algorithms.

Mesh intersection decals suffer from a heavy computation cost; however, they provide arguably the most accurate surface on which a texture can be used. This makes the framework perfect for pre-baked static geometry, where detail can be added during editing to save on computation costs at run-time.

The framework is also able to batch or merge common decals together, which means the run-time performance losses are minimal.

Mesh intersection decals are what was developed alongside the slicer editor for our final year project. Depending on how things go, I may release it on the Unity Asset Store at the end of the year.

[Screenshots: mesh intersection decal prototypes]



I have included numerous screenshots of the various decal methods, taken directly from my prototype experiments while developing the framework.

The oriented bounding box/rectangle intersection tests use the SAT (Separating Axis Theorem) with 3D triangles/quads. More information can be found by simply googling SAT.
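To illustrate the separating-axis idea, here is a minimal 2D sketch for oriented rectangles; this is my own illustration, not the framework’s code (the full 3D box/triangle version also tests face normals and edge cross-product axes):

```java
// Minimal 2D SAT overlap test for oriented rectangles (a sketch of my own,
// not the framework's actual code). Each box is given by its 4 corners in order.
public class Sat2D {
    // Project all corners onto axis (ax, ay) and return the [min, max] interval.
    static double[] project(double[][] corners, double ax, double ay) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double[] c : corners) {
            double d = c[0] * ax + c[1] * ay;
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        return new double[] { min, max };
    }

    public static boolean overlap(double[][] a, double[][] b) {
        // Candidate separating axes: the edge normals of both boxes.
        double[][][] boxes = { a, b };
        for (double[][] box : boxes) {
            for (int i = 0; i < 4; i++) {
                double[] p = box[i], q = box[(i + 1) % 4];
                double ax = -(q[1] - p[1]), ay = q[0] - p[0]; // normal of edge pq
                double[] ia = project(a, ax, ay);
                double[] ib = project(b, ax, ay);
                // A gap on any axis means the boxes are separated.
                if (ia[1] < ib[0] || ib[1] < ia[0]) return false;
            }
        }
        return true; // no separating axis found
    }
}
```

If no axis separates the projected intervals, the shapes overlap; that is the whole theorem for convex shapes.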


Working with Unity3D Engine – Slicer Editor

Ever since my final year of University started, I’ve been working with the Unity3D Engine as part of a group. This blog entry is dedicated to a tool I’m currently working on which allows the user to slice any convex 3D object.

The Beginning

We began the year by working on a Unity student project called “Awakening”. I was in charge of developing the AI for the project. As soon as the first semester ended, I switched from the AI to developing other useful tools which could be used for the project. One such tool is the Slicer Editor, which allows the user to slice objects any number of times, either before runtime or at runtime.

The Technical

I began the work by quickly prototyping existing solutions based on cheap approximation methods and analysing their runtime performance.

The first method was a very naive implementation, which would create two objects of the same type and push the vertices of each onto the cutting plane from opposite directions. Although a very fast solution, its disadvantages outweigh its advantages. A similar solution can now be found on the Asset Store.

Advantages –

Very fast solution

No need for triangulation

Disadvantages –

Not aware of new vertex data, unable to create new UV coordinates

Edges of objects become rounded, and some normals are offset, which creates rendering artefacts

Doubled triangle count for each slice

The second method was more refined; however, it created the new problems of how to triangulate the new set of data and how to ensure normals were generated properly.

The triangulation for the convex cutter is solved easily. When cutting a triangle, every point of intersection can be assumed to lie on the plane defined by that triangle. This reduces the problem from a 3D triangulation to a 2D triangulation. Since the cut is made by a plane which extends to infinity in all axes, we can assume that the point(s) of intersection, along with the triangle’s original points, are going to be convex in nature. Now it is a matter of determining which points of the triangle lie on which side of the plane, and creating a new set of triangles using the new intersection points. I used Andrew’s monotone chain convex hull algorithm, which has a time complexity of O(n log n).
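Andrew’s monotone chain can be sketched as follows; this is a generic 2D implementation of the published algorithm, not the project’s actual code:

```java
import java.util.*;

// Andrew's monotone chain convex hull, O(n log n) (a generic sketch,
// not the slicer editor's actual code). Points are {x, y} pairs.
public class MonotoneChain {
    // Cross product of OA x OB; > 0 means a counter-clockwise turn.
    static double cross(double[] o, double[] a, double[] b) {
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0]);
    }

    // Returns the hull vertices in counter-clockwise order.
    public static List<double[]> hull(List<double[]> pts) {
        List<double[]> p = new ArrayList<>(pts);
        p.sort((u, v) -> u[0] != v[0] ? Double.compare(u[0], v[0])
                                      : Double.compare(u[1], v[1]));
        int n = p.size();
        if (n < 3) return p;
        double[][] h = new double[2 * n][];
        int k = 0;
        for (int i = 0; i < n; i++) {                   // build lower hull
            while (k >= 2 && cross(h[k - 2], h[k - 1], p.get(i)) <= 0) k--;
            h[k++] = p.get(i);
        }
        for (int i = n - 2, t = k + 1; i >= 0; i--) {   // build upper hull
            while (k >= t && cross(h[k - 2], h[k - 1], p.get(i)) <= 0) k--;
            h[k++] = p.get(i);
        }
        return new ArrayList<>(Arrays.asList(Arrays.copyOf(h, k - 1)));
    }
}
```

Because the intersection points of a plane and a convex mesh are convex by construction, running them through this hull directly yields the closing cap’s boundary in order.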

Once the new triangles are created, we must find the UV coordinates of the new vertices. Since we have the UV coordinates of the old triangle, we can use barycentric coordinates to determine the UV coordinates of the intersection points.
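As a sketch of that step (my own illustration, not the project’s code), the barycentric weights of an intersection point inside the old triangle can be used to blend the old triangle’s UVs:

```java
// Barycentric UV interpolation (an illustrative sketch, not the slicer's code).
// Points are {x, y, z}, UVs are {u, v}.
public class BarycentricUV {
    // Barycentric weights (u, v, w) of point p in triangle (a, b, c).
    static double[] weights(double[] a, double[] b, double[] c, double[] p) {
        double[] v0 = sub(b, a), v1 = sub(c, a), v2 = sub(p, a);
        double d00 = dot(v0, v0), d01 = dot(v0, v1), d11 = dot(v1, v1);
        double d20 = dot(v2, v0), d21 = dot(v2, v1);
        double denom = d00 * d11 - d01 * d01;
        double v = (d11 * d20 - d01 * d21) / denom;
        double w = (d00 * d21 - d01 * d20) / denom;
        return new double[] { 1.0 - v - w, v, w };
    }

    // Interpolate the triangle's UVs at point p using its barycentric weights.
    public static double[] uvAt(double[] a, double[] b, double[] c,
                                double[] uvA, double[] uvB, double[] uvC,
                                double[] p) {
        double[] w = weights(a, b, c, p);
        return new double[] {
            w[0] * uvA[0] + w[1] * uvB[0] + w[2] * uvC[0],
            w[0] * uvA[1] + w[1] * uvB[1] + w[2] * uvC[1]
        };
    }

    static double[] sub(double[] x, double[] y) {
        return new double[] { x[0] - y[0], x[1] - y[1], x[2] - y[2] };
    }
    static double dot(double[] x, double[] y) {
        return x[0] * y[0] + x[1] * y[1] + x[2] * y[2];
    }
}
```

The same weights can interpolate normals and tangents, which is why this trick also helps with the normal-generation problem.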

Below are some screenshots of the possible cuts of a triangle by a plane (in 2D) and possible ways to triangulate the new data.

No Intersection – The plane does not intersect the triangle anywhere. In this case, we determine which section of the mesh the triangle belongs to and add it as-is onto that list.





One Intersection – The plane intersects the triangle directly on a vertex, which means the same test will eventually happen for the other side of the mesh. This is where floating point precision comes into play. We can either create an extremely small triangle (which can cause rendering artefacts) or simply treat it as no intersection.





Two Intersections – The plane intersects the triangle in two places, splitting the triangle into two parts. This intersection type optimally generates either two triangles (if cutting from an edge to a vertex) or three triangles (if cutting from an edge to an edge).
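The per-edge test behind these cases can be sketched like this (an illustrative version, not the editor’s actual code): classify each vertex by its signed distance to the plane, and wherever an edge straddles the plane, compute the intersection point:

```java
import java.util.*;

// Triangle/plane intersection sketch (not the slicer editor's actual code).
// The plane is n . x = d; points are {x, y, z}.
public class TriPlaneSplit {
    // Signed distance of p from the plane (n need not be unit length).
    static double side(double[] n, double d, double[] p) {
        return n[0] * p[0] + n[1] * p[1] + n[2] * p[2] - d;
    }

    // Intersection of segment ab with the plane, assuming a and b straddle it.
    static double[] intersect(double[] n, double d, double[] a, double[] b) {
        double da = side(n, d, a), db = side(n, d, b);
        double t = da / (da - db); // parametric position along ab
        return new double[] { a[0] + t * (b[0] - a[0]),
                              a[1] + t * (b[1] - a[1]),
                              a[2] + t * (b[2] - a[2]) };
    }

    // Returns the intersection points of triangle (a, b, c) with the plane:
    // an empty list (no intersection) or two points (the triangle is split).
    public static List<double[]> cut(double[] n, double d,
                                     double[] a, double[] b, double[] c) {
        List<double[]> hits = new ArrayList<>();
        double[][] tri = { a, b, c };
        for (int i = 0; i < 3; i++) {
            double[] p = tri[i], q = tri[(i + 1) % 3];
            if (side(n, d, p) * side(n, d, q) < 0) // edge straddles the plane
                hits.add(intersect(n, d, p, q));
        }
        return hits;
    }
}
```

The "one intersection on a vertex" case above corresponds to a signed distance of exactly zero, which is where an epsilon threshold would be applied in practice.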





The intersection points are also added onto a separate list, which is then run through the convex hull algorithm to generate the closing face of both meshes.

Below are screenshots of various cuts into a 3D Rubik’s cube.

A regular 45 degree cut of a custom cube which uses different UV coordinates per face



An irregular cut of a custom cube which uses different UV coordinates per face



Final words

The current solution is not optimal, and is unusable in real time except on very basic shapes such as cubes. I am currently investigating a method using Voronoi regions which seems faster and more efficient for real-time shattering of objects. I am also investigating methods of offloading both the triangulation code and the intersection code onto the GPU, which would allow for real-time slicing of complex objects.

Current bugs include triangles not being generated in certain cases, which points to floating point errors or imprecision. Other bugs include irregular normals on newly generated triangles, which affect the overall lighting of the sliced objects.

The current prototype can be used to generate closed-off convex objects; however, the close-off feature does not work on concave shapes. Concave shape slicing works in the current prototype without the close-off feature.

Below is the link to our video of Awakening, which showcases the prototype from Semester One of University.

My next blog entry will be about a decal framework I’ve been working on in Unity based off the Intersection code developed for the Object Slicer!

No Luck…

This year is my final year of University, and I’ve begun applying to a number of places for graduate entry. In the coming year we will see where life takes me; until then, it’s Game Development Time!

I’ve begun the year by wrapping up some unfinished business with the “Halfway Game Engine”. The engine is coming along great: I have (finally) added a 3D renderer and am working on incorporating shader support as we speak. I’ve also knocked off most of the “todo” list, including making the broad-phase detectors more generic and adding support for culling (rendering) as well as real-time ray tracing (collision/logic). All that aside, despite all the work I’m putting into this software, I highly doubt I’ll ever make games with it in this lifetime or the next. Despite that, I’m still enjoying the problem-solving process immensely.

While I wait for graduation, I’ve applied as a volunteer developer to a number of places. Amongst them is one of my old-time favourite games, “Wurm Online”. These positions will allow me to apply everything I’ve learned and experimented with to real-life problems, and this experience will eventually be welcomed in the industry.

There is no denying it: eventually I would like to create my own start-up games company; until then, however, I need the experience of working at other companies.

Short entry this time; my next blog entry will be more technical, featuring some runtime data on a new hybrid data structure I developed called the m-tree. So far it surpasses dynamic quad/octrees and solves many of their shortcomings; however, I’d like to confirm the visual integrity of the structure before I publish my findings.

For those curious about “Wurm Online”, you may find me on the “Release” server going under the name of DrHalfway!


Array Implementations

In large scale software, there is always a need for a storage module able to store variables and data in memory. Different storage methods have different advantages and disadvantages, and it all comes down to how the software stores data, how it iterates over data and how it removes data.

Let’s take an example. A linked list is very good if you add and remove data (from the head or tail) very frequently; however, a linked list has a few shortcomings. Random access is O(n), and the access pattern itself is not very cache friendly.

Arrays have always been my favourite storage method, and I’ve always pondered and tinkered with different array implementations, although the base problem of the O(n) shift when removing from random positions remains. The implementation ideas I’m going to share today are for specific use cases only, and using them does require some understanding of their inner workings.

So let’s get down to business; I will attempt to describe all implementations in as much detail as possible.

First Implementation – Bag Array

Bag Array is an array implementation that attempts to speed up frequent insert and remove operations by skipping the expensive O(n) shift of elements when an element is removed. With Bag Array, rather than shifting the elements, the cell where the element was removed is marked as “null” and its index is pushed onto a stack. The stack is checked each time an element is inserted: if it is not empty, an index is popped and the element is added at that index.

The downside of the Bag Array is that it cannot be used where insert order is important, as new elements can end up anywhere in the array. Below is a compilation of the good and bad points of this implementation.


Advantages –

  • Fast insert O(1), except O(n) when the backing array must grow
  • Fast remove O(1) – skips element shifting

Disadvantages –

  • Elements can be added anywhere in the array; don’t use it if element order matters
  • Slightly slower iteration over the full array due to null cells
  • Slightly higher memory usage due to the stack
  • Random access may hit a null cell
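The idea can be sketched as follows; this is a minimal version of my own and the published implementation may differ in detail:

```java
import java.util.*;

// Minimal Bag Array sketch: O(1) insert and remove, no element shifting.
// Removed cells are nulled and their indices are recycled via a stack.
public class BagArray<T> {
    private Object[] data;
    private final Deque<Integer> freeSlots = new ArrayDeque<>(); // stack of null cells
    private int high; // one past the highest slot ever used

    public BagArray(int capacity) { data = new Object[capacity]; }

    // O(1) insert (amortised; O(n) only when the backing array must grow).
    public int add(T value) {
        int i = freeSlots.isEmpty() ? high++ : freeSlots.pop();
        if (i == data.length) data = Arrays.copyOf(data, data.length * 2);
        data[i] = value;
        return i; // the caller keeps the index for later removal
    }

    // O(1) remove: null the cell and remember the free index.
    public void remove(int index) {
        if (data[index] != null) {
            data[index] = null;
            freeSlots.push(index);
        }
    }

    @SuppressWarnings("unchecked")
    public T get(int index) { return (T) data[index]; } // may be null!
}
```

Note how `add` returns the chosen index; since elements can land anywhere, the caller must hold onto it rather than assume insertion order.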

Second Implementation – Shift Array

Shift Array is an alternative implementation of the ArrayList found in the java.util package. Unlike the ArrayList implementation, where an O(n) operation is performed every time an element is removed, the Shift Array only performs the O(n) operation when an insert or an iteration is requested. This makes it excellent if the user removes elements in bulk without adding anything in between. Adding elements without removing is also an O(1) operation.

The downside of the Shift Array implementation is that its performance is slightly lower if the user alternates removing and adding single elements. This forces a shift of the elements on every addition, with a worst case of O(n). Below is a compilation of the good and bad points of this implementation.


Advantages –

  • Insert O(n) on the first insert after a remove, O(1) from then on
  • Fast remove O(1), with the shift of elements deferred to the first insert
  • Fast iteration
  • Bulk shift of the entire array in a single O(n) pass

Disadvantages –

  • Reduced to O(n) insert if only a single remove and a single addition are performed at a time, making this slower than ArrayList
  • The element shift is slightly more complex, as it fills gaps across multiple null cells
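A minimal sketch of the deferred-shift idea (my own version, not necessarily the published one):

```java
import java.util.*;

// Minimal Shift Array sketch: removes only null cells; one O(n) compaction
// pass is deferred until the next insert or iteration.
public class ShiftArray<T> {
    private Object[] data;
    private int size;      // logical length, including null gaps
    private boolean dirty; // true when removes have left gaps

    public ShiftArray(int capacity) { data = new Object[capacity]; }

    // Append at the tail, compacting first if removes are pending.
    public void add(T value) {
        if (dirty) compact(); // the deferred O(n) shift
        if (size == data.length) data = Arrays.copyOf(data, data.length * 2);
        data[size++] = value;
    }

    // O(1) remove: just null the cell; the shift is deferred.
    public void remove(int index) {
        if (data[index] != null) { data[index] = null; dirty = true; }
    }

    // A single O(n) pass fills all gaps left by any number of removes.
    private void compact() {
        int w = 0;
        for (int r = 0; r < size; r++)
            if (data[r] != null) data[w++] = data[r];
        Arrays.fill(data, w, size, null);
        size = w;
        dirty = false;
    }

    public int size() { if (dirty) compact(); return size; }

    @SuppressWarnings("unchecked")
    public T get(int index) { if (dirty) compact(); return (T) data[index]; }
}
```

Removing fifty elements in a row costs fifty O(1) nulls plus one O(n) pass on the next insert, instead of fifty O(n) shifts as with ArrayList.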

Third Implementation – Swap Array

Swap Array is a very interesting implementation based primarily on the Bag Array, except that Swap Array does not keep a stack, which makes it slightly faster. Like Bag Array, both its insert and remove operations are O(1); however, it too is unable to store data where insert order is important. Upon removal at a random position, the Swap Array “swaps” the element at the tail of the array into the index where the element was removed, and the length is reduced by 1. This means that when removing, elements in the structure constantly change their index, so it is very difficult to keep the order of the elements.

The downside of the Swap Array, as mentioned, is that data within will constantly change its position index, which means this implementation cannot be used if element order is important. Below is a compilation of the good and bad points of this implementation.


Advantages –

  • Fast insert O(1), except O(n) when the backing array must grow
  • Fast remove O(1), as elements do not shift
  • Fast iteration, as there are no empty cells

Disadvantages –

  • Cannot keep the order of elements
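The swap-remove trick is small enough to sketch in full (again, my own minimal version):

```java
import java.util.*;

// Minimal Swap Array sketch: O(1) remove by moving the tail element into
// the hole. The array stays dense, but element order is NOT preserved.
public class SwapArray<T> {
    private Object[] data;
    private int size;

    public SwapArray(int capacity) { data = new Object[capacity]; }

    // O(1) amortised append at the tail.
    public void add(T value) {
        if (size == data.length) data = Arrays.copyOf(data, data.length * 2);
        data[size++] = value;
    }

    // O(1) remove: the tail element takes over the removed index.
    public void remove(int index) {
        data[index] = data[--size];
        data[size] = null; // let the GC collect the stale reference
    }

    public int size() { return size; }

    @SuppressWarnings("unchecked")
    public T get(int index) { return (T) data[index]; }
}
```

Because the array never contains gaps, iteration is a tight loop over `[0, size)`, which is what makes this the fastest of the three to traverse.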

Final Notes

I have already implemented the above arrays in Java and in the coming days will post them as an open source project. The link will be posted in the “Tutorials and References” section of this blog. I find these implementations incredibly useful and often use them myself during software development.


Halfway Engine Rework

I’ll be exploring the possibility of an engine cleanup for the HGE. The topics include an all-in-one replacement module for pooling, as well as module independence and utility and math library reworks.

As the engine grows, it becomes considerably more difficult to maintain. Almost a year of added features has created a very large codebase of almost 21k LOC; in order to ensure that the codebase can be easily maintained and new features easily added, a rework will be required. Below I’ll be discussing possible rework modules, modules that will require a rewrite, as well as new modules.

The cleanups/rewrites are needed in preparation for 3D games, as well as advanced rendering for the PC games.

Modules that require a clean rewrite

  • Math Library
  • Collision Sub Classes
  • Renderer
  • Scene Manager
  • Public Accessor
  • Pooling


Modules that require a cleanup

  • Android Core
  • PC Core
  • Common Core
  • GUI System
  • Particle System


New Modules

  • AI Module
  • Networking Module
  • Threading Module
  • 3D Scene Manager
  • Utility support for 3D
  • Save/Load Module


Iterations

  • Generic Container Optimisations
  • New Generic Container (HashMap)
  • New Generic Container (BST)


  • Math Library

Possibly the easiest rewrite of them all. The Math Library is unfortunately a mess, mainly due to poor design decisions made earlier in the engine’s development. It is one of the oldest modules and requires a cleanup, which will replace its Vector with more consistent Vector2 and Vector3 classes, along with a function cleanup of the Matrix2x2, Matrix3x3, Matrix4x4 and Quaternion classes. These classes will have the added benefit of being able to communicate with OpenGL for loading and saving native OpenGL transform matrices for software transforms. The MathLibrary class that holds static math functions will also need to be rewritten; a proposed new name is “MathUtils”.

  • Collision Sub Classes

The collision system works reasonably well; however, due to the continued addition of new features, the whole thing is a mess. A clean abstraction will need to be found for the Collider interface, which defines ALL collidable types, including Circle, AABB2, AABB3, Sphere, OBB2 and OBB3. The OBB2 and OBB3 types will require a complete rewrite, and a more stable continuous collision module is proposed to replace the old method. Finally, the broadphase detectors GRID and QTREE will also undergo the few changes needed to make continuous collision detection work. GJK is proposed for 3D, and the Collision module is to include a reference to the global object list for fast addition and removal.

  • Renderer

The renderer will also be in for a rewrite, which will include a more advanced sprite batcher backed by native Java Buffers, which will also be pooled. The overall renderer will include modules for expandability, to be able to cast shadows and provide lighting support. The Renderer is also proposed to hold references to the global object list for fast addition and removal. The new renderer will support old rendering as well as new rendering techniques with direct shader support.

  • Scene Manager

The Scene Manager is plagued by inconsistency, mainly because it was expanded on demand while the engine was developed. The new Scene Manager will include dedicated 2D and 3D managers for both Android and PC, with support for instantiating and reading/requesting from a server architecture. A new Scene Manager will also be created as part of the engine, dedicated to hosting a server to which the engine can easily connect.

  • Public Accessor

The Public Accessor is inconsistent, and it is very small, only 30 or so lines of code. The current accessor “World” will be replaced by the accessor “hge”, which will be used for accessing all needed subsystems of the engine, including the Scene Managers, Renderers, Collision Modules and core Engine functions.

  • Pooling

An article was written to describe the implementation idea for a new pooling solution called the Bitmap Pool. This new all-in-one pool is to replace the three pooling solutions currently found in the engine: the Generic Pool, the Stack Pool and the Pointer Pool.


  • Android Core

The Android core has not been touched since the first iteration of the engine was developed; it will be getting a much needed cleanup and added functionality. The core, by definition, is responsible for communicating with the operating system and includes all Android specific modules, including the Input/Output streams as well as the Input System.

  • PC Core

Following suit with the Android core, the PC core was developed later to support deployment to the PC platform. It will share the same function interface as the Android core and will house all PC specific modules, including the Input/Output streams as well as the Input System.

  • Common Core

The Common core was developed later to bridge the gap between Android and PC. It includes a simple yet powerful Graphics interface which allows OpenGL across both Android and PC. The cleanup will add to the existing functionality as well as deprecate some unneeded modules/interfaces. All engine interfaces will now, by definition, be part of the common core, including the Android/PC core interfaces.

  • GUI System

The GUI system will be receiving some much needed cleanup, including the deprecation of some unneeded functions/variables. The new GUI system will also receive its own Camera, which will make it compatible with a 3D scene. Instead of emulating its own scene space with the current camera system, it will work independently of the current scene as an additional 2D scene for interaction purposes. It will communicate directly with the core for quick fetching and processing of input events.

  • Particle System

The Particle System and all its utility functions/classes will receive a cleanup, which will implement the new particle system named the “State Based Particle System”. The cleanup will also include new modules for performing the 3D particle effects needed for a 3D game.


  • AI Module

The AI Module will be an attachable/detachable module which can be used with GameObjects. It will include methods both for programming AI logic externally and for automatic pathfinding. The AI module will also be network aware and may make use of the new Threading Module for independent execution.

  • Network Module

The goal of the networking module is quite simple: to provide an interface for hosting/creating multiplayer games. A proposal is to take an existing library aimed specifically at game development and tailor it to the engine. There are numerous good networking libraries available.

  • Threading Module

The threading module will require extensive research to develop properly. It will be used both internally and externally for parallel execution. The interface will include easy-to-use functions such as “merge”, “break”, “queue” and “pause/resume” of independent thread jobs. The library will make extensive use of the new Pool for creating and recycling Thread objects.

  • 3D Scene Manager

The 3D scene manager will be an extension of the 2D scene manager; it will mainly be responsible for managing a 3D world.

  • Utility Support for 3D

New utility functions will be added for 3D, including loaders for the .OBJ and .COLLADA object formats. A utility for easy shader creation may also be explored.

  • Save/Load Module

The Save/Load module will provide a clean interface for saving and loading game specific data in a generic manner. Options will include appending to and replacing existing files, as well as direct communication with database systems such as SQL. Real-time read/write will also be possible via the new Threading Module, which will be developed as part of the “new features”.


  • Generic Container Optimisations

All generic containers will undergo an optimisation process which will bring them up to speed with the standard Java containers. Generic engine containers will also support poolable content.

  • New Generic Container (HashMap)

An optimised HashMap will be added to the generic container list, with support for collision of content as well as fast iteration of content.

  • New Generic Container (BST)

A new BST (Binary Search Tree) will also be added, which can be used with the HashMap to handle collision of data effectively.


Software Optimisations

To optimise or not to optimise, that is the question. Optimisation takes time to do, and sometimes requires an in-depth understanding of both the problem and the platform that the software is written for. It costs the developer time and the employer money, but what are the benefits? And does optimisation always pay off? I’ll be discussing optimisations in this entry based on my experience of working on the Halfway Game Engine.

  • The first question we will explore: what are the benefits?

Proper software optimisation can give a huge boost to overall software speed, although this depends on the size of the overall software and the specific function or module targeted for optimisation. Consider the case where a function takes 200ms to compute, is 400 lines of code and provides critical functionality. There is an opportunity to optimise this function and allow it to compute in 75ms (approx. 2.7x speed gain); it will take approximately one week to write and will increase the line count to 700.

A few observations can be made from this problem analysis

1) The optimisation will give the function an approximately 2.7x speed boost

2) It will take one week of developer time to write the optimisation; at, say, 8 hours a day at $30 an hour for 5 days, it will cost the employer approximately $1200

3) It will increase the LOC (lines of code) by an additional 300

So we have one good point, a pretty decent speed bonus, and two bad points: it takes developer time, and it increases maintenance due to the increase in lines of code.

Whether or not to go for this particular optimisation is now up for debate. If the function in question is only called once during the execution of the software, is it really worth it? What if the function is executed upon user request? Will the optimisation require additions or modifications to other functions in order to work? Again, so many questions.

The reality of this analysis is that the function mainly trades maintenance for speed, ignoring the fact that it will also cost money to roll out. It will run roughly 2.7x faster at the cost of roughly 1.75x more code to maintain. Whether or not this is worth it again depends on how often the function gets used, and on the other implications of rolling out the optimisation to the overall software (will it break other functionality?).

  • The second question we will explore: does optimisation always pay off?

The answer to this question, again, is that it depends on the use case. I will take this time to talk about a certain optimisation performed on the Halfway Game Engine which took two weeks to roll out and provided absolutely no speed benefit, even though during benchmarks the new solution was on average 12-16x faster than the previous one.

The optimisation in question was to replace Java’s default ArrayList data structure in a number of places in the engine. In some cases, the engine requires rapid storage and rapid removal of specific objects. This introduces a problem with ArrayList, which performs an O(n) operation each time an element is removed. The new structure was proposed to defer the O(n) operation until demanded of the structure. For example, rather than performing an O(n) shift each time an object is removed, we could remove 50 objects and then perform a single O(n) pass.

During benchmarks this provided an amazing speed boost, to the point where I spent two weeks rolling out new data structures based on this concept. Although benchmarks showed a clear advantage of the new structures over ArrayList, the reality was that at runtime they provided no increase in overall speed. This begs the question: why?

The answer is simple. The objects in question were only being stored in groups of hundreds, thousands at most. Even though the O(n) operation seems expensive, it was actually a very small portion of the engine’s overall runtime; we are talking about barely 0.1-0.2% of it. Even though the new structure had massive speed gains, they were barely visible in the grand scheme of things. A game running at 60 frames per second on the old solution simply (barely) started fluctuating between 60-61 FPS on Android, and showed no visible speed difference on the PC platform.

What is the moral of the story? Don’t optimise what you don’t have to. Something that only takes up 0.1-0.2% of the entire runtime isn’t worth touching, especially with an optimisation that takes two weeks to develop and roll out. That time can be better spent elsewhere.

  • One final thing I would like to note about optimisations is LOC (lines of code)

LOC is another valid optimisation target that is often ignored. What do I mean by LOC optimisation, and where does it come in handy?

Small scale software ranges between 500-2,000 lines of code, medium software from 2,000-20,000, and large scale software 20,000+. Software that is high in LOC introduces a lot of problems and almost always has no benefits (other than bragging rights). The first problem is maintenance: the higher the LOC, the more maintenance the system requires, which takes developer time and costs the employer money.

The second problem is bugs. No software in the world is without bugs; a good number (and the obvious ones) are fixed during development and refinement, while others are found in production. Sometimes production bugs can be so severe that they require a complete rewrite or an extensive design change of a major part of the system. Higher LOC makes it harder to find these bugs, where they are and how to fix them. This is an even bigger problem in runtime systems: it can take days, if not weeks, to find the source of a bug, which often turns out to be a single line of code.

Consider the functions below; they do exactly the same thing at exactly the same speed. Which version would you write?
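As an illustration, here is a hypothetical pair of my own, standing in for the original listing: two functions with identical behaviour and speed, one several times longer than the other.

```java
// Two equivalent ways to compute the sign of an integer (a hypothetical
// illustration of LOC bloat, not the original post's listing).
public class LocExample {
    // Verbose version: nested branches, many lines.
    static int signVerbose(int x) {
        if (x > 0) {
            return 1;
        } else {
            if (x < 0) {
                return -1;
            } else {
                return 0;
            }
        }
    }

    // Concise version: the same behaviour in a single expression.
    static int signConcise(int x) {
        return x > 0 ? 1 : (x < 0 ? -1 : 0);
    }
}
```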


Optimising LOC comes from experience and good coding standards; sometimes a trade between LOC and readability needs to be made. Writing software with the intention of increasing its LOC for bragging rights will sooner or later run into problems, whether that be a bug that cannot be found within the mess, or readability issues to the point where the developer would rather start from scratch than modify the existing code.

This little article was written with regard to the experience I’ve gained working on the Halfway Game Engine. I hope it delivers the right ideas and is of some use to you, the reader =]



Ballmer Peak

It was only two years ago when the time came to face the horrible reality of having to finally begin my assignment, with only 18 hours left until the deadline. I coded until midnight and decided to pull an all-nighter. The assignment was due at 11:59 am (why 11:59 am is a good time for a deadline is beyond my analysis; perhaps they KNEW that most students would only start the day it was due, so they could pull an all-nighter trying to finish it).

So my next step was to grab a bottle of fine whiskey, something to keep me half awake. Now came the most difficult part of the assignment: Object Serialization. What we had to do was save the object state to a text file when the program shut down, and load it back up when the program started again. A sort of mini database, if you will. At the time this was something I had never done before; reading and writing text files was beyond my ability. I started quickly flipping through lecture notes and managed to make a start on the assignment. This is where things get very interesting.

I woke up the next morning at approximately 10:00 am with a massive headache and half a bottle of whiskey gone. Panicking that I had failed to complete the assignment, I re-opened my Eclipse project only to find that the assignment was complete, almost to perfection. I tested and retested it and everything seemed to be running fine: no crashes, no errors, amazing. Hell, I even seem to have done myself the favour of submitting the assignment online before crashing on the couch.

So then I began looking for an explanation. Could my guardian angel perhaps have been a programmer in his (or her) previous life? Did a thief decide to rob me but think it a better idea to finish my assignment instead? Well, as it turns out, I recently discovered the term “Ballmer Peak”, which in my opinion provides a very nice explanation of this phenomenon. I will let this cartoon tell the rest about the Ballmer Peak.



Now, I’ve actually attempted to hit the “Ballmer Peak” since this incredible incident, however all my attempts have been futile. The best results so far (besides the assignment) were the creation of a next generation MMORPG that seemed to have been coded with my forehead, and another project typed in three different languages which, to me, made absolutely no sense.

Regardless of what it was, I managed to get full marks for the assignment and passed with flying colours.

Tagged ,

Object Pooling

Object pooling is a relatively dead term to modern Java developers, mainly because the new generation JVM is extremely fast at creating and collecting objects, to the point where we don’t really need to worry too much about it as far as performance goes. This holds true even for high performance real time applications such as games or game engines. The only time we would pool objects is if the object in question takes a long time to set up, such as threads and database connection objects, but even then, Java provides some form of pooling utility, such as the Executor framework for threads.

But what about other VMs? For example, the Android Dalvik VM is so slow with garbage collection that it should be avoided like the plague. It is so slow, in fact, that we succumbed to pooling almost every object in our game engine; even quick use objects such as vectors and matrices were reused as much as possible.

Whether or not to use a pooling solution is up for debate. It is definitely practical for Android applications, probably not very practical for desktop applications. Regardless, it is always good to have a nice framework available for when pooling is actually needed. During the development of the Halfway Game Engine numerous pooling solutions were put into place. Today I’ll share the idea for a new pooling solution which should be relatively fast and practical for most needs.

The current design for this pool is simple. Removal from the pool should be O(1) and recycling should also be O(1). It will use a dynamic resizable array for storing the pool objects. Unlike most pooling solutions, this pool is actually responsible for the creation of the objects, and thus it is also responsible for their destruction. The pool will have “n” cells, and a pointer which points at any cell within the 0 to n range. Anything to the left of the pointer is an “in use” object; the pointer’s cell and anything to its right is “free”. With this we can draw a few diagrams to demonstrate.



Diagram 1: Consider a pool of size 8. Its cells, marked in red, are currently unoccupied, i.e. NULL cells.

Diagram 2: Consider the same pool with 4 allocated cells which are currently in use by the application.

Diagram 3: Element 1 is now requested by the application to be recycled; it is marked in blue.

Diagram 4: The pool after the swap of elements. Note the pointer, marked with the black arrow: it is pointing at the first free element.

Diagram 5: Element 0 is now requested by the application to be recycled; it is marked in blue.

Diagram 6: The pool again after the swap of elements. Again, note that the pointer is pointing at the first free element, and anything above it is also free (or NULL).

So from this we can draw a few conclusions about this pool design.

  • The pointer will ALWAYS point at a free element. But as seen in diagram 2, that free element may not have been allocated yet, which means each time an object is requested a null check is needed.
  • The cell to the left of the pointer will ALWAYS be in use, provided pointer > 0.
  • The cell to the right of the pointer will ALWAYS be free (or NULL), provided pointer < the pool’s maximum length.

One more observation: unfortunately, due to the design of this new pool type, the objects themselves will need to be pool aware. At the very least, an ID will need to be stored inside each object representing the cell the object currently occupies. When elements are swapped, this ID also needs to change if the object’s current ID != pointer - 1. How the ID is handled or stored is left to the developer. In the coming weeks I will develop this pool type and hopefully benchmark it to some extent against existing pool designs.
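The design above can be sketched in a few lines of Java. This is only a minimal illustration of the idea, not the eventual Halfway implementation; all names (SwapPool, Poolable, obtain, recycle) are mine.

```java
import java.util.function.Supplier;

/** Sketch of the swap-based O(1) pool described above. */
public class SwapPool<T extends SwapPool.Poolable> {
    /** Pooled objects must be pool aware: they store their current cell index. */
    public interface Poolable {
        int getPoolId();
        void setPoolId(int id);
    }

    private final Poolable[] cells;
    private final Supplier<T> factory; // the pool creates its own objects
    private int pointer = 0;           // always points at the first free cell

    public SwapPool(int capacity, Supplier<T> factory) {
        this.cells = new Poolable[capacity];
        this.factory = factory;
    }

    /** O(1): hand out the cell at the pointer, lazily allocating it (the null check). */
    @SuppressWarnings("unchecked")
    public T obtain() {
        if (pointer == cells.length) throw new IllegalStateException("pool exhausted");
        if (cells[pointer] == null) {      // free but never allocated (diagram 2 state)
            cells[pointer] = factory.get();
        }
        T obj = (T) cells[pointer];
        obj.setPoolId(pointer);
        pointer++;
        return obj;
    }

    /** O(1): swap the recycled object with the last in-use cell, then shrink the used range. */
    public void recycle(T obj) {
        int id = obj.getPoolId();
        int last = pointer - 1;            // last in-use cell
        if (id != last) {                  // swap, and fix the moved object's ID
            Poolable moved = cells[last];
            cells[last] = cells[id];
            cells[id] = moved;
            moved.setPoolId(id);
            obj.setPoolId(last);
        }
        pointer--;
    }

    public int used() { return pointer; }
}
```

Walking through diagrams 3-4: recycling the object in cell 1 of a 4-in-use pool swaps cells 1 and 3, updates both IDs, and moves the pointer down to 3, so everything left of the pointer stays in use.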


Tagged , , ,

Don’t reinvent the wheel

This is a fairly common saying among those of us who write software; it’s been drilled into our brains since the first semester of university or that high school programming course. But is it really such bad practice to reinvent the wheel?

Reinvention takes time. If your favorite programming language already has a “Linked List” data structure implemented, it is probably bad practice to try and reinvent your own. This also holds true for those of us currently working on projects where our time is very valuable: it is probably better to work on that new feature you’ve always wanted for the “Particle System” than to spend a week developing and debugging a data structure that already exists.

In my opinion, the saying does not hold true for those of us who are simply waiting to start the next term of university and have nothing better to do. Reinventing the wheel can be fun, challenging and, of course, very educational. Simply “trusting the implementation” may not always be the right call, especially when working on real time systems. Sometimes it is better to trade space for speed, and other times it is more beneficial to trade speed for space. My post on Boolean Bit Flags is an example of a container which trades speed for space. As suggested in the book “Game Engine Architecture” by Jason Gregory, it is sometimes better to roll your own data structures, which can be optimized for specific tasks. It all depends on how much time is available and what the gains are. If all you’re doing is gaining 2 ms on a container operation that would normally take 200 ms, perhaps it is not worth the time.
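As a small illustration of the speed-for-space trade mentioned above (a sketch of the general bit-flag idea, not the code from the Boolean Bit Flags post):

```java
/** Packs up to 32 boolean flags into a single int.
 *  Space: 4 bytes instead of 32 separate booleans.
 *  Speed: each access pays for a shift and a mask — the classic trade. */
public class BitFlags {
    private int bits;

    public void set(int flag, boolean value) {
        if (value) bits |= (1 << flag);   // turn the bit on
        else       bits &= ~(1 << flag);  // turn the bit off
    }

    public boolean get(int flag) {
        return (bits & (1 << flag)) != 0;
    }
}
```

A hand-rolled container like this beats the general-purpose alternative only in the niche it was built for, which is exactly the point.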

I’ll most likely spend the next two weeks trying to roll my own generic containers in Java, optimized specifically for games development and real time systems in general. I have written generic data structures for the “Halfway Game Engine” before, but I found them to be, on average, slower than Java’s default implementations; having said that, I did not have the time to optimize them as much as I would have liked.

Tagged ,

New Java Project – JColl

To kill some spare time and keep my programming skills sharp, I’ve decided to start a new project called Jcoll. Jcoll will be a lightweight collision detection system for 2D and 3D game development. It will be self contained, with its own math library, and will provide an easy to use common interface for incorporation into other Java projects.

The project itself will be open source. The following features are planned.

Able to detect and solve intersections between the following collider types:

  • Circle (2D)
  • AABB2 (2D)
  • OBB2 (2D)
  • Sphere (3D)
  • AABB3 (3D)
  • OBB3 (3D)
  • The system will not be able to solve collisions between 2D and 3D types, for obvious reasons. For example, a circle cannot collide with an AABB3 since it is missing a dimension.
  • The system will provide either a single broadphase solution, or multiple solutions which can be configured at startup. Currently a quadtree is planned for 2D and an octree for 3D.
  • The system will be capable of performing collision queries at runtime, including ray casting.
  • The system will be able to return data, or sets of data, which can later be used with existing physics systems. However, Jcoll will NOT simulate physics; its sole purpose is collision detection.
  • An internal pooling and ID system will be used for tracking collidable objects; this will make Jcoll independent of the engine it is incorporated with.
  • Jcoll should be fast enough to be used on mobile devices, including Android.
  • There are currently no plans to include pixel perfect collision detection in Jcoll.
  • Jcoll will provide an abstract class called “JObjectBase” which will implement the “Collidable” interface. All objects wishing to be compatible with Jcoll must either extend the abstract class or implement the Collidable interface from scratch.
  • Jcoll will support both “Static” and “Dynamic” type objects. Jcoll will also support a hierarchy of collider types which a single object can share, as well as “offsets” for offsetting the position of a collider’s center.

With this basic feature list in mind, I will begin coding Jcoll within the next few weeks. I suspect a base version will be available in mid to late January. I may port the system to C/C++ if there is time. The project will be hosted as open source as soon as a working base version is available, and tutorials will be written as development continues.
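To make the extend-or-implement idea concrete, here is a hypothetical sketch of what the pair might look like. Only the names “Collidable” and “JObjectBase” come from the plan above; the methods and the -1 “unregistered” convention are assumptions of mine.

```java
/** Assumed interface for objects Jcoll can track. */
interface Collidable {
    int getColliderId();          // ID used by Jcoll's internal pooling/tracking
    void setColliderId(int id);
    boolean isStatic();           // "Static" vs "Dynamic" objects
}

/** Convenience base class: extend this, or implement Collidable from scratch. */
abstract class JObjectBase implements Collidable {
    private int colliderId = -1;  // -1 = not yet registered with Jcoll (assumed convention)

    @Override public int getColliderId() { return colliderId; }
    @Override public void setColliderId(int id) { this.colliderId = id; }
    @Override public boolean isStatic() { return false; }  // dynamic by default
}
```

A game object would then either extend JObjectBase for free ID bookkeeping, or implement Collidable directly if it already has a base class.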

Tagged , ,