Best practices

This section outlines best practices that you should follow when you are creating an OpenGL ES app.

App design

The following list provides best practices for app design.

Best practices

Manage the flow of control from the app to the GPU. Use parallelism (or parallel processing) when appropriate. For more information, see Parallel processing.

Choose a target device (or target devices) and set a benchmark for performance of your app on these devices. For more information, see Performance.

Access the default framebuffer using OpenGL ES only. Some GPUs use deferred rendering, so not all of your drawing commands run immediately; they are put into a queue and run as needed. Don't access the default framebuffer from the CPU, because doing so flushes the queued drawing commands and forces your app to wait for all of these commands to finish.

Specify the fixed frame rate that you want to target. The smoothest animations run at a constant frame rate.
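
For example, a minimal sketch, assuming an initialized EGL display handle named egl_disp (a hypothetical name), that asks the windowing system to synchronize buffer swaps with the display refresh so that frames are presented at a steady rate:

#include <EGL/egl.h>

/* Hypothetical handle created during your EGL initialization. */
extern EGLDisplay egl_disp;

void enable_vsync(void)
{
    /* Swap at most once per display refresh; combined with a fixed
       timestep in your app loop, this keeps the frame rate constant. */
    eglSwapInterval(egl_disp, 1);
}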

Flush the OpenGL ES command queue sparingly. Generally, you should avoid flushing operations. Because rendering is deferred, not all drawing commands run immediately. The glFlush() function forces queued commands to be submitted to the GPU, and glFinish() blocks until every queued command has completed, which is a time-consuming operation. Similarly, if you query OpenGL ES state using glGet*() or glGetError(), pending drawing commands may need to run so that the state variables hold correct values. Make these calls only at the start or end of frame rendering.
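
For example, a minimal sketch, assuming a hypothetical DEBUG_GL build flag and check_gl_errors() helper, that confines error queries to the end of a frame in debug builds:

#include <GLES2/gl2.h>
#include <stdio.h>

#ifdef DEBUG_GL
/* Hypothetical helper: call once per frame, after your drawing is
   submitted, so the implicit flush happens only in debug builds. */
static void check_gl_errors(const char *frame_tag)
{
    GLenum err;
    while ((err = glGetError()) != GL_NO_ERROR) {
        fprintf(stderr, "%s: GL error 0x%04x\n", frame_tag, err);
    }
}
#else
#define check_gl_errors(frame_tag) ((void)0)
#endif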

Always use double buffering. Attach at least two buffers to every app window to avoid flickering, tearing, and other artifacts. With a single-buffered window, visual faults occur while rendering because the app renders to the same buffer that the compositor is displaying. The windowing system supports single-buffered windows, but visual faults are likely to occur.

Double buffering also allows you to prepare your next frame while the previous frame is being displayed. Your app renders to the back buffer while the compositor displays the front buffer. Double buffering can also help avoid resource conflicts when your app and OpenGL ES access the same object.

Use OpenGL ES objects when possible. OpenGL ES objects let you store data persistently. If you store your data in OpenGL ES objects, OpenGL ES can reduce the overhead of transforming the data and sending it to the GPU. If that data is used multiple times, this can significantly improve your app's performance.
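
For example, a vertex buffer object is one such object. A minimal sketch, with illustrative triangle data, that uploads vertex data once so the GPU can reuse it on every frame:

#include <GLES2/gl2.h>

static const GLfloat triangle_vertices[] = {
     0.0f,  0.5f, 0.0f,
    -0.5f, -0.5f, 0.0f,
     0.5f, -0.5f, 0.0f,
};

GLuint create_triangle_vbo(void)
{
    GLuint vbo;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    /* GL_STATIC_DRAW hints that the data is uploaded once and drawn
       many times, so OpenGL ES can keep it in GPU-friendly memory. */
    glBufferData(GL_ARRAY_BUFFER, sizeof(triangle_vertices),
                 triangle_vertices, GL_STATIC_DRAW);
    return vbo;
}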

Parallel processing

Parallel processing (or parallelism) is the concept of multiple processes or threads running at the same time, either on a multicore processor or interleaved on a single processor. Parallelism increases app throughput and improves responsiveness to input and output. To achieve parallelism within a single app, you can partition work into threads and run those threads in parallel on the available processors. OpenGL ES apps inherently use parallelism between the CPU and the GPU, to varying degrees depending on the structure of the code. You can also run multiple processes that communicate through interprocess communication techniques.

Because BlackBerry 10 is POSIX-compliant, the BlackBerry 10 Native SDK supports pthreads. BlackBerry 10 also supports QThreads or Boost threads.

Each OpenGL ES rendering context targets a single thread of execution. If a thread has more than one EGLContext, the results are undefined.
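
For example, a minimal sketch, assuming hypothetical egl_disp, egl_surf, and egl_ctx handles created during initialization, that shows a rendering thread making its own context current before issuing OpenGL ES calls:

#include <EGL/egl.h>

/* Hypothetical handles created during EGL initialization. */
extern EGLDisplay egl_disp;
extern EGLSurface egl_surf;
extern EGLContext egl_ctx;

void *render_thread(void *arg)
{
    /* Bind the context to the calling thread; it must not be current
       on any other thread at the same time. */
    eglMakeCurrent(egl_disp, egl_surf, egl_surf, egl_ctx);

    /* ... issue OpenGL ES calls and eglSwapBuffers() here ... */

    /* Release the context before the thread exits. */
    eglMakeCurrent(egl_disp, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
    return 0;
}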

Best practices

Determine whether you can benefit from parallelism. Turning your app into a multithreaded app can be a lot of work, so focus on CPU-GPU parallelism first, and then on threading or child processes. To decide whether your app should use threading or multiple processes, check whether it performs many tasks that are independent of OpenGL ES rendering, such as artificial intelligence calculations, sound processing, or game world simulation. For example, if an app profiler shows that the GPU is idle more often than the CPU, you can try to split your CPU tasks across threads to achieve higher throughput.

Implement parallelism in your OpenGL ES app. To implement parallelism, keep the GPU supplied with work constantly. Focus on achieving CPU parallelism so that you can offload more work to the GPU. If you want to implement parallelism in your app, consider the following approaches (a minimal threading sketch follows the list):

  • Separate work that is related to OpenGL ES from work that is not related to OpenGL ES.
  • If your app contains CPU-bound work, separate this work into multiple threads and make sure that all OpenGL ES drawing happens on a single thread.
  • If you want to render multiple scenes, separate each scene into its own thread, making sure that each thread has its own context.
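
A minimal threading sketch, using pthreads with hypothetical simulate_world() and render_frame() placeholders, that keeps CPU-bound work on a worker thread while all OpenGL ES drawing stays on one thread:

#include <pthread.h>
#include <stdbool.h>

static volatile bool app_running = true;

/* Hypothetical placeholder: CPU-bound work with no OpenGL ES calls. */
static void simulate_world(void) { /* AI, physics, audio mixing, ... */ }

/* Hypothetical placeholder: all OpenGL ES drawing for one frame. */
static void render_frame(void) { /* glClear(), draw calls, eglSwapBuffers() */ }

static void *simulation_thread(void *arg)
{
    (void)arg;
    while (app_running) {
        simulate_world();   /* runs in parallel with rendering */
    }
    return 0;
}

int main(void)
{
    pthread_t sim;
    pthread_create(&sim, 0, simulation_thread, 0);

    while (app_running) {
        render_frame();     /* OpenGL ES stays on this one thread */
    }

    pthread_join(sim, 0);
    return 0;
}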

Performance

Performance varies depending on the hardware that is available to you. When you create an OpenGL ES app, you are targeting an embedded software device, which might have less memory and battery power than a computer.

Best practices

Profile your app. Profiling your app allows you to determine where your app's performance is limited by specific resources or components. The Momentics IDE for BlackBerry provides tools to help you profile your apps. For more information, see Analyze allocation patterns.

Use the smallest amount of memory possible.

  • Always free memory when you're done with it. For example, after linking your shaders to a program object, delete the shaders and free the vertex data.
  • If you don't need all your resources at one time, separate them into subsets. For example, if your app has levels, separate the resources that you need for each level and load the resources when you need them.
  • Make sure that your data types use only the range they need. If you expect data values to range from 0 to 255, use an unsigned char instead of an int (see the sketch after this list).
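
For example, a minimal sketch, assuming a hypothetical a_color attribute location, that stores per-vertex colors as unsigned bytes and lets OpenGL ES normalize them to the 0.0 to 1.0 range:

#include <GLES2/gl2.h>

/* Four RGBA colors stored as 4 bytes each instead of 16 bytes of floats. */
static const GLubyte vertex_colors[4][4] = {
    { 255,   0,   0, 255 },
    {   0, 255,   0, 255 },
    {   0,   0, 255, 255 },
    { 255, 255, 255, 255 },
};

void set_color_attribute(GLuint a_color)
{
    /* GL_TRUE asks OpenGL ES to normalize 0-255 values to 0.0-1.0. */
    glVertexAttribPointer(a_color, 4, GL_UNSIGNED_BYTE, GL_TRUE, 0, vertex_colors);
    glEnableVertexAttribArray(a_color);
}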

Use simple lighting models. Lighting requires extra calculations, so use it only when it is necessary. You can also calculate your lighting colors early in your app, and then store them in a texture to sample.

Minimize the number of state settings and draw calls. Setting OpenGL ES state values repeatedly between drawing calls reduces the performance of your app. Avoid these redundant calls by keeping a copy of the current state settings (a minimal state-caching sketch follows the list below). Every time you call OpenGL ES drawing commands, the CPU prepares them for processing on the GPU; you can reduce this CPU work by batching your draw calls. For example, to draw a simple 2-D square, use a triangle strip, which uses fewer primitive components than two separate triangles.

  • Set the state at most once between drawing calls.
  • Set the state that affects only the next drawing call.
  • Set the state only when it changes.
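
For example, a minimal state-caching sketch, using a hypothetical cached_texture variable and bind_texture_cached() helper, that skips a redundant texture bind:

#include <GLES2/gl2.h>

/* Hypothetical cache of the most recently bound 2-D texture. */
static GLuint cached_texture = 0;

void bind_texture_cached(GLuint texture)
{
    /* Set the state only when it changes. */
    if (texture != cached_texture) {
        glBindTexture(GL_TEXTURE_2D, texture);
        cached_texture = texture;
    }
}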

Shaders

You use shaders to determine the appropriate levels of light and dark in your image. You use the OpenGL ES Shading Language to define your shaders and use them to specify rendering effects in your app. The OpenGL ES 1.1 API uses a fixed-function pipeline, which means you can use only the pixel-shading and geometric transformations that are available. The OpenGL ES 2.0 and 3.0 APIs use a programmable pipeline, which means that you have more control over what is rendered. This section outlines some best practices for optimizing your shaders.

Best practices

Pick the appropriate precision. The PowerVR SGX540 platform supports multiple types of precision, and picking the right balance is important. Choosing lower precision increases performance, but it can also introduce artifacts. Generally, you should start with high precision and gradually reduce the precision level until artifacts appear.

High precision is represented by 32-bit floating-point values. Use this precision for all vertex position calculations, including world, view, and projection matrices. You can also use it for most texture coordinate, lighting, and scalar calculations.

Medium precision is represented by 16-bit floating-point values. This precision typically offers only a minor performance improvement over high precision, but it can reduce the storage space required for varying variables, such as texture coordinates.

Low precision is represented by 10-bit fixed-point values, ranging from -2 to 2 with a precision of 1/256. This precision is useful for representing colors and reading data from low-precision textures.
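
For example, a minimal fragment shader sketch, with illustrative uniform and varying names, that mixes precision levels along these lines:

uniform sampler2D diffuseMap;
varying mediump vec2 vTexCoord;  // medium precision is usually enough for texture coordinates
varying lowp vec4 vColor;        // low precision suits colors in the 0.0 to 1.0 range

void main()
{
    lowp vec4 texel = texture2D(diffuseMap, vTexCoord);
    gl_FragColor = texel * vColor;
}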

Reduce the number of varying variables you use. Varying variables represent the outputs from the vertex shader. They are interpolated across a triangle and then fed into the fragment shader. Try to use as few varying variables as possible, because each one uses buffer space for parameters and processing cycles for interpolation. You can reduce the space and memory required to store a whole scene in a parameter buffer by using a lower precision. The PowerVR SGX540 platform supports up to eight varying variables between the vertex and fragment shaders.
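
For example, a minimal vertex shader sketch, with illustrative attribute names, that packs two sets of texture coordinates into a single varying instead of two:

uniform highp mat4 mvp;
attribute vec4 position;
attribute vec2 baseTexCoord;
attribute vec2 detailTexCoord;
varying mediump vec4 vTexCoords;  // xy = base coordinates, zw = detail coordinates

void main()
{
    vTexCoords = vec4(baseTexCoord, detailTexCoord);
    gl_Position = mvp * position;
}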

Avoid redundant uniform updates. Uniform variables represent values that are constant for all vertices or fragments, and they can be updated only between drawing calls. Try to avoid redundant uniform updates between drawing calls, because uniform updates can be a time-consuming operation. You should also be careful about how many uniform variables you use, because each uniform variable requires additional memory and time to update.

When you perform uniform calculations in a shader, group uniform-with-uniform operations so that they are evaluated first:

uniform highp mat4 modelview, projection;
attribute vec4 modelPosition;
void main() {
    // The uniform-only product (projection * modelview) is evaluated first.
    gl_Position = (projection * modelview) * modelPosition;
}

Isolate vector and scalar calculations. Not all GPUs include a vector processor; some perform vector calculations on a scalar processor. Depending on the order of operations, an expression can be evaluated with more multiplications than necessary on a scalar processor, whereas on a vector processor the components of a vector multiplication are processed in parallel. Generally, you want to group similar calculations together. The following code sample keeps the scalar calculations isolated as long as possible:

highp vec4 v1, v2;
highp float x, y;
// The scalars x and y are multiplied first, leaving one vector-by-scalar multiply.
v2 = v1 * (x * y);

Compile and link shaders at the start of your app. Compiling and linking shaders to a program object can be time-consuming operations, so perform them when your app starts instead of during rendering.

After you compile and link your shaders to a program object, delete the shader objects so that you have more memory to work with. When you compile your shaders, always check for compile errors; otherwise, errors can go unnoticed. You should also check the link status to confirm that linking succeeded.
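
For example, a minimal sketch, using a hypothetical load_shader() helper, that checks the compile status and prints the info log; you can check the link status in the same way with glGetProgramiv() and GL_LINK_STATUS:

#include <GLES2/gl2.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: compile one shader and report any errors. */
GLuint load_shader(GLenum type, const char *source)
{
    GLuint shader = glCreateShader(type);
    GLint compiled = 0;

    glShaderSource(shader, 1, &source, NULL);
    glCompileShader(shader);

    glGetShaderiv(shader, GL_COMPILE_STATUS, &compiled);
    if (!compiled) {
        GLint log_length = 0;
        glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &log_length);
        if (log_length > 1) {
            char *log = malloc(log_length);
            glGetShaderInfoLog(shader, log_length, NULL, log);
            fprintf(stderr, "Shader compile error: %s\n", log);
            free(log);
        }
        glDeleteShader(shader);
        return 0;
    }
    return shader;
}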

Textures

A texture is an OpenGL ES object that contains one or more images that have the same format. This section outlines some best practices you should follow when you create textures.

Best practices

Use an appropriate texture size. A common misconception about textures is that bigger textures always look better on the screen. Using the maximum texture size for an object that covers only part of the screen uses memory unnecessarily. Pick your texture sizes by examining where the textures are used. Generally, you want to map one texel to every pixel that covers the object at the distance closest to the viewpoint. Try to reduce your textures uniformly. The PowerVR SGX540 platform supports non-power-of-two textures to the extent required by the specification, but non-power-of-two textures don't support mipmapping.

Load a texture during initialization. Loading a texture can be a time-consuming operation because the PowerVR SGX540 platform uses a layout that follows a plane-filling curve to improve memory locality when texturing. You should load a texture when the app or a level starts. You should avoid loading texture data mid-frame. Also, you should set your texture parameters before you load it with image data because OpenGL ES can optimize your texture data based on the parameters that you set.

Compress your textures. Texture compression conserves memory, increases performance, and allows for mipmapping. The PowerVR SGX540 platform supports PVRTC and ETC texture compression formats. For more information, see Texture compression.

Use mipmaps. Mipmaps are small predefined variants of a texture image. Each mipmap represents a different level of detail for a texture. The GPU can use mipmaps with a minification filter to calculate the level of detail that is closest to mapping one texel of a mipmap to one pixel in the render target.
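
For example, a minimal sketch that builds a mipmap chain for the currently bound texture and selects a trilinear minification filter; it assumes an uncompressed, power-of-two level-0 image has already been uploaded (compressed formats require you to upload each mipmap level yourself):

#include <GLES2/gl2.h>

void enable_mipmapping(void)
{
    /* Generate all mipmap levels from the level-0 image of the bound texture. */
    glGenerateMipmap(GL_TEXTURE_2D);

    /* Trilinear filtering: blend the two closest mipmap levels. */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
                    GL_LINEAR_MIPMAP_LINEAR);
}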

Texture compression

To optimize your OpenGL ES app, you can compress the textures that you use. Texture compression is a specialized form of image compression that's designed for storing texture maps. Support for texture compression formats varies by GPU and you can see a list of BlackBerry 10 devices and their corresponding GPUs in Capabilities by graphics platform. The following sections describe the benefits of compressing textures, the texture compression formats that are available on BlackBerry 10 devices, and the texture compression pipeline.

Benefits of texture compression

Decreases memory size and bandwidth requirements
Compressed texture data uses less memory than uncompressed texture data. Compressing textures greatly reduces the memory and bus bandwidth needed to read textures.
Increases performance
Because compressed textures have reduced bandwidth requirements, there is a performance improvement when transferring data using the Accelerated Graphics Port and from the default framebuffer.
Supports larger textures and more textures
Larger textures typically result in more surface detail and a smoother look. Because texture sizes are smaller, you can store more textures as well.
Allows for mipmapping
The extra memory you gain from texture compression allows you to use mipmaps, which help reduce aliasing artifacts on textured surfaces. Mipmaps require extra memory, but texture compression uses the finite system memory more efficiently.

Texture compression formats

If you use glGetString() to query the extensions that BlackBerry 10 supports, you can get a list of the texture compression formats that are available.
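
For example, a minimal sketch that checks the extension string for ETC1 support (a current OpenGL ES context is required):

#include <GLES2/gl2.h>
#include <string.h>

int etc1_extension_available(void)
{
    const char *extensions = (const char *) glGetString(GL_EXTENSIONS);
    return extensions != NULL &&
           strstr(extensions, "GL_OES_compressed_ETC1_RGB8_texture") != NULL;
}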

3Dc texture compression

BlackBerry 10 devices that use the Qualcomm Adreno GPU support the 3Dc texture compression format. The OpenGL ES extension is GL_AMD_compressed_3DC_texture. If this extension exists, you can use 3Dc compression. The following formats are supported:

  • 3DC_X_AMD
  • 3DC_XY_AMD

To compress your textures, you can use the Compressonator tool.

ATITC texture compression

BlackBerry 10 devices that use the Qualcomm Adreno GPU support the ATITC (ATC) texture compression format. The OpenGL ES extension is GL_AMD_compressed_ATC_texture. If this extension exists, you can use ATC compression. The following formats are supported:

  • ATC_RGB_AMD
  • ATC_RGBA_EXPLICIT_ALPHA_AMD
  • ATC_RGBA_INTERPOLATED_ALPHA_AMD

To compress your textures, you can use the Compressonator tool.

Ericsson texture compression

BlackBerry 10 devices that use the Qualcomm Adreno or PowerVR GPU support the Ericsson texture compression (ETC1) format. The OpenGL ES extension is GL_OES_compressed_ETC1_RGB8_texture. ETC1 doesn't support an alpha channel, so you can use it for fully opaque textures only. To compress your textures, you can use the etcpack tool.

PowerVR texture compression

BlackBerry 10 devices that use the PowerVR GPU support the PowerVR texture compression (PVRTC) format. The OpenGL ES extension is GL_IMG_texture_compression_pvrtc. If this extension exists, you can use the PVRTC format. The extension provides additional functionality that's specific to the PVRTC format, but it supports precompressed images only. The following formats are supported:

  • GL_COMPRESSED_RGB_PVRTC_4BPPV1_IMG
  • GL_COMPRESSED_RGB_PVRTC_2BPPV1_IMG
  • GL_COMPRESSED_RGBA_PVRTC_4BPPV1_IMG
  • GL_COMPRESSED_RGBA_PVRTC_2BPPV1_IMG

To precompress images, you can use the Imagination Technologies PVRTexTool.

Texture compression pipeline

This section discusses the texture compression pipeline and how to use texture compression in your app. You can check which formats are available, and then compress the textures with a tool. This section outlines how to use the ETC1 compression format, but you can apply this approach to other formats as well.

Best practices

Check which formats are available. You can check which formats are available by retrieving the number of formats that are supported and then retrieving all of the available formats.

GLint num_formats;
GLenum *compress_formats = NULL;
glGetIntegerv(GL_NUM_COMPRESSED_TEXTURE_FORMATS, &num_formats);

compress_formats = malloc(num_formats * sizeof(GLenum));
glGetIntegerv(GL_COMPRESSED_TEXTURE_FORMATS,
    (GLint *) compress_formats);

Now you have a list of symbolic constants for the texture compression formats that are available. You can loop through the list and check for the format that you're looking for, such as the ETC1 format (GL_ETC1_RGB8_OES):

int i;
for (i = 0; i < num_formats; i++) {
    if (compress_formats[i] == GL_ETC1_RGB8_OES) {
        /* ETC1 compressed textures are supported */
        break;
    }
}

Compress your image. You can use a tool such as etcpack to compress textures, but the image must be in PPM format. The etcpack tool converts .ppm image files to .ktx or .pkm files, which are compressed using the ETC1 format. To convert other image files to .ppm files, you can use a tool such as XnConvert.

To use XnConvert, you navigate to the folder that contains nconvert.exe, and then run nconvert.exe. Here is an example of using nconvert.exe to convert a .png file to a .ppm file:

nconvert -out ppm example.png

To use the etcpack tool, you navigate to the folder that contains etcpack.exe, and then run etcpack.exe. Here is an example of converting a .ppm file to a .ktx file using the ETC1 compression format:

etcpack example.ppm example.ktx

Use the compressed texture. First, you need to enable some OpenGL ES features to use textures and texture compression. You start by turning on the ability to use 2-D textures by calling glEnable(GL_TEXTURE_2D). Generally, you can call this function once at initialization, because drawing without creating a texture first is still possible. You enable blending by calling glEnable(GL_BLEND), which lets you combine images in different ways, and then you specify the type of blending to use. For example, if you set the source factor to GL_ZERO, each source color channel is multiplied by 0.0, so the source makes no direct contribution. The GL_SRC_COLOR destination factor multiplies the destination color by the color from the source image. For more information about these constants, see the glBlendFunc() manual page on the OpenGL ES website.

glEnable(GL_TEXTURE_2D); 
glEnable(GL_BLEND);
glBlendFunc(GL_ZERO, GL_SRC_COLOR);

Next, you create the texture by declaring a GLuint array and specifying the number of textures you want. Then, you generate the textures using glGenTextures().

GLuint textures[1];
glGenTextures(1, textures);

You also need to bind the texture before you add the image data, so you call glBindTexture(). Binding a texture makes that texture active; only one texture can be bound to a target at a time. You pass the GL_TEXTURE_2D argument to indicate that you're using a 2-D image for your texture.

glBindTexture(GL_TEXTURE_2D, textures[0]);

You set any texture parameters you need. For example, here you specify a 2-D texture and a magnification filter constant to use when a pixel maps to an area less than or equal to one texture element. You also specify GL_LINEAR to indicate that you want to use a weighted average of the four surrounding texture elements.

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

Now you load the image data. If you use image data that is uncompressed, you could use the following code:

glTexImage2D(GL_TEXTURE_2D, 0, format, tex_width, tex_height, 0, 
    format, GL_UNSIGNED_BYTE, image_data);

Or, you can load a compressed ETC texture:

glCompressedTexImage2D(GL_TEXTURE_2D, 0, GL_ETC1_RGB8_OES,
    tex_width, tex_height, 0, image_data_size, image_data);

Vertices

Generally, your app configures the graphics pipeline and submits the primitive elements that you want to draw. Regardless of which primitive elements you use or how you configure your pipeline, your app provides vertex data to OpenGL ES. A vertex consists of one or more attributes, such as the position, color, or texture coordinates. OpenGL ES 2.0 and 3.0 let you define custom vertex attributes, but OpenGL ES 1.1 uses attributes that are defined in the fixed-function pipeline.

Best practices

Use the most efficient triangle primitive element. OpenGL ES supports three types of triangle-based primitive elements: triangle lists (separate triangles in a list), triangle strips, and triangle fans. All three types can be indexed or non-indexed. Indexed triangle lists are as flexible as triangle strips, and on the PowerVR SGX540 platform they are the most efficient.

Use interleaved vertex data. There are several ways to store vertex data. You can interleave it so that all of the data for one vertex follows all of the data for the previous vertex. Alternatively, you can keep attributes in separate arrays or all in one array. In general, interleaved vertex data gives better performance because all data that is required to process each vertex can be retrieved in one sequential read, which improves cache efficiency. If you have a vertex attribute array that you want to share across several meshes, putting this attribute in its own sequential array results in better performance.
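
For example, a minimal sketch, assuming hypothetical a_position and a_texcoord attribute locations, that interleaves positions and texture coordinates in one structure and sets the stride accordingly:

#include <GLES2/gl2.h>

/* One interleaved record per vertex: position followed by texture coordinates. */
typedef struct {
    GLfloat position[3];
    GLfloat texcoord[2];
} Vertex;

void set_interleaved_pointers(GLuint a_position, GLuint a_texcoord,
                              const Vertex *vertices)
{
    /* The stride is the size of one whole record, so all the data for a
       vertex is read in a single sequential pass. */
    glVertexAttribPointer(a_position, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex),
                          &vertices[0].position);
    glVertexAttribPointer(a_texcoord, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex),
                          &vertices[0].texcoord);
    glEnableVertexAttribArray(a_position);
    glEnableVertexAttribArray(a_texcoord);
}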

Use vertex buffer objects to store data. Use vertex buffer objects (VBOs) to store vertex and index data so that OpenGL ES can perform optimizations on the data. Don't create a VBO for every mesh. Consider grouping meshes that are rendered together to minimize buffer rebinding. For dynamic vertex data, define one buffer object for each update.

Simplify your vertex models. Because mobile devices have smaller screens than computers, images that you display on the screen are often small. You don't need complex vertex models to render the compelling graphics that you want. Here are the general guidelines you should follow:

  • Reduce the number of vertices that you use for your model.
  • Use multiple versions of your model at different levels of detail. If the model is far from the view point (it is smaller on the screen), a less detailed model is ideal because the additional detail isn't noticeable.
  • Use textures if possible.

Cascades and OpenGL ES

Here are some specific recommendations that you should use to create efficient OpenGL ES apps in Cascades.

Best practices

Create efficient threads. Run all OpenGL ES components in one run loop to minimize the number of threads that you use.

Use one OpenGL ES context for all components or OpenGL ES objects. Minimize the number of times that you switch EGL contexts. An EGL context can be current on only one thread at a time.

Suspend the OpenGL ES thread whenever possible.

Use overlays for your OpenGL ES input controls. Overlays are windows that are placed above your main Window control. Use overlays for your transparent and transient 3-D controls. However, using transparent overlays adds a blending cost during rendering, which can decrease performance. Because your overlay control is placed above your main Cascades window, it receives input events (specifically, BlackBerry Platform Services and Screen events) directly. When input events trigger an animation or effect, update only the state of the affected component in the event handler, and apply the change in the next frame that is rendered.

Use underlays for your OpenGL ES scene. Underlays are windows that are placed beneath Cascades windows and are displayed in a ForeignWindowControl. They are typically opaque and are ideal for complex 3-D visualizations and visual effects that take up a big part of the screen. Your ForeignWindowControl components receive input events directly, which you can pass down to your OpenGL ES view objects through the Cascades signals and slots.

Use signals and slots for handling underlay events.

  • Make sure that your OpenGL ES object is a Q_OBJECT.
  • Define properties for manipulating OpenGL ES state and data.
  • Define slots for extra functionality received from input events.
  • Use qmlRegisterType() to register a custom C++ class as a QML type for use in QML.
  • To pass an array of data, create a JavaScript array in QML code and pass it to any property that uses a QVariantList. To pass key-value data, use JavaScript objects, which map to QVariantMap objects. By default, QML recognizes the following data types: bool, unsigned int, int, float, double, qreal, QString, QUrl, QColor, QDate, QTime, QDateTime, QPoint, QPointF, QSize, QSizeF, QRect, QRectF, QVariant, QVariantList, QObject, and enumerations that you declare with Q_ENUMS().

Align underlays and foreign windows. You can connect a foreign window to an OpenGL ES window (Screen API and EGL), but you must explicitly attach the two. If you don't attach them, always track the foreign window position and align the OpenGL ES window to it to avoid artifacts. There are two main approaches to aligning underlays and foreign windows:

  • Poll the foreign window dimensions using update().
  • Connect a signal that is emitted when the foreign window property changes.

Last modified: 2015-04-24


