Texture compression in Papy Stampy

Texture compression? What for?

I recently observed that PapyStampy was fill-rate limited on some Android devices (e.g. the Asus Transformer). I therefore decided to implement texture compression in order to improve performance.
On Android, several compression formats are available, depending on the device:

  • Ericsson Texture Compression (ETC)
  • ATI texture compression (ATITC)
  • PowerVR Texture Compression (PVRTC)
  • S3/DX Texture Compression (S3TC/DXTC)

I decided early on to skip ETC texture compression, as it does not support alpha channels. I also decided to encapsulate the compressed textures in DDS files. The DDS file format has several advantages:

  • Unlike raw textures, a DDS file has a header that gives the texture resolution and several other properties.
  • A single file can hold several mipmap levels.
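
To illustrate the first point, here is a minimal Python sketch that reads the resolution, mipmap count and fourCC from the start of a DDS file. It assumes the standard layout (a 4-byte "DDS " magic followed by a 124-byte header); the function name read_dds_header is only illustrative.

```python
import struct

def read_dds_header(data):
    """Return (width, height, mip_count, fourcc) from the first 128 bytes of a DDS file."""
    if len(data) < 128 or data[:4] != b"DDS ":
        raise ValueError("not a DDS file")
    # Offsets are relative to the start of the file:
    # 4-byte magic, then the 124-byte header.
    height, width = struct.unpack_from("<II", data, 12)
    (mip_count,) = struct.unpack_from("<I", data, 28)
    fourcc = data[84:88].decode("ascii", errors="replace")
    # Files without a mipmap chain may store 0; there is always one level.
    return width, height, max(1, mip_count), fourcc
```

Calling it on the first 128 bytes of a file (for example `read_dds_header(open(path, "rb").read(128))`) is enough to allocate the texture before reading any pixel data.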

 

How to compress the textures?

I used several command-line tools to compress the textures:

  • The Compressonator: an AMD tool that supports several texture compression formats
  • ImageMagick: a tool for various image editing operations
  • PVRTexTool: a tool that compresses textures to the PVR formats

I wrote the following Python script to automate texture compression in PapyStampy. It creates DDS files whose filename suffix indicates the compression used (e.g. mytexture.png.atitc.dds). When the application loads a texture, it looks for a compressed version of the same texture by appending the suffix to the texture filename.

import os
 
# Path where the source textures are located
g_sourceDirectory =   os.path.join(os.curdir , "assets", "levelBuilder", "textures")
g_destDirectory =   os.path.join(os.curdir , "test")
 
# List of supported extensions
g_supportedExtensions = ["png", "jpg"]
 
g_ImageMagikPath = "D:\\Dev\\tools\\ImageMagick-6.8.9-7\\convert.exe"
g_CompressonatorPath = "TheCompressonator.exe"
g_PvrTexToolPath = r"C:\Imagination\PowerVR\GraphicsSDK\PVRTexTool\CLI\Windows_x86_32\PVRTexToolCLI.exe"
 
g_codec_ATIC = "ATICompressor.dll"
g_codec_S3TC = "dxtc.dll"
 
g_fourCC_S3TC = "DXT3"
g_fourCC_ATIC = "ATCA" # "ATI2N"
g_fourCC_PVRTC= "PVRTC1_4"
 
def FlipImage(filepathSrc, filepathDst):
    cmd = "%s \"%s\" -flip -alpha on -channel RGBA PNG32:\"%s\"" %(g_ImageMagikPath, filepathSrc, filepathDst)
    #print cmd
    os.system(cmd)
 
def ConvertWithCompressonator(filepathSrc, filepathDst, fourCC, codec):
    cmd = "%s -convert \"%s\" \"%s\" -codec %s +fourCC %s" %(g_CompressonatorPath, filepathSrc, filepathDst, codec, fourCC)
    #print cmd
    os.system(cmd)
 
def ConvertWithPvrTexTool(filepathSrc, filepathDst, pvrFormat): #  -flip y
    # pvrFormat is the PVRTexTool format name (e.g. PVRTC1_4)
    cmd = "%s -f %s -i \"%s\" -q pvrtcbest  -o \"%s\"" %(g_PvrTexToolPath, pvrFormat, filepathSrc, filepathDst)
    #print cmd
    os.system(cmd)
 
def ProcessFileS3TC(filepath, filename, extension, flippedImagePath):
    # Flip the image
    if flippedImagePath is None:
        flippedImagePath = os.path.join(g_destDirectory , "tmp.%s" %extension)
        FlipImage(filepath, flippedImagePath)
    # Encode
    finalImage = os.path.join(g_destDirectory , "%s.s3tc.dds" %filename)
    ConvertWithCompressonator(flippedImagePath, finalImage, g_fourCC_S3TC, g_codec_S3TC)
 
def ProcessFileATITC(filepath, filename, extension, flippedImagePath):
    # Flip the image
    if flippedImagePath is None:
        flippedImagePath = os.path.join(g_destDirectory , "tmp.%s" %extension)
        FlipImage(filepath, flippedImagePath)
    # Encode
    finalImage = os.path.join(g_destDirectory , "%s.atitc.dds" %filename)
    ConvertWithCompressonator(flippedImagePath, finalImage, g_fourCC_ATIC, g_codec_ATIC)
 
def ProcessFilePVRTC(filepath, filename, extension):
    # Encode
    finalImage = os.path.join(g_destDirectory , "%s.pvrtc.dds" %filename)
    ConvertWithPvrTexTool(filepath, finalImage, g_fourCC_PVRTC)
 
def ProcessFiles(sourceDir, extension):
    # Make sure the destination directory exists before writing to it
    if not os.path.isdir(g_destDirectory):
        os.makedirs(g_destDirectory)
    # Process all the files of the directory
    for filename in os.listdir(sourceDir):
        if filename.lower().endswith(".%s" %extension):
            filepath = os.path.join(sourceDir, filename)
            # Compute the flipped image once, as both S3TC and ATITC reuse it
            flippedImagePath = os.path.join(g_destDirectory , "tmp.%s" %extension)
            FlipImage(filepath, flippedImagePath)
            # Process for S3TC
            ProcessFileS3TC(filepath, filename, extension, flippedImagePath)
            # Process for ATITC
            ProcessFileATITC(filepath, filename, extension, flippedImagePath)
            # Process PVR
            ProcessFilePVRTC(filepath, filename, extension)
 
ProcessFiles(g_sourceDirectory, "png")

This script was written for my configuration; you will need to adapt it to your OS and to the location of your binaries, source textures, etc.
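
The suffix lookup described earlier can be sketched as follows. This is only an illustration in Python of the logic, not the actual application code: the function name find_compressed_texture and the exists parameter are hypothetical, and the mapping assumes the usual Khronos extension names for each format.

```python
import os

# Suffix written by the build script for each GL extension that makes
# the corresponding format usable (assumed mapping, checked in order).
SUFFIX_BY_EXTENSION = [
    ("GL_EXT_texture_compression_s3tc", ".s3tc.dds"),
    ("GL_NV_texture_compression_s3tc", ".s3tc.dds"),
    ("GL_AMD_compressed_ATC_texture", ".atitc.dds"),
    ("GL_IMG_texture_compression_pvrtc", ".pvrtc.dds"),
]

def find_compressed_texture(texture_path, gl_extensions, exists=os.path.exists):
    """Return the path of a usable compressed variant, or None to keep the original."""
    supported = set(gl_extensions.split())
    for extension, suffix in SUFFIX_BY_EXTENSION:
        candidate = texture_path + suffix
        if extension in supported and exists(candidate):
            return candidate
    return None
```

Here gl_extensions stands for the space-separated string returned by glGetString(GL_EXTENSIONS). When the function returns None, the application falls back to the uncompressed texture.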

How to load compressed textures on Android?

As explained earlier, the compressed textures are encapsulated in DDS files. This link gives some details on the format.
The following sample code shows how a compressed texture can be loaded from the input stream of a DDS file.

 
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

import javax.microedition.khronos.opengles.GL10;

import android.util.Log;

public class TextureCompressionLoadingSample {

	// DDS fourCC: the four ASCII characters read as a little-endian int
	public static final int FOURCC_DXT1 = 0x31545844;
	public static final int FOURCC_DXT3 = 0x33545844;
	public static final int FOURCC_DXT5 = 0x35545844;
	public static final int FOURCC_ATI2 = 0x32495441;
	public static final int FOURCC_PTC4 = 0x34435450;
	public static final int FOURCC_ATCA = 0x41435441;

	// S3TC internal formats
	// http://www.khronos.org/registry/gles/extensions/NV/NV_texture_compression_s3tc.txt
	public static final int GL_COMPRESSED_RGB_S3TC_DXT1_EXT   = 0x83F0;
	public static final int GL_COMPRESSED_RGBA_S3TC_DXT1_EXT  = 0x83F1;
	public static final int GL_COMPRESSED_RGBA_S3TC_DXT3_EXT  = 0x83F2;
	public static final int GL_COMPRESSED_RGBA_S3TC_DXT5_EXT  = 0x83F3;

	// ATITC internal formats
	// http://www.khronos.org/registry/gles/extensions/AMD/AMD_compressed_ATC_texture.txt
	public static final int ATC_RGB_AMD                       = 0x8C92;
	public static final int ATC_RGBA_EXPLICIT_ALPHA_AMD       = 0x8C93;
	public static final int ATC_RGBA_INTERPOLATED_ALPHA_AMD   = 0x87EE;

	// PVRTC internal formats
	// http://www.khronos.org/registry/gles/extensions/IMG/IMG_texture_compression_pvrtc.txt
	public static final int COMPRESSED_RGB_PVRTC_4BPPV1_IMG   = 0x8C00;
	public static final int COMPRESSED_RGB_PVRTC_2BPPV1_IMG   = 0x8C01;
	public static final int COMPRESSED_RGBA_PVRTC_4BPPV1_IMG  = 0x8C02;
	public static final int COMPRESSED_RGBA_PVRTC_2BPPV1_IMG  = 0x8C03;

	// Filtering modes used by the loader
	public enum TextureFiltering
	{
		TextureFilteringNearest,
		TextureFilteringLinear,
		TextureFilteringMipMap
	}

	// Texture ID
	int m_textureID = -1;
	// Tmp buffer, sized for the 124-byte DDS header
	byte m_tmpBuffer[] = new byte[124];
	// Wrap modes and filtering belong to the full texture class; the values
	// below are placeholder defaults so that the sample is self-contained.
	int m_textureWrapS = GL10.GL_REPEAT;
	int m_textureWrapT = GL10.GL_REPEAT;
	TextureFiltering m_textureFiltering = TextureFiltering.TextureFilteringLinear;

	public boolean createCompressedTextureFromDDS(GL10 gl, InputStream is)
	{
		try
		{
			// Read the file type
			if (!readFully(is, m_tmpBuffer, 4))
			{
				Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: DDS file too short for file type");
				return false;
			}

			// Check file type
			if (m_tmpBuffer[0] != 'D'
				|| m_tmpBuffer[1] != 'D'
				|| m_tmpBuffer[2] != 'S'
				|| m_tmpBuffer[3] != ' ')
			{
				Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: Wrong DDS file type");
				return false;
			}

			// Read the surface desc
			if (!readFully(is, m_tmpBuffer, 124))
			{
				Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: DDS file too short for surface desc");
				return false;
			}
			ByteBuffer wrapped = ByteBuffer.wrap(m_tmpBuffer); // big-endian by default
			wrapped.order(ByteOrder.LITTLE_ENDIAN);
			int height      = wrapped.getInt(8);
			int width       = wrapped.getInt(12);
			// Files without a mipmap chain may store 0; there is always one level
			int mipMapCount = Math.max(1, wrapped.getInt(24));
			int fourCC      = wrapped.getInt(80);

			Log.d("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: DDS file width=" + width + " height=" + height + " fourCC=0x" + Integer.toHexString(fourCC));

			if (width <= 0 || height <= 0)
			{
				Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: Invalid texture size");
				return false;
			}

			// Guess the internal format thanks to the FourCC
			int format;
			switch(fourCC)
			{
			case FOURCC_DXT1:
				format = GL_COMPRESSED_RGBA_S3TC_DXT1_EXT;
				break;
			case FOURCC_DXT3:
				format = GL_COMPRESSED_RGBA_S3TC_DXT3_EXT;
				break;
			case FOURCC_DXT5:
				format = GL_COMPRESSED_RGBA_S3TC_DXT5_EXT;
				break;
			case FOURCC_ATCA:
				format = ATC_RGBA_EXPLICIT_ALPHA_AMD;
				break;
			case FOURCC_PTC4:
				format = COMPRESSED_RGBA_PVRTC_4BPPV1_IMG;
				break;
			default:
				Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: Unsupported FourCC type");
				return false;
			}

			// Create the new texture
			int[] tex_out = new int[1];
			gl.glGenTextures(1, tex_out, 0);

			m_textureID = tex_out[0];
			gl.glBindTexture(GL10.GL_TEXTURE_2D, m_textureID);

			gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_S, m_textureWrapS);
			gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_WRAP_T, m_textureWrapT);
			gl.glTexEnvf(GL10.GL_TEXTURE_ENV, GL10.GL_TEXTURE_ENV_MODE, GL10.GL_MODULATE);

			if (m_textureFiltering == TextureFiltering.TextureFilteringNearest)
			{
				gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST);
				gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_NEAREST);
			}
			else if (m_textureFiltering == TextureFiltering.TextureFilteringLinear)
			{
				gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_LINEAR);
				gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR);
			}
			else if (m_textureFiltering == TextureFiltering.TextureFilteringMipMap)
			{
				gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_LINEAR_MIPMAP_LINEAR); // GL_LINEAR_MIPMAP_NEAREST
				gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR);
			}

			// The block-size is 8 bytes for DXT1, BC1, and BC4 formats, and 16 bytes for
			// other block-compressed formats.
			// ATI1N (BC4/DXT5A), ATI2N (BC5/3Dc)
			// http://msdn.microsoft.com/en-us/library/windows/desktop/bb943991%28v=vs.85%29.aspx
			int blockSize = (format == GL_COMPRESSED_RGBA_S3TC_DXT1_EXT) ? 8 : 16;
			// PVRTC does not follow the DXT rule: its size formula comes from the
			// IMG_texture_compression_pvrtc extension (4 bits per pixel, 8x8 minimum).
			boolean isPvrtc = (format == COMPRESSED_RGBA_PVRTC_4BPPV1_IMG);

			/* load the mipmaps */
			for (int level = 0; level < mipMapCount; ++level)
			{
				// Compute the size of the mipmap level texture
				// http://msdn.microsoft.com/en-us/library/windows/desktop/bb943991%28v=vs.85%29.aspx
				int size = isPvrtc
					? (Math.max(width, 8) * Math.max(height, 8) * 4 + 7) / 8
					: ((width + 3) / 4) * ((height + 3) / 4) * blockSize;

				// Can be optimized by making a single allocation
				byte tmpBuffer2[] = new byte[size];
				if (!readFully(is, tmpBuffer2, size))
				{
					Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: DDS file too short for texture data");
					return false;
				}

				// Create the texture level
				gl.glCompressedTexImage2D(GL10.GL_TEXTURE_2D, level, format, width, height,
											0, size, ByteBuffer.wrap(tmpBuffer2));

				// Each level halves the previous one, but never goes below 1 pixel,
				// so that non-square textures keep all their levels
				width  = Math.max(1, width / 2);
				height = Math.max(1, height / 2);
			}

			Log.i("TEXTURECOMPRESSION", "TEXTURECOMPRESSION Success !!!");

			return true;
		}
		catch (Exception e)
		{
			Log.e("TEXTURECOMPRESSION", "TEXTURECOMPRESSION: Failed to parse DDS file", e);
			return false;
		}
		finally
		{
			// Close the stream on every exit path
			try
			{
				is.close();
			}
			catch (IOException e)
			{
				// Nothing more to do
			}
		}
	}

	// InputStream.read() may return fewer bytes than requested, so loop until
	// 'length' bytes have been read or the end of the stream is reached.
	private static boolean readFully(InputStream is, byte[] buffer, int length) throws IOException
	{
		int total = 0;
		while (total < length)
		{
			int count = is.read(buffer, total, length - total);
			if (count < 0)
				return false;
			total += count;
		}
		return true;
	}

}
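
The per-level size computation used for the 4x4 block formats (DXT and ATC) can be checked offline with a few lines of Python. This is a sketch under the block sizes quoted in the loader comments (8 bytes for DXT1, 16 bytes for the other formats used here); the function name mip_level_sizes is illustrative, and PVRTC is deliberately left out because it uses a different size rule.

```python
def mip_level_sizes(width, height, mip_count, block_bytes):
    """Byte size of each mip level for a 4x4 block-compressed format."""
    sizes = []
    for _ in range(max(1, mip_count)):
        # Each dimension is rounded up to a whole number of 4x4 blocks.
        blocks_wide = (width + 3) // 4
        blocks_high = (height + 3) // 4
        sizes.append(blocks_wide * blocks_high * block_bytes)
        # The next level is half the size, never smaller than 1 pixel.
        width = max(1, width // 2)
        height = max(1, height // 2)
    return sizes
```

Summing the returned list and adding the 128 header bytes gives the expected size of the DDS file, which is a quick way to validate the output of the compression tools.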

The internal formats for each compression scheme can be found on the Khronos website. The fourCC can be found simply by opening your DDS files in a hexadecimal editor.
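
As a complement to the hex editor, the integer constants used in the loader are simply the four ASCII characters of the fourCC read as a little-endian 32-bit integer. A small Python sketch (the helper names are illustrative) makes the conversion explicit:

```python
def fourcc_to_int(tag):
    """Convert a 4-character fourCC such as "DXT3" to its little-endian integer value."""
    if len(tag) != 4:
        raise ValueError("a fourCC has exactly 4 characters")
    return int.from_bytes(tag.encode("ascii"), "little")

def int_to_fourcc(value):
    """Convert the integer read from the DDS header back to its 4-character tag."""
    return value.to_bytes(4, "little").decode("ascii")
```

For example, `hex(fourcc_to_int("DXT3"))` gives the value declared as FOURCC_DXT3 in the sample.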

 

Impact on performance

I measured the performance impact at a location where the framerate used to drop, and obtained the following results.

Device        Quality   Texture format   Framerate (fps)
Transformer   High      RGB32            20.5
Transformer   Medium    DXT              29.8
Transformer   Low       RGB565           34
Xperia L      High      RGB32            40.5
Xperia L      Medium    ATITC            46
Xperia L      Low       RGB565           44
OUYA          High      RGB32            34
OUYA          Medium    DXT              40
OUYA          Low       RGB565           50

These results show that texture compression clearly improves performance on all three devices. On the Transformer and the OUYA, loading uncompressed textures in RGB565 gives the best framerate; on the Xperia L, ATI compression is slightly ahead of RGB565 (46 vs. 44).
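
To put numbers on these gains, the short Python sketch below computes the framerate improvement of each option relative to RGB32. The figures are the ones measured above; the grouping by device and format is my reading of the results, and the names results and gains_over_rgb32 are illustrative.

```python
# Framerates measured on each device, keyed by texture format.
results = {
    "Transformer": {"RGB32": 20.5, "DXT": 29.8, "RGB565": 34.0},
    "Xperia L": {"RGB32": 40.5, "ATITC": 46.0, "RGB565": 44.0},
    "OUYA": {"RGB32": 34.0, "DXT": 40.0, "RGB565": 50.0},
}

def gains_over_rgb32(fps_by_format):
    """Percentage framerate gain of each format relative to RGB32."""
    base = fps_by_format["RGB32"]
    return {
        fmt: round(100.0 * (fps - base) / base, 1)
        for fmt, fps in fps_by_format.items()
        if fmt != "RGB32"
    }

for device, fps_by_format in results.items():
    print(device, gains_over_rgb32(fps_by_format))
```

The gain is largest on the Transformer, the device that was the most fill-rate limited to begin with.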

Conclusion

In conclusion, texture compression improves performance but can degrade texture quality. The same is true of the alternative, which is to store textures in RGB565.
You should therefore choose carefully which textures to compress and which to keep in RGB32.