Commit 6293107

Fix sRGB alpha rounding, and linear decode_unorm8 rounding (#448)
This change ensures that the compressor correctly handles sRGB alpha component values, which decompress using decode_fp16 rules, and linear LDR values written to an 8-bit output, which decompress using decode_unorm8 rules.
1 parent cef5102 commit 6293107

21 files changed

Lines changed: 1076 additions & 676 deletions

Docs/ChangeLog-4x.md

Lines changed: 25 additions & 0 deletions
@@ -6,6 +6,31 @@ release of the 4.x series.
 All performance data on this page is measured on an Intel Core i5-9600K
 clocked at 4.2 GHz, running `astcenc` using AVX2 and 6 threads.
 
+<!-- ---------------------------------------------------------------------- -->
+## 4.7.0
+
+**Status:** TBD
+
+The 4.7.0 release is a maintenance release.
+
+* **General:**
+  * **Bug fix:** sRGB LDR decompression now uses correct `decode_fp16` decode
+    mode rounding rules for the alpha channel.
+  * **Bug fix:** Linear LDR decompression now uses correct `decode_unorm8`
+    decode mode rounding rules when writing to an 8-bit output image.
+  * **Feature:** Library configuration supports a new flag,
+    `ASTCENC_FLG_USE_DECODE_UNORM8`. This flag indicates that the image will be
+    used with the `decode_unorm8` decode mode. When set during compression
+    this allows the compressor to use the correct rounding when determining the
+    best encoding.
+  * **Feature:** Command line tool supports a new option, `-decode_unorm8`.
+    This option indicates that the image will be used with the `decode_unorm8`
+    decode mode. This option will automatically be set for decompression
+    (`-d*`) and trial (`-t*`) tool operation if the decompressed output image
+    is stored to an 8-bit per component file format. This option must be set
+    manually for compression (`-c*`) tool operation, as the desired decode mode
+    cannot be reliably determined.
+
 <!-- ---------------------------------------------------------------------- -->
 ## 4.6.1
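
The rule for when the command line tool assumes `decode_unorm8` can be sketched as a small predicate. This is only an illustration of the behaviour described in the changelog entry: the `ToolOp` type and the `use_decode_unorm8` function are hypothetical names, not code from the `astcenc` tool.

```cpp
#include <cassert>

// Hypothetical type for illustration; not part of the astcenc CLI source.
enum class ToolOp { Compress, Decompress, Trial };

// Decide whether decode_unorm8 rounding should be assumed.
// - An explicit -decode_unorm8 option always enables it.
// - Decompress (-d*) and trial (-t*) enable it automatically when the
//   decompressed image is stored to an 8-bit per component file format.
// - Compress (-c*) never infers it, so output_is_8bit is ignored there.
constexpr bool use_decode_unorm8(ToolOp op, bool output_is_8bit, bool user_set_option)
{
    return user_set_option || (op != ToolOp::Compress && output_is_8bit);
}

static_assert(use_decode_unorm8(ToolOp::Decompress, true, false), "auto-enabled for 8-bit output");
static_assert(!use_decode_unorm8(ToolOp::Compress, true, false), "compression needs the explicit option");
static_assert(use_decode_unorm8(ToolOp::Compress, false, true), "explicit option always wins");
```

The checks are evaluated at compile time, so the snippet documents the three cases without needing a test runner.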

Docs/Encoding.md

Lines changed: 24 additions & 1 deletion
@@ -133,6 +133,29 @@ signed endpoint mode.
 This section outlines some of the other things to consider when encoding
 textures using ASTC.
 
+## Decode mode extensions
+
+ASTC is specified to decompress into a 16-bit per component RGBA output by
+default, with the exception of the sRGB format which uses an 8-bit value for the
+RGB components.
+
+Decompressing into a 16-bit per component output format gives higher precision than
+many use cases require, especially for LDR textures which originally came from
+an 8-bit per component source image. Most implementations of ASTC support the
+decode mode extensions, which allow an application to opt in to a lower
+precision decompressed format (RGBA8 for LDR, RGB9E5 for HDR). Using these
+extensions can improve GPU texture cache efficiency, and even improve texture
+filtering throughput, for use cases that do not need the higher precision.
+
+The ASTC format uses different data rounding rules when the decode mode
+extensions are used. To ensure that the compressor chooses the best encodings
+for the RGBA8 rounding rules, you can specify `-decode_unorm8` when compressing
+textures that will be decompressed into the RGBA8 intermediate. This gives a
+small image quality boost.
+
+**Note:** This mode is automatically enabled if you use the `astcenc`
+decompressor to write an 8-bit per component output image.
+
 ## Encoding non-correlated components
 
 Most other texture compression formats have a static component assignment in
@@ -209,4 +232,4 @@ which will treat all components as HDR data.
 
 - - -
 
-_Copyright © 2019-2022, Arm Limited and contributors. All rights reserved._
+_Copyright © 2019-2024, Arm Limited and contributors. All rights reserved._
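
To see why the rounding rules matter, it can help to compare two ways of reducing the same 16-bit interpolated value to 8 bits. The sketch below rests on stated assumptions rather than on this commit's code: it assumes `decode_unorm8` keeps the top 8 bits of the 16-bit result, and that the higher-precision path treats the value as `C / 65536` (with 65535 mapping to 1.0) before it is rounded to the nearest 8-bit value on storage. FP16 quantization is ignored, and both function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed decode_unorm8 rule: keep the top 8 bits of the 16-bit result.
constexpr uint8_t to_u8_decode_unorm8(uint16_t c)
{
    return static_cast<uint8_t>(c >> 8);
}

// Assumed higher-precision path: C / 65536 (65535 maps to 1.0), scaled to
// the 0..255 range and rounded to nearest. FP16 quantization is ignored.
constexpr uint8_t to_u8_via_high_precision(uint16_t c)
{
    return (c == 0xFFFF)
        ? static_cast<uint8_t>(255)
        : static_cast<uint8_t>((c * 255u + 32768u) >> 16);
}

// Most values agree between the two paths ...
static_assert(to_u8_decode_unorm8(0x8080) == 128, "top 8 bits of 0x8080");
static_assert(to_u8_via_high_precision(0x8080) == 128, "rounded value of 0x8080");

// ... but some do not, which is why the compressor needs to know the mode.
static_assert(to_u8_decode_unorm8(0x00FF) == 0, "top 8 bits of 0x00FF");
static_assert(to_u8_via_high_precision(0x00FF) == 1, "rounded value of 0x00FF");
```

A compressor that scores candidate encodings with one rule while the decoder applies the other can pick an encoding that is one code value off, which is consistent with the small quality gain described above.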

README.md

Lines changed: 8 additions & 3 deletions
@@ -1,7 +1,7 @@
 # About
 
 The Arm® Adaptive Scalable Texture Compression (ASTC) Encoder, `astcenc`, is
-a command-line tool for compressing and decompressing images using the ASTC
+a command-line tool for compressing and decompressing images using the ASTC
 texture compression standard.
 
 ## The ASTC format
@@ -33,7 +33,7 @@ dynamic range (BMP, PNG, TGA), high dynamic range (EXR, HDR), or DDS and KTX
 wrapped output images.
 
 The encoder allows control over the compression time/quality tradeoff with
-`exhaustive`, `verythorough`, `thorough`, `medium`, `fast`, and `fastest`
+`exhaustive`, `verythorough`, `thorough`, `medium`, `fast`, and `fastest`
 encoding quality presets.
 
 The encoder allows compression time and quality analysis by reporting the
@@ -145,6 +145,11 @@ The modes available are:
 * `-ch` : use the HDR color profile, tuned for HDR RGB and LDR A.
 * `-cH` : use the HDR color profile, tuned for HDR RGBA.
 
+If you intend to use the resulting image with the decode mode extensions to
+limit the decompressed precision to UNORM8, it is recommended that you also
+specify the `-decode_unorm8` flag. This will ensure that the compressor uses
+the correct rounding rules when choosing encodings.
+
 ## Decompressing an image
 
 Decompress an image using the `-dl` \ `-ds` \ `-dh` \ `-dH` modes. For example:
@@ -231,7 +236,7 @@ or general mobile graphics development or technology please submit them on the
 
 - - -
 
-_Copyright © 2013-2023, Arm Limited and contributors. All rights reserved._
+_Copyright © 2013-2024, Arm Limited and contributors. All rights reserved._
 
 [1]: ./Docs/FormatOverview.md
 [2]: https://www.khronos.org/registry/DataFormat/specs/1.3/dataformat.1.3.html#ASTC

Source/UnitTest/cmake_core.cmake

Lines changed: 13 additions & 1 deletion
@@ -15,21 +15,33 @@
 # under the License.
 # ----------------------------------------------------------------------------
 
-
 set(ASTCENC_TEST test-unit-${ASTCENC_ISA_SIMD})
 
 add_executable(${ASTCENC_TEST})
 
+# Enable LTO under the conditions where the codec library will use LTO.
+# The library link will fail if the settings don't match
+if(${ASTCENC_CLI})
+    set_property(TARGET ${ASTCENC_TEST}
+        PROPERTY
+            INTERPROCEDURAL_OPTIMIZATION_RELEASE True)
+endif()
+
 target_sources(${ASTCENC_TEST}
     PRIVATE
         test_simd.cpp
         test_softfloat.cpp
+        test_decode.cpp
         ../astcenc_mathlib_softfloat.cpp)
 
 target_include_directories(${ASTCENC_TEST}
     PRIVATE
         ${gtest_SOURCE_DIR}/include)
 
+target_link_libraries(${ASTCENC_TEST}
+    PRIVATE
+        astcenc-${ASTCENC_ISA_SIMD}-static)
+
 target_compile_options(${ASTCENC_TEST}
     PRIVATE
         # Use pthreads on Linux/macOS

Source/UnitTest/test_decode.cpp

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: Apache-2.0
+// ----------------------------------------------------------------------------
+// Copyright 2023 Arm Limited
+//
+// Licensed under the Apache License, Version 2.0 (the "License"); you may not
+// use this file except in compliance with the License. You may obtain a copy
+// of the License at:
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+// WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+// License for the specific language governing permissions and limitations
+// under the License.
+// ----------------------------------------------------------------------------
+
+/**
+ * @brief Unit tests for the vectorized SIMD functionality.
+ */
+
+#include <limits>
+
+#include "gtest/gtest.h"
+
+#include "../astcenc.h"
+
+namespace astcenc
+{
+
+/** @brief Test harness for exploring issue #447. */
+TEST(decode, decode12x12)
+{
+    astcenc_error status;
+    astcenc_config config;
+    astcenc_context* context;
+
+    static const astcenc_swizzle swizzle {
+        ASTCENC_SWZ_R, ASTCENC_SWZ_G, ASTCENC_SWZ_B, ASTCENC_SWZ_A
+    };
+
+    uint8_t data[16] {
+#if 0
+        0x84,0x00,0x38,0xC8,0x00,0x00,0x00,0x00,
+        0x00,0x00,0x00,0x00,0x00,0xB3,0x4D,0x78
+#else
+        0x29,0x00,0x1A,0x97,0x01,0x00,0x00,0x00,
+        0x00,0x00,0x00,0x00,0x00,0xCF,0x97,0x86
+#endif
+    };
+
+    uint8_t output[12*12*4];
+    astcenc_config_init(ASTCENC_PRF_LDR, 12, 12, 1, ASTCENC_PRE_MEDIUM, 0, &config);
+
+    status = astcenc_context_alloc(&config, 1, &context);
+    EXPECT_EQ(status, ASTCENC_SUCCESS);
+
+    astcenc_image image;
+    image.dim_x = 12;
+    image.dim_y = 12;
+    image.dim_z = 1;
+    image.data_type = ASTCENC_TYPE_U8;
+    uint8_t* slices = output;
+    image.data = reinterpret_cast<void**>(&slices);
+
+    status = astcenc_decompress_image(context, data, 16, &image, &swizzle, 0);
+    EXPECT_EQ(status, ASTCENC_SUCCESS);
+
+    for (int y = 0; y < 12; y++)
+    {
+        for (int x = 0; x < 12; x++)
+        {
+            uint8_t* pixel = output + (12 * 4 * y) + (4 * x);
+            printf("[%2dx%2d] = %03d, %03d, %03d, %03d\n", x, y, pixel[0], pixel[1], pixel[2], pixel[3]);
+        }
+    }
+}
+
+}

Source/astcenc.h

Lines changed: 16 additions & 0 deletions
@@ -215,6 +215,8 @@ enum astcenc_error {
     ASTCENC_ERR_BAD_CONTEXT,
     /** @brief The call failed due to unimplemented functionality. */
     ASTCENC_ERR_NOT_IMPLEMENTED,
+    /** @brief The call failed due to an out-of-spec decode mode flag set. */
+    ASTCENC_ERR_BAD_DECODE_MODE,
 #if defined(ASTCENC_DIAGNOSTICS)
     /** @brief The call failed due to an issue with diagnostic tracing. */
     ASTCENC_ERR_DTRACE_FAILURE,
@@ -312,6 +314,19 @@ enum astcenc_type
  */
 static const unsigned int ASTCENC_FLG_MAP_NORMAL = 1 << 0;
 
+/**
+ * @brief Enable compression heuristics that assume use of decode_unorm8 decode mode.
+ *
+ * The decode_unorm8 decode mode rounds differently to the decode_fp16 decode mode, so enabling this
+ * flag during compression will allow the compressor to use the correct rounding when selecting
+ * encodings. This will improve the compressed image quality if your application is using the
+ * decode_unorm8 decode mode, but will reduce image quality if using decode_fp16.
+ *
+ * Note that LDR_SRGB images will always use decode_unorm8 for the RGB channels, irrespective of
+ * this setting.
+ */
+static const unsigned int ASTCENC_FLG_USE_DECODE_UNORM8 = 1 << 1;
+
 /**
  * @brief Enable alpha weighting.
  *
@@ -378,6 +393,7 @@ static const unsigned int ASTCENC_ALL_FLAGS =
     ASTCENC_FLG_MAP_RGBM |
     ASTCENC_FLG_USE_ALPHA_WEIGHT |
     ASTCENC_FLG_USE_PERCEPTUAL |
+    ASTCENC_FLG_USE_DECODE_UNORM8 |
     ASTCENC_FLG_DECOMPRESS_ONLY |
     ASTCENC_FLG_SELF_DECOMPRESS_ONLY;
Source/astcenc_color_unquantize.cpp

Lines changed: 40 additions & 17 deletions
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: Apache-2.0
 // ----------------------------------------------------------------------------
-// Copyright 2011-2021 Arm Limited
+// Copyright 2011-2023 Arm Limited
 //
 // Licensed under the Apache License, Version 2.0 (the "License"); you may not
 // use this file except in compliance with the License. You may obtain a copy
@@ -894,32 +894,55 @@ void unpack_color_endpoints(
         }
     }
 
-    vint4 ldr_scale(257);
-    vint4 hdr_scale(1);
-    vint4 output_scale = ldr_scale;
+    // Handle endpoint errors and expansion
 
-    // An LDR profile image
-    if ((decode_mode == ASTCENC_PRF_LDR) ||
-        (decode_mode == ASTCENC_PRF_LDR_SRGB))
+    // Linear LDR 8-bit endpoints are expanded to 16-bit by replication
+    if (decode_mode == ASTCENC_PRF_LDR)
     {
-        // Also matches HDR alpha, as cannot have HDR alpha without HDR RGB
-        if (rgb_hdr == true)
+        // Error color - HDR endpoint in an LDR encoding
+        if (rgb_hdr || alpha_hdr)
         {
-            output0 = vint4(0xFF00, 0x0000, 0xFF00, 0xFF00);
-            output1 = vint4(0xFF00, 0x0000, 0xFF00, 0xFF00);
-            output_scale = hdr_scale;
+            output0 = vint4(0xFF, 0x00, 0xFF, 0xFF);
+            output1 = vint4(0xFF, 0x00, 0xFF, 0xFF);
+            rgb_hdr = false;
+            alpha_hdr = false;
+        }
 
+        output0 = output0 * 257;
+        output1 = output1 * 257;
+    }
+    // sRGB LDR 8-bit endpoints are expanded to 16 bit by:
+    // - RGB = shift left by 8 bits and OR with 0x80
+    // - A = replication
+    else if (decode_mode == ASTCENC_PRF_LDR_SRGB)
+    {
+        // Error color - HDR endpoint in an LDR encoding
+        if (rgb_hdr || alpha_hdr)
+        {
+            output0 = vint4(0xFF, 0x00, 0xFF, 0xFF);
+            output1 = vint4(0xFF, 0x00, 0xFF, 0xFF);
             rgb_hdr = false;
             alpha_hdr = false;
         }
+
+        vmask4 mask(true, true, true, false);
+
+        vint4 output0rgb = lsl<8>(output0) | vint4(0x80);
+        vint4 output0a = output0 * 257;
+        output0 = select(output0a, output0rgb, mask);
+
+        vint4 output1rgb = lsl<8>(output1) | vint4(0x80);
+        vint4 output1a = output1 * 257;
+        output1 = select(output1a, output1rgb, mask);
     }
-    // An HDR profile image
+    // An HDR profile decode, but may be using linear LDR endpoints
+    // Linear LDR 8-bit endpoints are expanded to 16-bit by replication
+    // HDR endpoints are already 16-bit
     else
     {
         vmask4 hdr_lanes(rgb_hdr, rgb_hdr, rgb_hdr, alpha_hdr);
-        output_scale = select(ldr_scale, hdr_scale, hdr_lanes);
+        vint4 output_scale = select(vint4(257), vint4(1), hdr_lanes);
+        output0 = output0 * output_scale;
+        output1 = output1 * output_scale;
     }
-
-    output0 = output0 * output_scale;
-    output1 = output1 * output_scale;
 }
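
The two endpoint expansion rules named in the comments above can be written out in scalar form, which makes the difference between the sRGB RGB channels and the alpha channel easy to check by hand. This is a scalar restatement for illustration only; the helper names and the `Endpoint16` struct are not part of `astcenc`, which performs the same arithmetic on `vint4` vectors.

```cpp
#include <cassert>
#include <cstdint>

// Linear LDR: expand 8-bit to 16-bit by bit replication (0xAB -> 0xABAB).
// Multiplying by 257 is the same as (v << 8) | v.
constexpr uint16_t expand_replicate(uint8_t v)
{
    return static_cast<uint16_t>(v * 257u);
}

// sRGB RGB: shift left by 8 bits and OR with 0x80 (0xAB -> 0xAB80).
constexpr uint16_t expand_srgb_rgb(uint8_t v)
{
    return static_cast<uint16_t>((v << 8) | 0x80);
}

// Illustrative container for one expanded endpoint.
struct Endpoint16 { uint16_t r, g, b, a; };

// sRGB LDR endpoint: RGB use shift-and-OR, alpha uses replication.
constexpr Endpoint16 expand_srgb_endpoint(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return { expand_srgb_rgb(r), expand_srgb_rgb(g), expand_srgb_rgb(b), expand_replicate(a) };
}

static_assert(expand_replicate(0xFF) == 0xFFFF, "replication maps 255 to 65535");
static_assert(expand_srgb_rgb(0xFF) == 0xFF80, "sRGB RGB maps 255 to 0xFF80");
static_assert(expand_srgb_rgb(0x00) == 0x0080, "sRGB RGB maps 0 to 0x0080");

// Runtime spot check of the per-channel split for an sRGB endpoint.
inline void check_expansion_rules()
{
    const Endpoint16 e = expand_srgb_endpoint(0x12, 0x34, 0x56, 0x78);
    assert(e.r == 0x1280 && e.g == 0x3480 && e.b == 0x5680);
    assert(e.a == 0x7878);
}
```

Before this change the sRGB alpha channel went through the same scale as the RGB channels; the per-lane `select` in the diff is what applies replication to alpha only.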

Source/astcenc_compress_symbolic.cpp

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: Apache-2.0
 // ----------------------------------------------------------------------------
-// Copyright 2011-2023 Arm Limited
+// Copyright 2011-2024 Arm Limited
 //
 // Licensed under the Apache License, Version 2.0 (the "License"); you may not
 // use this file except in compliance with the License. You may obtain a copy
@@ -1237,6 +1237,8 @@ void compress_block(
         vfloat4 color_f32 = clamp(0.0f, 1.0f, blk.origin_texel) * 65535.0f;
         vint4 color_u16 = float_to_int_rtn(color_f32);
         store(color_u16, scb.constant_color);
+
+        // TODO: Check this encodes correctly for decode_unorm8
     }
 
     trace_add_data("exit", "quality hit");
