Real-Time Visualization of Virtual Huge Texture

Zeyu Li, Hui Li, Anxiang Zeng, Lian Wang, Yongwen Wang
Inst. of Graphics and Image, School of Computer
Sichuan University
Chengdu, China
[email protected], [email protected]

Abstract—Huge textures are an important component of many civil and military simulations. These textures are often too large to fit into video memory, or even system memory, which creates a bottleneck. In this paper, we present an efficient large-texture management technique inspired by the clipmap [11]. The clipmap algorithm caches a subset of the texture mipmap pyramid in video memory and updates it incrementally using toroidal mapping. To date, it is not supported on most GPUs. We provide an approach to implement it on recent commodity GPUs using programmable shaders, with improvements in data organization, rendering quality, and efficiency. Experiments visualizing a $2^{16} \times 2^{15}$ virtual texture demonstrate the efficiency and quality of the algorithm.

Keywords-virtual texture; clipmap; texture cache; visualization; level of detail

I. INTRODUCTION

Texture mapping plays a key role in real-time graphics applications, adding realism to scenes. For the visualization of large databases such as GIS, little work has focused on the large volume of texture data, compared with the many algorithms for geometry management. The problem arises because these textures are often extremely large satellite images that far exceed the video memory, and even the system memory, while modern graphics hardware is limited to a single texture of at most 4096×4096 or 8192×8192 texels. Texture dispatching and paging strategies are therefore essential. Developing a real-time huge-texture mapping algorithm for large-scale surfaces with very high quality remains a major challenge.

In this paper, we propose a new technique that efficiently handles a huge texture on current commodity graphics hardware by taking advantage of the programmable pipeline. It is an incremental texture management algorithm built on a two-level cache hierarchy: a Tile Array stored in video memory forms the first level, and a set of tile data loaded from external storage into system memory forms the second level. When the camera moves, the tile data are first paged into system memory and then moved into video memory to update the contents of the Tile Array; frame-to-frame coherence is exploited to reduce bandwidth consumption.

II. PREVIOUS WORK

Although large texture mapping is an important part of huge-surface visualization, most papers say little about it. The typical technique for handling a large texture is to split it into tiles that the graphics hardware supports directly, binding each tile to a certain part of the surface [6]. In some methods, tiles are arranged in a quadtree structure [2; 8]. Döllner [4] introduced a texture tree containing a set of texture patches, each of which is associated with a geometry patch. Hwa [7] used a 4-8 hierarchy for both textures and geometry; because the textures must be rotated, a costly update of vertex texture coordinates has to be performed. Seoane [10] implemented the clipmap without special hardware using the fixed-function pipeline, but the method requires each "geometry set" to be mapped to a single texture, so textures and geometry are not independent. In all of the algorithms above, texture handling is tightly coupled with the geometry.

The real solution to this limitation is the clipmap [11], which requires special hardware. It caches a subset of the texture mipmap pyramid in video memory and updates it incrementally using toroidal mapping. Our algorithm is inspired by this idea and adds several improvements that make the implementation easy, robust, efficient, and high in quality. Ephanov [5] presented an approach called Virtual Texture, briefly discussing a shader implementation but not anisotropic filtering. Crawfis [3] focused on shader support for the clipmap and took anisotropic filtering into account when selecting a level to render, which adds computation and complexity to the shader. We present a different, simpler method that makes full use of the hardware's built-in anisotropic filtering, provides flexible load control, and describes in detail how the texture dimension is left unrestricted.

III. THE ALGORITHM

A. Overview

The technique presented here is also tile-based and adopts the basic concept of the clipmap. It provides the following advantages:

1. The dimension of the texture is restricted only by disk storage capacity and floating-point bit depth.
2. The paging operation is completely transparent to the user.
3. There is no dependence between geometry and texture.
4. It makes full use of the hardware's built-in texture filtering, including anisotropic filtering, to produce high-quality results.
5. It consumes less bandwidth, video memory and disk storage by compressing texture tiles into the DXT1 format, which is supported by almost all current graphics hardware.
6. It fetches tile data from disk storage asynchronously, and the update frequency can be controlled robustly and flexibly through an intermediate texture, which allows a level to be updated over several frames.

Fig. 1 depicts the architecture of the proposed algorithm. The preprocess program converts the original image into our virtual texture file, called "VT". The Texture Fetching Thread reads the VT file according to the view parameters and pushes the fetched tile data into the Update Queue, where they wait for the rendering thread to pop them. At the beginning of each frame, the rendering thread checks whether the Update Queue contains data; if so, it transfers the data to the Intermediate Texture over several frames, and the Intermediate Texture is then copied to the Cache Texture for rendering.

B. Preprocess

Creating the VT file consists of the following steps, with the original image as the initial current level:

1. Extend the current level to a multiple of the tile dimension, if necessary, by adding padding.
2. Partition the current level into tiles of power-of-two dimension, generate the full mipmap chain for each tile with DXT1 compression, and save the tiles into the VT file in sequence.
3. Generate the next level of the original image's mipmap chain and make it the current level.
4. Repeat until the desired level is reached.

The benefits of using power-of-two tiles are:

1. The graphics hardware works most effectively with power-of-two sizes.
2. Tiles with power-of-two dimensions can be compressed with DXT1 without seams between them.

In this way, the overall texture can be of any size. The extra padding added to each level only wastes some disk storage and has little impact on performance. The final VT file, whose structure is shown in Fig. 2, is opened once at runtime.
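The padding and tiling steps of the preprocess can be sketched as follows. This is a minimal Python sketch and not the authors' preprocess program: the function name `vt_level_layout`, the `num_levels` stopping parameter and the halving rule (integer division by two per level) are assumptions made for illustration. It only reports the layout of each level and does not produce DXT1 data.

```python
def vt_level_layout(width, height, tile=256, num_levels=3):
    """Return, per level, (level, padded_w, padded_h, tiles_x, tiles_y).

    Each level is padded up to a multiple of the tile size (step 1),
    split into tile x tile blocks (step 2), then halved to obtain the
    next level of the mipmap chain (step 3).
    """
    layout = []
    w, h = width, height
    for level in range(num_levels):
        tiles_x = -(-w // tile)  # ceiling division
        tiles_y = -(-h // tile)
        layout.append((level, tiles_x * tile, tiles_y * tile, tiles_x, tiles_y))
        # Assumed downsampling rule: halve each dimension, never below 1.
        w, h = max(1, w // 2), max(1, h // 2)
    return layout


if __name__ == "__main__":
    for row in vt_level_layout(1000, 600):
        print(row)
```

For a 1000×600 image with 256-texel tiles, level 0 is padded to 1024×768 (4×3 tiles), which shows how an arbitrary image size is accommodated by padding alone.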

Figure 1: Architecture of our algorithm

Figure 2: VT file structure

C. Data Organization

In this section, we present the main concepts and the organization of data in our algorithm.

Clip Region. Because video memory is limited and only a portion of each level is needed to render a frame from the current viewpoint, we only have to manage a dynamic subset of the texture pyramid, as the clipmap does. We therefore define, for each level, a Clip Region of n×n tiles (n is 4 in our current implementation) centered at the Uniform Viewpoint, which is the exact viewpoint rounded down to a multiple of the tile size; only the Clip Region is loaded into video memory. The Clip Regions of all levels together form the Tile Array illustrated in Fig. 3. Each level has its own Uniform Viewpoint. Note that an additional level called the CoarseMap, which is not included in the Tile Array, covers the whole texture region and is stored completely in video memory with its full mipmap chain.

We also define an Effective Region as a few tiles (2×2 in our current implementation) around the Uniform Viewpoint. Once the exact viewpoint moves out of it, an update is triggered and the current Uniform Viewpoint is modified; otherwise, there is no need to update the Tile Array. In this way, small jitter of the viewpoint does not lead to frequent updates: whenever a new viewpoint triggers an update of the Tile Array, the Uniform Viewpoint changes as well, so the jitter does not cause another update within the new Effective Region.

Lookup Table. As already mentioned, each level owns a Uniform Viewpoint, so a simple but suboptimal way to determine which level is available is to use branching for every pixel. To simplify this process in the pixel shader, we use a Lookup Table that accompanies the cached tile data. The Lookup Table is associated with the most detailed level (level 0): each level-0 tile corresponds to one texel of the Lookup Table. Tiles of other levels cover more texels; a tile of level n spans $2^n$ texels of the Lookup Table along each axis. A mipmapped Lookup Table is not required, because we do not want to select the appropriate level ourselves when rendering. The Lookup Table is easily generated from the information about the cached tiles. Each element of the Lookup Table stores the index of the related tile in the Tile Array. With the Lookup Table, only a single unfiltered texture lookup in the pixel shader is needed to determine the tile to sample.

Cache Texture and Intermediate Texture. The Cache Texture, consisting of the Tile Array, the CoarseMap and the Lookup Table, resides in video memory at all times. It is the resource needed by the pixel shader and the destination of the Intermediate Texture, which is also located in video memory. The Intermediate Texture receives the data from system memory when a level is updated, so it only needs to hold one level's Clip Region and the related lookup data. Thanks to the Intermediate Texture, we can control the update at runtime by adjusting a parameter that specifies the maximum number of tiles updated per frame. Once the update of a level is finished, the copy from the Intermediate Texture to the Cache Texture is extremely fast because it takes place within video memory. This strategy lets the GPU render a frame without waiting for data to be transferred from system memory to video memory, which improves the flexibility and efficiency of the algorithm.
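The interplay between the Uniform Viewpoint and the Effective Region can be illustrated with a short sketch. It is a hedged Python illustration, not the authors' code: the class name `LevelState`, the representation of the viewpoint as level-0 texel coordinates, and the exact placement of the 2×2-tile Effective Region (one tile on each side of the Uniform Viewpoint) are assumptions.

```python
import math


class LevelState:
    """Tracks the Uniform Viewpoint of one level (hypothetical helper)."""

    def __init__(self, level, viewpoint, tile=256):
        self.level = level
        self.tile = tile
        self.uniform = self._snap(viewpoint)

    def _to_level(self, viewpoint):
        # Convert level-0 texel coordinates to this level's texel coordinates.
        scale = 1 << self.level
        return viewpoint[0] / scale, viewpoint[1] / scale

    def _snap(self, viewpoint):
        # Round down to a multiple of the tile size.
        x, y = self._to_level(viewpoint)
        t = self.tile
        return math.floor(x / t) * t, math.floor(y / t) * t

    def move(self, viewpoint):
        """Return True if the move leaves the Effective Region (update needed)."""
        x, y = self._to_level(viewpoint)
        ux, uy = self.uniform
        t = self.tile
        inside = (ux - t <= x < ux + t) and (uy - t <= y < uy + t)
        if inside:
            return False
        self.uniform = self._snap(viewpoint)  # re-center, then reload tiles
        return True


if __name__ == "__main__":
    state = LevelState(0, (300, 300))
    print(state.uniform, state.move((250, 300)), state.move((600, 300)))
```

Because the Effective Region extends a full tile to either side of the snapped position, a viewpoint that oscillates across a tile boundary triggers at most one update, which is the hysteresis described in the text.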

Figure 4: Reading of tile data when viewpoint moves

D. Update Strategy

When the viewpoint moves out of the Effective Region, the contents of the Cache Texture must be updated. To avoid reloading tiles that overlap the previous Clip Region, we adopt toroidal addressing, the same strategy used by the clipmap. Figs. 4 and 5 illustrate this process. The Lookup Table must also be updated whenever the Tile Array changes.

E. Texture Fetching Thread

The purpose of the Texture Fetching Thread is to load the tile data needed for an update from disk into system memory. It is created when the algorithm initializes. Each time the thread is scheduled, it checks each level to see whether an update is needed for the current viewpoint. If so, the thread reads the required tiles into the Update Queue, where they wait for the rendering thread to pop them. This multithreaded architecture is more efficient on multicore processors and can easily be extended to client-server applications.

F. Texture Filtering and Texture Compression

As described above, every tile in the Tile Array has a full mipmap chain, so it can be sampled like a normal texture with any built-in hardware filtering. Because it is supported by hardware, DXT1, with its 8:1 compression ratio relative to 32-bit color, is adopted to save video memory and bandwidth.
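The hand-off between the Texture Fetching Thread and the rendering thread (Section E), together with the per-frame tile budget of the Intermediate Texture, can be sketched as a producer/consumer pair. This is a minimal Python sketch under stated assumptions: the names `fetch_tiles`, `upload_some` and `run_update` are hypothetical, dictionaries stand in for the GPU textures, `read_tile` is a caller-supplied stand-in for disk access, and the tile identifiers in `needed` are assumed to be unique.

```python
import queue
import threading


def fetch_tiles(needed, update_queue, read_tile):
    """Fetching-thread body: read each needed tile and enqueue it."""
    for tile_id in needed:
        update_queue.put((tile_id, read_tile(tile_id)))


def upload_some(update_queue, intermediate, max_tiles_per_frame):
    """Called once per frame; moves at most max_tiles_per_frame tiles."""
    uploaded = 0
    while uploaded < max_tiles_per_frame:
        try:
            tile_id, data = update_queue.get_nowait()
        except queue.Empty:
            break  # nothing fetched yet; render without waiting
        intermediate[tile_id] = data
        uploaded += 1
    return uploaded


def run_update(needed, max_tiles_per_frame, read_tile):
    """Simulate frames until one level is complete, then publish it."""
    update_queue = queue.Queue()
    intermediate, cache = {}, {}
    worker = threading.Thread(
        target=fetch_tiles, args=(needed, update_queue, read_tile))
    worker.start()
    frames = 0
    while len(intermediate) < len(needed):
        upload_some(update_queue, intermediate, max_tiles_per_frame)
        frames += 1  # a frame would be rendered here from `cache`
    worker.join()
    cache.update(intermediate)  # stands in for the fast GPU-side copy
    return frames, cache


if __name__ == "__main__":
    tiles = [(0, x, 0) for x in range(7)]
    print(run_update(tiles, 3, lambda tile_id: b"dxt1")[0])
```

With seven tiles and a budget of three tiles per frame, the level needs at least three frames to complete; the renderer keeps drawing from the old cache contents in the meantime.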

Figure 5: Memory layout after update
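The toroidal addressing used by the update strategy (Section D) can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the function names and the convention that a tile with global tile coordinates (tx, ty) occupies slot (tx mod n, ty mod n) of its level are assumptions. It shows that only the tiles entering the Clip Region are loaded, and that each one lands in the slot just vacated by a tile that left.

```python
def clip_tiles(origin, n=4):
    """Global tile coordinates covered by an n x n Clip Region."""
    ox, oy = origin
    return {(ox + i, oy + j) for i in range(n) for j in range(n)}


def slot_of(tile, n=4):
    """Toroidal slot of a tile inside its level's n x n cache."""
    tx, ty = tile
    return tx % n, ty % n


def toroidal_update(old_origin, new_origin, n=4):
    """Return [(tile, slot)] for tiles that must be loaded after a move."""
    entering = clip_tiles(new_origin, n) - clip_tiles(old_origin, n)
    return [(tile, slot_of(tile, n)) for tile in sorted(entering)]


if __name__ == "__main__":
    # Moving one tile to the right replaces a single column of four tiles.
    for tile, slot in toroidal_update((0, 0), (1, 0)):
        print(tile, "->", slot)
```

Tiles that remain inside the Clip Region keep their slots, so no data are moved or re-uploaded for them; only the Lookup Table entries need to reflect the new coverage.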

G. Storage Efficiency

The disk space used to store the VT file is estimated by

$$D \approx w \times h \times b \times \frac{4}{3} \times \frac{4}{3} \times \frac{1}{8} \ \text{bytes} \qquad (1)$$

where $w$ and $h$ are the dimensions of the virtual texture and $b$ is the color depth in bytes per texel. The two factors of 4/3 correspond to the full mipmap chain stored in every tile and to the pyramid of levels, each of which adds about one third to the base size. The factor 1/8 accounts for DXT1 compression.

The system memory requirement is determined by the size of the Update Queue ($n$) and the tile dimension ($t$):

$$S \approx t^2 \times b \times \frac{4}{3} \times n \times \frac{1}{8} \ \text{bytes} \qquad (2)$$

The video memory usage can be calculated as

$$V \approx \left(t^2 \times c \times l + c_m \times c_n\right) \times b \times \frac{4}{3} \times \frac{1}{8} + \frac{w}{t} \times \frac{h}{t} \times 16 \ \text{bytes} \qquad (3)$$

where $c$ is the number of tiles in a Clip Region, $l$ is the number of levels in the Tile Array, $c_m$ and $c_n$ denote the CoarseMap dimensions, and the resolution of the Lookup Table is $w/t \times h/t$. Some video memory requirements are tabulated in Table 1, where a 32-bit color depth is used, the Clip Region contains 16 tiles and the tile size is 256.

Table 1: Video memory requirement

Figure 3: Clip Region and Tile Array
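The storage estimates of Eqs. (1) to (3) in Section G are easy to evaluate numerically. The following Python sketch transcribes the three formulas directly; the function names are hypothetical, and the example parameters in the usage note (in particular l = 4 levels in the Tile Array) are assumptions chosen for illustration rather than values reported in the paper.

```python
def disk_bytes(w, h, b=4):
    """Eq. (1): estimated size of the VT file in bytes."""
    return w * h * b * 4 / 3 * 4 / 3 / 8


def system_bytes(t, n, b=4):
    """Eq. (2): system memory for an Update Queue holding n tiles."""
    return t * t * b * 4 / 3 * n / 8


def video_bytes(w, h, t, c, l, cm, cn, b=4):
    """Eq. (3): Tile Array + CoarseMap (DXT1) plus the Lookup Table."""
    textures = (t * t * c * l + cm * cn) * b * 4 / 3 / 8
    lookup = (w / t) * (h / t) * 16  # 16 bytes (128 bits) per texel
    return textures + lookup


if __name__ == "__main__":
    v = video_bytes(2**16, 2**15, t=256, c=16, l=4, cm=4096, cn=2048)
    print(f"video memory: {v / 2**20:.2f} MiB")
    print(f"disk: {disk_bytes(2**16, 2**15) / 2**30:.2f} GiB")
```

Under these assumed parameters (a $2^{16} \times 2^{15}$ texture, 256-texel tiles, 16 tiles per Clip Region, four cached levels and a 4096×2048 CoarseMap), Eq. (3) evaluates to 8,912,896 bytes, or 8.5 MiB, of which the Lookup Table accounts for 0.5 MiB.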

H. Implementation

Currently, the algorithm is implemented in DX10. We chose 4×4 tiles as the Clip Region and 2×2 tiles as the Effective Region around the Uniform Viewpoint. The texture array, a new feature of DX10, is used to store the Tile Array. The Lookup Table uses a 128-bit texture format with four 32-bit channels. At present, only one channel is used to store the index of the related tile; the three remaining channels could store blend information to make the transition between levels smoother. When sampling a tile, the clamp addressing mode is necessary. The pixel shader pseudocode given in Fig. 6 is simple and straightforward. If we partitioned the CoarseMap and pushed its tiles into the Tile Array as well, the code would be more concise and elegant, but the dimension of the Tile Array might then exceed the hardware limit.

Figure 6: Pixel fragment used to fetch texture color on the GPU
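The lookup performed in the pixel shader can be emulated on the CPU to make the data flow concrete. The sketch below is written in Python rather than shader code and is not the code of Fig. 6. Several details are assumptions: the index encoding `level * n * n + slot`, the sentinel -1 meaning "fall back to the CoarseMap", the rule that finer cached tiles overwrite coarser ones in the Lookup Table, and the function names `build_lookup` and `resolve`.

```python
N = 4  # Clip Region is N x N tiles per level


def build_lookup(tiles_x0, tiles_y0, cached, n=N):
    """Build a Lookup Table with one texel per level-0 tile.

    cached: iterable of (level, tx, ty) tiles present in the Tile Array.
    """
    lut = [[-1] * tiles_x0 for _ in range(tiles_y0)]
    for level, tx, ty in sorted(cached, reverse=True):  # coarsest first
        span = 1 << level  # a level-n tile spans 2^n texels per axis
        index = level * n * n + (ty % n) * n + (tx % n)
        for y in range(ty * span, min((ty + 1) * span, tiles_y0)):
            for x in range(tx * span, min((tx + 1) * span, tiles_x0)):
                lut[y][x] = index
    return lut


def resolve(u, v, lut, n=N):
    """Map virtual coordinates in [0, 1) to (tile index, tile-local uv)."""
    tiles_y0, tiles_x0 = len(lut), len(lut[0])
    x = min(int(u * tiles_x0), tiles_x0 - 1)  # unfiltered point lookup
    y = min(int(v * tiles_y0), tiles_y0 - 1)
    index = lut[y][x]
    if index < 0:
        return None, (u, v)  # sample the CoarseMap with global coordinates
    span = 1 << (index // (n * n))
    local_u = (u * tiles_x0 / span) % 1.0
    local_v = (v * tiles_y0 / span) % 1.0
    return index, (local_u, local_v)


if __name__ == "__main__":
    table = build_lookup(8, 8, [(1, 0, 0), (0, 1, 1)])
    print(resolve(0.1875, 0.1875, table))
    print(resolve(0.0625, 0.0625, table))
```

In a real shader, the tile index would select a slice of the texture array, and the tile-local coordinates would be sampled with clamp addressing so that the hardware filter does not wrap across the tile border.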

IV. RESULTS AND DISCUSSION

The implementation has been tested on a personal computer with an AMD Athlon(tm) 64 X2 Dual Core 4400+ processor (2.21 GHz), 2 GB of DDR RAM, an NVIDIA GeForce 8800 GT with 512 MB of video memory, and a 7200 rpm SATA disk. We test the algorithm by visualizing the earth's surface with a simple model that fits entirely in video memory, so that we can focus on the performance of the algorithm itself. All tests use a screen resolution of 1024×768 and several virtual textures ranging from $2^{16} \times 2^{15}$ down to $2^{14} \times 2^{13}$ texels, all with a CoarseMap size of 4096×2048 and a tile size of 256×256. The test is a flight over the equator at a speed of π/14400 radians per frame. The rendering result is illustrated in Fig. 7, and Table 2 shows the test results. As the dimension of the virtual texture grows, the frame rate decreases slightly, but the decline is insignificant compared with the overall frame rate. With a normal texture, the frame rate is about 1.5 times higher, which is expected because our method adds extra work on both the CPU and the GPU. Nevertheless, the test results demonstrate the efficiency of our algorithm, which is unlikely to become the bottleneck in real projects. Fig. 8 shows real-time large-scale terrain rendered with the proposed algorithm, and Fig. 9 illustrates multitexturing.

Table 2: Test result

V. CONCLUSIONS

We have presented a technique for managing huge textures that current graphics hardware cannot display directly. The results demonstrate the efficiency and high quality of our algorithm, which is easy to implement. Applications of the technique are also presented. In the future, we hope to procedurally synthesize texture from the existing data so that unlimited detail can be rendered. Texture compression is another topic we would like to investigate.

Figure 7: Left: visualization of the earth's surface with $2^{16} \times 2^{15}$ texels. Right: close-up view of the rectangular region.

Figure 8: Real-Time large-scale terrain rendering. Both heightmap and texture are regarded as virtual texture.

Figure 9: Multitexturing. Surface map and cloud map are applied to the earth visualization as virtual texture.

REFERENCES

[1] Barrett, S. Sparse Virtual Texture Memory. In Game Developer’s Conference, San Francisco, CA.
[2] Cline, D. and Ebert, P.K. Interactive display of very large textures. In VIS ’98: Proceedings of the Conference on Visualization ’98, IEEE Computer Society Press, Los Alamitos, CA, pp. 343-350.
[3] Crawfis, R., Noble, E., Ford, M., Kuck, F. and Wagner, E. Clipmapping on the GPU. ftp://ftp.cse.ohio-state.edu/pub/techreport/2007/TR24.pdf.
[4] Döllner, J., Baumann, K. and Hinrichs, K. Texturing techniques for terrain visualization. In VIS ’00: Proceedings of the Conference on Visualization ’00, IEEE Computer Society Press, pp. 227-324.
[5] Ephanov, A. and Coleman, C. Virtual Texture: A Large Area Raster Resource for the GPU. In I/ITSEC 2006, pp. 645-656.
[6] Hoppe, H. Smooth view-dependent level-of-detail control and its application to terrain rendering. In VIS ’98: Proceedings of the Conference on Visualization ’98, IEEE Computer Society Press, Los Alamitos, CA, USA, pp. 35-42.
[7] Hwa, L. M., Duchaineau, M. A. and Joy, K. I. Adaptive 4-8 Texture Hierarchies. In Proceedings of Visualization ’04, IEEE Computer Society, pp. 219-226.
[8] Kofler, M., Gervautz, M. and Gruber, M. The Styria flyover-LOD management for huge textured terrain models. In Proceedings of Computer Graphics International, IEEE Computer Society Press, pp. 444-454.
[9] Lefohn, A. E., Kniss, J., Strzodka, R., Sengupta, S. and Owens, J. D. Glift: Generic, Efficient, Random-Access GPU Data Structures. ACM Transactions on Graphics 25, 1, 2006, ACM Press, pp. 60-99.
[10] Seoane, A., Taibo, J. and Hernández, L. Hardware-independent Clipmapping. In WSCG ’07, Czech Republic, pp. 177-183.
[11] Tanner, C. C., Migdal, C. J. and Jones, M. T. The clipmap: A virtual mipmap. In SIGGRAPH ’98, ACM Press, pp. 151-158.