Digilib is a graphics library written in C, developed by Adam Herout and Pavel Zemčík. The aim of this project is to create a set of functions that run on the GPU to accelerate image processing. The acceleration should be almost transparent to a program that uses digilib ("almost" because some additional optimizations are possible and, of course, a few initialization functions have to be called).
See digilib homepage.
status: complete; the CPU reference routines and FFT are yet to be debugged, and the expression evaluator (operation scheduling, temporary resource allocation) is yet to be improved
language: C / C++ (C interface and C++ engine)
os: os independent
First real results! All the operations now work on the GPU; their CPU equivalents are being worked on, day and night. For now, you can see how fast this is going to be:
operation | time @ 256 x 256 8bit RGBA | fillrate |
---|---|---|
ImageImageAdd(p_dest, p_src1, p_src2) | 0.105581 ms | 620.720594 Mpix/s |
ImageImageMult(p_dest, p_src1, p_src2) | 0.104784 ms | 625.440271 Mpix/s |
ImageImageSub(p_dest, p_src1, p_src2) | 0.105940 ms | 618.613186 Mpix/s |
ImageImageDiv(p_dest, p_src1, p_src2) | 0.118815 ms | 551.581884 Mpix/s |
ImageImageBlend(p_dest, p_src1, p_src2) | 0.108930 ms | 601.635003 Mpix/s |
ImageImageDecal(p_dest, p_src1, p_src3) | 0.106401 ms | 615.935221 Mpix/s |
ImageInvert(p_dest, p_src1) | 0.105159 ms | 623.208359 Mpix/s |
ImageAbs(p_dest, p_src1) | 0.104733 ms | 625.745922 Mpix/s |
ImageImageMin(p_dest, p_src1, p_src2) | 0.106247 ms | 616.824894 Mpix/s |
ImageImageMax(p_dest, p_src1, p_src2) | 0.106563 ms | 614.997094 Mpix/s |
ImageThresh(p_dest, p_src1, .5f) | 0.106709 ms | 614.156536 Mpix/s |
ImageLog(p_dest, p_src1) | 0.106350 ms | 616.229134 Mpix/s |
ImageExp(p_dest, p_src1) | 0.106403 ms | 615.920339 Mpix/s |
ImageSqrt(p_dest, p_src1) | 0.148221 ms | 442.150207 Mpix/s |
ImagePow(p_dest, p_src1, 2.0f) | 0.149427 ms | 438.582068 Mpix/s |
ImageScale(p_dest, p_src2, 1.5f, -.5f) | 0.108324 ms | 604.999379 Mpix/s |
ImageColorMatrix(p_dest, p_src2, p_mat4x4, false) | 0.108516 ms | 603.927137 Mpix/s |
ImageGreyscale(p_dest, p_src2) | 0.105775 ms | 619.577177 Mpix/s |
ImageResize(p_dest, p_src2) | 1.182514 ms | 55.420917 Mpix/s |
ImageRotate(p_dest, p_src2, 30deg, false) | 1.134724 ms | 57.755021 Mpix/s |
ImageTransform(p_dest, p_src2, ...) | 1.073847 ms | 61.029151 Mpix/s |
ImageSquareErode(p_dest, p_src1, 8, 8) | 8.859968 ms | 7.396867 Mpix/s |
ImageSquareDilate(p_dest, p_src1, 8, 8) | 9.873257 ms | 6.637729 Mpix/s |
ImageSquareOpen(p_dest, p_src1, 3, 3) | 3.776635 ms | 17.353014 Mpix/s |
ImageSquareClose(p_dest, p_src1, 3, 3) | 3.771434 ms | 17.376946 Mpix/s |
ImageDiamondErode(p_dest, p_src1, 8, 8) | 9.608934 ms | 6.820319 Mpix/s |
ImageDiamondDilate(p_dest, p_src1, 8, 8) | 10.509161 ms | 6.236083 Mpix/s |
ImageDiamondOpen(p_dest, p_src1, 3, 3) | 3.552302 ms | 18.448883 Mpix/s |
ImageDiamondClose(p_dest, p_src1, 3, 3) | 3.551453 ms | 18.453292 Mpix/s |
ImageImageErode(p_dest, p_src1, p_src4) | 10.521133 ms | 6.228987 Mpix/s |
ImageImageDilate(p_dest, p_src1, p_src4) | 10.490948 ms | 6.246909 Mpix/s |
ImageImageOpen(p_dest, p_src1, p_src4) | 20.963980 ms | 3.126124 Mpix/s |
ImageImageClose(p_dest, p_src1, p_src4) | 20.952363 ms | 3.127857 Mpix/s |
ImageImageHitMiss(p_dest, p_src1, p_src5) | 2.403412 ms | 27.267904 Mpix/s |
ImageMean(p_dest, p_src2, 256, 256) | 53.493827 ms | 1.225113 Mpix/s |
ImageMedian(p_dest, p_src1, 15) | 8.967916 ms | 7.307829 Mpix/s |
ImageLocalMin(p_dest, p_src2, 8, 8) | 1.724755 ms | 37.997284 Mpix/s |
ImageLocalMax(p_dest, p_src2, 8, 8) | 1.724741 ms | 37.997599 Mpix/s |
ImageConvolveSeparable(p_dest, p_src2, G16, G16) | 4.186004 ms | 15.655981 Mpix/s |
ImageConvolve2D(p_dest, p_src2, Gauss16) | 38.650736 ms | 1.695595 Mpix/s |
ImageGetRGBAHistogram(p_src1, 256, ...) | 14.407298 ms | 4.548806 Mpix/s |
ImageGetMinMax(p_src1, p_minmax) | 6.911876 ms | 9.481652 Mpix/s |
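The fillrate column appears to follow directly from the time column: a 256 x 256 image has 65,536 pixels, and dividing that by the measured time gives pixels per second. The helper below is a small sketch of that arithmetic. Its name is hypothetical (it is not part of digilib), and it assumes that Mpix means 10^6 pixels, which reproduces the values in the table to within rounding.

```c
#include <assert.h>

/* Fill rate in megapixels per second for one operation over a
   width x height image that took time_ms milliseconds.
   Hypothetical helper for illustration, not a digilib function. */
double fillrate_mpix_per_s(int width, int height, double time_ms)
{
    assert(width > 0 && height > 0 && time_ms > 0.0);
    double pixels = (double)width * (double)height;
    double seconds = time_ms / 1000.0;
    return pixels / seconds / 1e6; /* assumes 1 Mpix = 10^6 pixels */
}
```

For example, fillrate_mpix_per_s(256, 256, 0.105581) gives roughly 620.7, which matches the ImageImageAdd row.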
version | changes from previous version |
---|---|
v0 | - |
v1 | now it works |
v1.01 | updated to work with lame r17 and under linux, fixed new (163.71) NV drivers bug |
Every ImageStruct has its own assigned texture (assignments are based on the ImageStruct address). The assignment is created the first time an operation is performed with a given image. An operation may fail if no texture could be created to hold the image (maximum texture size limit, image format limit).
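One way to picture the address-based assignment is a small lookup table that is filled in on first use. The following is only a sketch under assumptions: the table, its capacity and the function name are hypothetical, and an integer counter stands in for real OpenGL texture creation. It does not show how digilib stores the assignments internally.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TEXTURES 64 /* arbitrary capacity for this sketch */

typedef struct {
    const void *p_image;  /* key: address of the image structure */
    unsigned int texture; /* stand-in for an OpenGL texture name */
} TextureSlot;

static TextureSlot slots[MAX_TEXTURES];
static size_t n_slots = 0;
static unsigned int n_next_texture = 1; /* 0 means "no texture" */

/* Returns the texture assigned to p_image, creating the assignment on
   first use. Returns 0 if no texture could be created, which stands in
   for failures such as exceeding the maximum texture size. */
unsigned int texture_for_image(const void *p_image)
{
    assert(p_image != NULL);
    for (size_t i = 0; i < n_slots; ++i) {
        if (slots[i].p_image == p_image)
            return slots[i].texture; /* already assigned */
    }
    if (n_slots == MAX_TEXTURES)
        return 0; /* table full: the operation would fail */
    slots[n_slots].p_image = p_image;
    slots[n_slots].texture = n_next_texture++;
    return slots[n_slots++].texture;
}
```

Because the key is the address, two different images always get different textures, and asking twice for the same image returns the same texture.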
Almost every operation is represented by a shader (some basic operations can be handled by OpenGL blending or the OpenGL imaging subset alone). A shader is loaded at the moment it is first needed, so the first call to an operation may take longer because its shader is being compiled.
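The load-on-first-use behavior can be sketched as a per-operation flag that is checked before each call. Everything in this block is an assumption made for illustration: the operation ids, the function names and the counter that stands in for the expensive compile step are all hypothetical and not taken from the digilib sources.

```c
#include <assert.h>
#include <stdbool.h>

enum { OP_ADD, OP_SUB, OP_COUNT }; /* hypothetical operation ids */

static bool b_compiled[OP_COUNT]; /* zero-initialized: nothing compiled */
static int n_compilations = 0;    /* counts the expensive step */

/* Stand-in for compiling and linking a shader. */
static void compile_shader(int n_op)
{
    (void)n_op;
    ++n_compilations;
}

/* Makes sure the shader for n_op exists, compiling it on first use only. */
void use_shader(int n_op)
{
    assert(n_op >= 0 && n_op < OP_COUNT);
    if (!b_compiled[n_op]) {
        compile_shader(n_op);
        b_compiled[n_op] = true;
    }
}

/* Number of compilations performed so far. */
int compilation_count(void)
{
    return n_compilations;
}
```

In practice this means that a benchmark should call each operation once before timing it, so that the compile time is not counted.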
By default, every operation uploads its source images from system memory to texture memory, renders to the destination image's texture and transfers the result back to system memory. This behavior can be disabled by calling:
void Disable_AutoDownloads();
void Disable_AutoUploads();
Here, download means a transfer from texture memory to system memory, and upload means the reverse. If a source image has no texture associated with it, it is transferred anyway. There are two complementary Enable_* functions as well. Sometimes it is necessary to transfer just some images. That can be done either manually, by calling:
bool Upload_ImageStruct(const ImageStruct *p_image);
bool Download_ImageStruct(const ImageStruct *p_image);
bool Upload_Async_ImageStruct(const ImageStruct *p_image, int *p_fence_id);
bool Download_Async_ImageStruct(const ImageStruct *p_image, int *p_fence_id);
(Note that p_fence_id is an output parameter: it receives the id of a fence, i.e. an OpenGL object that can be used to query whether the transfer has completed.)
Or by setting the "dirty flag". Every image has a dirty flag as well: a bit that tells whether the image is up-to-date in system memory or in texture memory. Normally these flags are set automatically by image operations, uploads and downloads. If necessary, the flag can be set manually using the function:
void Set_DirtyFlag(const ImageStruct *p_image, bool b_current_on_server);
If the parameter b_current_on_server is true, the image is up-to-date on the server side (texture memory). If it is false, the image is up-to-date in system memory. An image upload can be triggered by invalidating the texture (calling with b_current_on_server = false). For now, the flag does not affect image downloading, because the image processing functions themselves mark the system memory version of the image as outdated (the dirty flag bit is set high). The manual image transfer functions do not read the flag; they only set it.
#include <stdio.h> // fprintf
#include <image.h> // ImageStruct
#include "DigiLib_Ext.h" // GPU functions

Init_OpenGL();
// you can use this function from ÜberLame or use GLUT

if(!b_FramebufferObjectSupported()) {
    fprintf(stderr, "error: framebuffer objects not supported, "
        "unable to process images\n");
    return -1;
}
if(!b_NPOTTexturesSupported())
    fprintf(stderr, "warning: non-power-of-two textures not supported\n");
if(!b_FloatTexturesSupported())
    fprintf(stderr, "warning: float textures not supported\n");
// check OpenGL capabilities

// from now on, GPU image processing is transparent

ImageStruct *a, *b, *c; // filled by some data

ImageImageAdd(a, a, b); // a = a + b
ImageImageAdd(a, a, c); // a = a + c
// calculate image sum into a (example image processing)

// image processing end

Free_OpenGL_Objects();
Shutdown_OpenGL();
// cleanup, shutdown opengl
This is rather wasteful: when calculating a = a + b, the image data of a and b are copied to textures, the addition shader is executed and the result is copied back to image a. In the next step, images a and c are copied to textures (although the texture for image a already contains the up-to-date image), the images are added and the data of image a is downloaded back to a. We could save one image upload and one image download here.
The simplest solution would be to allocate images using
ImageStruct *p_Create_GL_Image(int n_width, int n_height, short n_format);
which creates an ImageStruct in graphics card memory, mapped to system memory, which should save most of the unnecessary copying on PCI Express systems. Note that the memory mapping has to be repeated after every operation with the image (no matter whether it is a source or destination image), so its internal data pointer may change. This will make no difference on AGP systems.
Another, more complicated way is enabling and disabling transfers as follows:
#include <stdio.h> // fprintf
#include <image.h> // ImageStruct
#include "DigiLib_Ext.h" // GPU functions

Init_OpenGL();
// you can use this function from ÜberLame or use GLUT

if(!b_FramebufferObjectSupported()) {
    fprintf(stderr, "error: framebuffer objects not supported, "
        "unable to process images\n");
    return -1;
}
if(!b_NPOTTexturesSupported())
    fprintf(stderr, "warning: non-power-of-two textures not supported\n");
if(!b_FloatTexturesSupported())
    fprintf(stderr, "warning: float textures not supported\n");
// check OpenGL capabilities

// from now on, GPU image processing is transparent

ImageStruct *a, *b, *c; // filled by some data

Disable_AutoDownloads(); // disables image downloading
Disable_AutoUploads(); // disables uploading of up-to-date images

ImageImageAdd(a, a, b); // a = a + b
// a and b weren't uploaded before so they will be uploaded now
// downloads are disabled so a will not be downloaded

Enable_AutoDownloads(); // re-enables image downloading

ImageImageAdd(a, a, c); // a = a + c
// a was uploaded before and its associated texture contains
// fresh data so it won't be uploaded now
// c wasn't uploaded before so it will be uploaded now
// downloads are enabled and a will be downloaded

Download_ImageStruct(a);
// just another way of getting image a to system memory

// image processing end

Free_OpenGL_Objects();
Shutdown_OpenGL();
// cleanup, shutdown opengl
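To make the saving concrete, the two strategies can be compared with a tiny simulation that counts transfers for a = a + b followed by a = a + c. This is a model built only from the behavior described on this page; the types and functions are hypothetical, no OpenGL is involved, and the explicit Download_ImageStruct(a) call at the end of the example is left out because the comments describe it as an alternative to the automatic download.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool b_has_texture;       /* a texture has been assigned */
    bool b_current_on_server; /* texture copy is up to date */
} SimImage;

typedef struct {
    bool b_auto_uploads, b_auto_downloads;
    int n_uploads, n_downloads;
} SimContext;

/* Uploads a source image if the modeled rules require it. */
static void sim_source(SimContext *p_ctx, SimImage *p_src)
{
    if (p_ctx->b_auto_uploads || !p_src->b_has_texture ||
        !p_src->b_current_on_server) {
        ++p_ctx->n_uploads;
        p_src->b_has_texture = true;
        p_src->b_current_on_server = true;
    }
}

/* Models dest = src1 + src2. */
void sim_add(SimContext *p_ctx, SimImage *p_dest,
             SimImage *p_src1, SimImage *p_src2)
{
    sim_source(p_ctx, p_src1);
    sim_source(p_ctx, p_src2);
    p_dest->b_has_texture = true;
    p_dest->b_current_on_server = true; /* result lives in the texture */
    if (p_ctx->b_auto_downloads)
        ++p_ctx->n_downloads; /* copy the result to system memory */
}

/* Runs a = a + b; a = a + c and returns uploads + downloads.
   b_optimized selects the enable/disable strategy shown above. */
int sim_total_transfers(bool b_optimized)
{
    SimContext ctx = { true, true, 0, 0 };
    SimImage a = { false, false }, b = { false, false }, c = { false, false };
    if (b_optimized) {
        ctx.b_auto_downloads = false;
        ctx.b_auto_uploads = false;
    }
    sim_add(&ctx, &a, &a, &b);
    ctx.b_auto_downloads = true; /* a no-op in the default case */
    sim_add(&ctx, &a, &a, &c);
    return ctx.n_uploads + ctx.n_downloads;
}
```

In this model the default behavior needs four uploads and two downloads, while the optimized sequence needs three uploads and one download, which is the one upload and one download saved.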
version | release date | file | release notes |
---|---|---|---|
v0 | 2006-02-02 | digilib_gpu_ext_v000.zip | incomplete. see 'DigiLib_Ext.h', you can send me your comments |
v0 | 2006-02-05 | digilib_gpu_ext_v010.zip | vs2005 version (still not functional) |
v0 | 2006-03-20 | digilib_gpu_ext_v020.zip | "cranes" version (ops on images from avi file) |
v0 | 2006-05-22 | digilib_gpu_ext_v030.zip | first working release version |
v1 | 2007-10-10 | digilib_gpu_ext_v100.zip | complete distribution with extensive documentation |
v1.01 | 2007-11-26 | digilib_gpu_ext_v101.zip | source code update (lame r17, linux, new (163.71) NV drivers) |