Kernels for performing Box Filtering.
| kernel void | boxFilterSAT (global float *sat, global float *out, int radius) |
| | Performs box (mean) filtering. |
| kernel void | boxFilterSAT_Tr (global float *sat, global float4 *out, local float *data, int radius, float scaling) |
| | Performs box (mean) filtering. |
| kernel void | boxFilter (global float *in, global float *out, local float *data, int radius) |
| | Performs box (mean) filtering. |
- Author
- Nick Lamprianidis
- Version
- 1.0
- Date
- 2015
- Copyright
- The MIT License (MIT)
- Copyright (c) 2015 Nick Lamprianidis
- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
- The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
| #define NUM_BF_STORING_WORK_ITEMS 16 * 16 / 4 |
| kernel void boxFilter (global float *in, global float *out, local float *data, int radius) |
Performs box (mean) filtering.
The work complexity is O(n) in the window size.
- Note
- Both image dimensions must be multiples of the corresponding work-group dimensions. This restriction could be lifted by extending the buffers [clEnqueue(Read/Write)BufferRect] and adding bounds checking to the kernel, but the kernel does not handle such cases.
-
The x dimension of the global workspace, \( gXdim \), should equal the number of columns, N, in the image: \( gXdim = N \). The y dimension of the global workspace, \( gYdim \), should equal the number of rows, M, in the image: \( gYdim = M \). The local workspace should be 16x16: \( lXdim = lYdim = 16 \).
-
Vector reads are avoided since the required alignments complicate memory address calculations.
- Parameters
-
| [in] | in | input array of float elements. |
| [out] | out | output (blurred) array of float elements. |
| [in] | data | local buffer. It must hold one float element for each work-item in a work-group plus one for each halo pixel, that is, \( (lXdim + 2 \cdot radius) \cdot (lYdim + 2 \cdot radius) \cdot sizeof(float) \) bytes. |
| [in] | radius | radius of the square filter window. |
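The launch requirements above can be collected in a small host-side helper. The following is a minimal sketch in plain C, not part of the library: `BoxFilterLaunch` and `box_filter_launch` are hypothetical names introduced here for illustration, and the helper only restates the rules from the note (global size equal to the image size, a 16x16 work-group, and a local buffer covering the work-group plus its halo).

```c
#include <assert.h>
#include <stddef.h>

/* Work-group edge length required by the note above (16x16). */
#define BF_LOCAL_DIM 16

/* Hypothetical container for the launch parameters of boxFilter. */
typedef struct {
    size_t global[2];   /* { gXdim, gYdim } = { N columns, M rows } */
    size_t local[2];    /* { lXdim, lYdim } = { 16, 16 } */
    size_t local_bytes; /* size in bytes of the `data` argument */
} BoxFilterLaunch;

/* Fills `cfg` and returns 1 on success. Returns 0 when the radius is
   negative or the image dimensions are not non-zero multiples of the
   work-group dimensions, which the kernel does not support. */
static int box_filter_launch(size_t rows, size_t cols, int radius,
                             BoxFilterLaunch *cfg)
{
    assert(cfg != NULL);
    if (radius < 0 || rows == 0 || cols == 0)
        return 0;
    if (rows % BF_LOCAL_DIM != 0 || cols % BF_LOCAL_DIM != 0)
        return 0;

    cfg->global[0] = cols;          /* gXdim = N */
    cfg->global[1] = rows;          /* gYdim = M */
    cfg->local[0] = BF_LOCAL_DIM;   /* lXdim */
    cfg->local[1] = BF_LOCAL_DIM;   /* lYdim */

    /* One float per work-item plus the halo of `radius` pixels on each side. */
    size_t tile = BF_LOCAL_DIM + 2 * (size_t)radius;
    cfg->local_bytes = tile * tile * sizeof(float);
    return 1;
}
```

On the host, `local_bytes` would typically be passed as the size of the local argument, for example `clSetKernelArg(kernel, 2, cfg.local_bytes, NULL)`, and `global`/`local` as the work sizes of `clEnqueueNDRangeKernel`.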
| kernel void boxFilterSAT (global float *sat, global float *out, int radius) |
Performs box (mean) filtering.
Accepts a SAT array, \( sat_{M \times N} \), performs the filtering, and outputs the result, \( out_{M \times N} \). The work complexity is O(1) in the window size.
- Note
- The x dimension of the global workspace, \( gXdim \), should equal the number of columns, N, in the image: \( gXdim = N \). The y dimension of the global workspace, \( gYdim \), should equal the number of rows, M, in the image: \( gYdim = M \). The local workspace size is irrelevant.
- Parameters
-
| [in] | sat | input SAT array of float elements. |
| [out] | out | output (blurred) array of float elements. |
| [in] | radius | radius of the square filter window. |
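To illustrate why the work is O(1) in the window size, here is a CPU reference sketch in plain C: each output pixel needs at most four SAT lookups regardless of the radius. It is not the kernel's source. It assumes an inclusive, row-major summed area table (each entry holds the sum of all pixels above and to the left, itself included), and it assumes the window is clamped at the image borders with the mean taken over the pixels actually covered; the kernel's border handling may differ. The names `sat_window_sum` and `box_filter_sat_ref` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of the original image over rows [r0, r1] and columns [c0, c1]
   (inclusive), read from an inclusive row-major SAT with `cols` columns.
   At most four lookups, independent of the window size. */
static float sat_window_sum(const float *sat, size_t cols,
                            size_t r0, size_t c0, size_t r1, size_t c1)
{
    float sum = sat[r1 * cols + c1];
    if (r0 > 0)
        sum -= sat[(r0 - 1) * cols + c1];
    if (c0 > 0)
        sum -= sat[r1 * cols + (c0 - 1)];
    if (r0 > 0 && c0 > 0)
        sum += sat[(r0 - 1) * cols + (c0 - 1)];
    return sum;
}

/* Mean filter over a (2*radius+1) x (2*radius+1) window.
   ASSUMPTION: the window is clamped to the image and the mean is taken
   over the pixels it actually covers. */
static void box_filter_sat_ref(const float *sat, float *out,
                               size_t rows, size_t cols, int radius)
{
    assert(sat != NULL && out != NULL && radius >= 0);
    size_t rad = (size_t)radius;

    for (size_t r = 0; r < rows; ++r) {
        size_t r0 = r > rad ? r - rad : 0;
        size_t r1 = r + rad < rows ? r + rad : rows - 1;
        for (size_t c = 0; c < cols; ++c) {
            size_t c0 = c > rad ? c - rad : 0;
            size_t c1 = c + rad < cols ? c + rad : cols - 1;
            float count = (float)((r1 - r0 + 1) * (c1 - c0 + 1));
            out[r * cols + c] =
                sat_window_sum(sat, cols, r0, c0, r1, c1) / count;
        }
    }
}
```

In the kernel, the two loops correspond to the global workspace: one work-item per pixel, with \( gXdim = N \) and \( gYdim = M \).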
| kernel void boxFilterSAT_Tr (global float *sat, global float4 *out, local float *data, int radius, float scaling) |
Performs box (mean) filtering.
Accepts a transposed SAT array, \( sat_{N \times M} \), performs the filtering, and outputs the result, \( out_{M \times N} \). The work complexity is O(1) in the window size.
- Note
- Both image dimensions must be multiples of the corresponding work-group dimensions. This restriction could be lifted by extending the buffers [clEnqueue(Read/Write)BufferRect] and adding bounds checking to the kernel, but the kernel does not handle such cases.
-
The x dimension of the global workspace, \( gXdim \), should equal the number of columns, M, in the SAT array: \( gXdim = M \). The y dimension of the global workspace, \( gYdim \), should equal the number of rows, N, in the SAT array: \( gYdim = N \). The local workspace should be 16x16: \( lXdim = lYdim = 16 \).
-
Each work-item filters one pixel, and then the first 64 work-items in each work-group each store a transposed 4-pixel block in global memory.
- Parameters
-
| [in] | sat | input transposed SAT array of float elements. |
| [out] | out | output (blurred) array of float elements. |
| [in] | data | local buffer. It must hold one float element for each work-item in a work-group, that is, \( lXdim \cdot lYdim \cdot sizeof(float) \) bytes. |
| [in] | radius | radius of the square filter window. |
| [in] | scaling | factor by which to scale the array elements after processing. |
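The store step described in the note (256 filtered pixels per work-group, written out by 64 work-items as 4-pixel blocks) can be pictured with the following CPU sketch in plain C. The index mapping is an assumption made for illustration, not taken from the kernel source: it assumes `data` holds the filtered tile row-major and that storing work-item `i` writes four consecutive floats (one `float4`) of the transposed tile. `store_transposed_tile`, `TILE`, and `STORING_ITEMS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 16                          /* lXdim = lYdim = 16 */
#define STORING_ITEMS (TILE * TILE / 4)  /* 64 work-items store float4 blocks */

/* `data` holds the filtered tile row-major: data[ly * TILE + lx].
   `out_tile` receives the transposed tile, also row-major.
   ASSUMED mapping: storing item i handles row i / 4 of the transposed
   tile and the 4 columns starting at (i % 4) * 4. */
static void store_transposed_tile(const float *data, float *out_tile)
{
    assert(data != NULL && out_tile != NULL);

    /* The loop stands in for work-items 0..63 of one work-group. */
    for (size_t i = 0; i < STORING_ITEMS; ++i) {
        size_t row = i / (TILE / 4);
        size_t col = (i % (TILE / 4)) * 4;
        /* The four writes below correspond to a single float4 store. */
        for (size_t k = 0; k < 4; ++k)
            out_tile[row * TILE + col + k] = data[(col + k) * TILE + row];
    }
}
```

Under this mapping every element of the 16x16 tile is written exactly once, and each storing work-item issues one contiguous 4-float write, which is consistent with `out` being declared as `global float4 *`.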