Kernels for performing Box Filtering.

Macros
#define	NUM_BF_STORING_WORK_ITEMS 16 * 16 / 4

Functions
kernel void	boxFilterSAT (global float *sat, global float *out, int radius)
	Performs box (mean) filtering.

kernel void	boxFilterSAT_Tr (global float *sat, global float4 *out, local float *data, int radius, float scaling)
	Performs box (mean) filtering.

kernel void	boxFilter (global float *in, global float *out, local float *data, int radius)
	Performs box (mean) filtering.

Detailed Description

Kernels for performing Box Filtering.

Author: Nick Lamprianidis

Version: 1.0

Date: 2015

Copyright: The MIT License (MIT)

Copyright (c) 2015 Nick Lamprianidis

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Macro Definition Documentation

#define NUM_BF_STORING_WORK_ITEMS 16 * 16 / 4

Function Documentation

kernel void boxFilter	(	global float *	in,
		global float *	out,
		local float *	data,
		int	radius
	)

Performs box (mean) filtering.

The work complexity is O(n) in the window size.

Note: Both dimensions of the image must be multiples of the corresponding work-group dimensions. This restriction could be lifted by extending the buffers [clEnqueue(Read/Write)BufferRect] and adding bounds checking within the kernel, but these cases are not handled.

Note: The x dimension of the global workspace, \( gXdim \), should be equal to the number of columns, N, in the image. That is, \( gXdim = N \). The y dimension of the global workspace, \( gYdim \), should be equal to the number of rows, M, in the image. That is, \( gYdim = M \). The local workspace should be 16x16. That is, \( lXdim = lYdim = 16 \).

Note: Vector reads are avoided since the required alignments complicate memory address calculations.

Parameters

[in]	in	input array of `float` elements.
[out]	out	output (blurred) array of `float` elements.
[in]	data	local buffer. Its size should be one `float` element for each work-item in a work-group plus one for each halo pixel. That is, \( (lXdim + 2 \cdot radius)(lYdim + 2 \cdot radius) \cdot sizeof(float) \) bytes.
[in]	radius	radius of the square filter window.
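
The sizing rules above can be checked on the host before launching `boxFilter`. The following is a minimal sketch in plain C, not part of the documented kernels: the names `BF_LOCAL_DIM`, `bf_local_bytes`, and `bf_dims_ok` are hypothetical helpers introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Work-group edge length required by boxFilter (lXdim = lYdim = 16). */
enum { BF_LOCAL_DIM = 16 };

/* Bytes of local memory needed for the `data` argument:
 * (lXdim + 2*radius) * (lYdim + 2*radius) * sizeof(float). */
size_t bf_local_bytes(int radius)
{
    assert(radius >= 0);
    size_t edge = (size_t)BF_LOCAL_DIM + 2 * (size_t)radius;
    return edge * edge * sizeof(float);
}

/* Returns 1 if an image with `rows` x `cols` pixels satisfies the
 * "multiple of the work-group dimensions" requirement, 0 otherwise. */
int bf_dims_ok(size_t rows, size_t cols)
{
    return rows % BF_LOCAL_DIM == 0 && cols % BF_LOCAL_DIM == 0;
}
```

In an OpenCL host program, the value returned by `bf_local_bytes` would typically be passed as the size to `clSetKernelArg` with a `NULL` pointer, which is how local-memory kernel arguments are allocated.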

kernel void boxFilterSAT	(	global float *	sat,
		global float *	out,
		int	radius
	)

Performs box (mean) filtering.

Accepts a SAT array, \( sat_{M \times N} \), performs the filtering, and outputs the result, \( out_{M \times N} \). The work complexity is O(1) in the window size.

Note: The x dimension of the global workspace, \( gXdim \), should be equal to the number of columns, N, in the image. That is, \( gXdim = N \). The y dimension of the global workspace, \( gYdim \), should be equal to the number of rows, M, in the image. That is, \( gYdim = M \). The local workspace size is irrelevant.

Parameters

[in]	sat	input SAT array of `float` elements.
[out]	out	output (blurred) array of `float` elements.
[in]	radius	radius of the square filter window.
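
To illustrate why the work per pixel is O(1) in the window size, here is a CPU reference sketch in plain C. It is not the kernel's code. It assumes an inclusive, row-major SAT, and it assumes that windows are clamped at the image border and normalized by the clamped window area; the source does not state how the kernel treats borders. The names `build_sat`, `sat_at`, and `box_filter_sat_ref` are hypothetical.

```c
#include <assert.h>

/* Builds an inclusive summed-area table:
 * sat[r*cols + c] = sum of in[] over rows 0..r and columns 0..c. */
void build_sat(const float *in, float *sat, int rows, int cols)
{
    for (int r = 0; r < rows; ++r) {
        float row_sum = 0.0f;
        for (int c = 0; c < cols; ++c) {
            row_sum += in[r * cols + c];
            sat[r * cols + c] = row_sum + (r > 0 ? sat[(r - 1) * cols + c] : 0.0f);
        }
    }
}

/* SAT lookup that treats indices above or left of the image as 0. */
static float sat_at(const float *sat, int cols, int r, int c)
{
    return (r < 0 || c < 0) ? 0.0f : sat[r * cols + c];
}

/* Mean over the (2*radius+1)^2 window using four SAT lookups per pixel.
 * Assumption: the window is clamped to the image at the borders. */
void box_filter_sat_ref(const float *sat, float *out, int rows, int cols, int radius)
{
    assert(radius >= 0);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int r0 = r - radius < 0 ? 0 : r - radius;
            int c0 = c - radius < 0 ? 0 : c - radius;
            int r1 = r + radius > rows - 1 ? rows - 1 : r + radius;
            int c1 = c + radius > cols - 1 ? cols - 1 : c + radius;
            float sum = sat_at(sat, cols, r1, c1)
                      - sat_at(sat, cols, r0 - 1, c1)
                      - sat_at(sat, cols, r1, c0 - 1)
                      + sat_at(sat, cols, r0 - 1, c0 - 1);
            out[r * cols + c] = sum / (float)((r1 - r0 + 1) * (c1 - c0 + 1));
        }
    }
}
```

Whatever the radius, each output pixel costs four table lookups and one division, which is the property the kernel description refers to.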

kernel void boxFilterSAT_Tr	(	global float *	sat,
		global float4 *	out,
		local float *	data,
		int	radius,
		float	scaling
	)

Performs box (mean) filtering.

Accepts a transposed SAT array, \( sat_{N \times M} \), performs the filtering, and outputs the result, \( out_{M \times N} \). The work complexity is O(1) in the window size.

Note: Both dimensions of the image must be multiples of the corresponding work-group dimensions. This restriction could be lifted by extending the buffers [clEnqueue(Read/Write)BufferRect] and adding bounds checking within the kernel, but these cases are not handled.

Note: The x dimension of the global workspace, \( gXdim \), should be equal to the number of columns, M, in the SAT array. That is, \( gXdim = M \). The y dimension of the global workspace, \( gYdim \), should be equal to the number of rows, N, in the SAT array. That is, \( gYdim = N \). The local workspace should be 16x16. That is, \( lXdim = lYdim = 16 \).

Note: Each work-item filters one pixel, and then the first 64 work-items in each work-group each store a transposed 4-pixel block in global memory.

Parameters

[in]	sat	input transposed SAT array of `float` elements.
[out]	out	output (blurred) array of `float` elements, written as `float4` vectors.
[in]	data	local buffer. Its size should be one `float` element for each work-item in a work-group. That is, \( lXdim \cdot lYdim \cdot sizeof(float) \) bytes.
[in]	radius	radius of the square filter window.
[in]	scaling	factor by which to scale the array elements after processing.
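
The store phase described in the note can be emulated on the CPU for a single work-group. The sketch below is in plain C and rests on assumptions: the source only says that the first 64 work-items each store a transposed 4-pixel block, so the mapping from work-item id to block used here is a guess for illustration, and `TILE`, `VEC`, `NUM_STORING`, and `store_transposed_tile` are hypothetical names. The count of 64 matches `NUM_BF_STORING_WORK_ITEMS` (16 * 16 / 4).

```c
#include <assert.h>

/* 16x16 work-group, 4 floats per vector store, hence 64 storing work-items. */
enum { TILE = 16, VEC = 4, NUM_STORING = TILE * TILE / VEC };

/* tile[ly*TILE + lx] holds the pixel filtered by work-item (lx, ly),
 * i.e. the contents of the local buffer after the filtering phase.
 * Each of the 64 storing "work-items" writes 4 consecutive pixels of
 * one row of the transposed tile (assumed id-to-block mapping). */
void store_transposed_tile(const float *tile, float *out)
{
    assert(tile != out); /* an in-place transpose is not supported here */
    for (int id = 0; id < NUM_STORING; ++id) {
        int row  = id / (TILE / VEC);          /* row of the transposed tile */
        int col0 = (id % (TILE / VEC)) * VEC;  /* first column of the block  */
        for (int j = 0; j < VEC; ++j)
            out[row * TILE + col0 + j] = tile[(col0 + j) * TILE + row];
    }
}
```

Grouping four pixels per store is what lets a quarter of the work-group cover the whole 16x16 tile, so the local buffer needs only \( 16 \cdot 16 \cdot sizeof(float) \) bytes.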
