Summary:
CUDA implementation of farthest point sampling algorithm.
## Visual comparison
Compared to random sampling, farthest point sampling gives better coverage of the shape.
{F658631262}
## Reduction
Parallelized block reduction to find the max value at each iteration happens as follows:
1. First split the points into two equal sized parts (e.g. for a list with 8 values):
`[20, 27, 6, 8 | 11, 10, 2, 33]`
2. Use half of the thread (4 threads) to compare pairs of elements from each half (e.g elements [0, 4], [1, 5] etc) and store the result in the first half of the list:
`[20, 27, 6, 33 | 11, 10, 2, 33]`
Now we no longer care about the second part but again divide the first part into two
`[20, 27 | 6, 33| -, -, -, -]`
Now we can use 2 threads to compare the 4 elements
4. Finally we have gotten down to a single pair
`[20 | 33 | -, - | -, -, -, -]`
Use 1 thread to compare the remaining two elements
5. The max will now be at thread id = 0
`[33 | - | -, - | -, -, -, -]`
The reduction will give the farthest point for the selected batch index at this iteration.
Reviewed By: bottler, jcjohnson
Differential Revision: D30401803
fbshipit-source-id: 525bd5ae27c4b13b501812cfe62306bb003827d2
Summary:
This is a naive python implementation of the iterative farthest point sampling algorithm along with associated simple tests. The C++/CUDA implementations will follow in subsequent diffs.
The algorithm is used to subsample a pointcloud with better coverage of the space of the pointcloud.
The function has not been added to `__init__.py`. I will add this after the full C++/CUDA implementations.
Reviewed By: jcjohnson
Differential Revision: D30285716
fbshipit-source-id: 33f4181041fc652776406bcfd67800a6f0c3dd58