AABB tree: enrich the performance section with a summary (general comments and advice on how to put the tree to work with good performance). This is neither exhaustive nor conclusive, of course, but I believe documentation must also state the obvious.

Pierre Alliez 2009-07-05 20:45:08 +00:00
parent a958ed6927
commit 656138e3ae
2 changed files with 72 additions and 47 deletions

@ -1,25 +1,25 @@
#ifndef AABB_DEMO_TYPES_H
#define AABB_DEMO_TYPES_H
#ifndef AABB_DEMO_TYPES_H
#define AABB_DEMO_TYPES_H
#include <CGAL/basic.h>
//#include <CGAL/Exact_predicates_inexact_constructions_kernel.h>
//#include <CGAL/Cartesian.h>
#include <CGAL/Simple_cartesian.h>
typedef CGAL::Simple_cartesian<double> Kernel; // fastest in experiments
//typedef CGAL::Cartesian<double> Kernel;
//typedef CGAL::Exact_predicates_inexact_constructions_kernel Kernel;
typedef Kernel::FT FT;
typedef Kernel::Ray_3 Ray;
typedef Kernel::Line_3 Line;
typedef Kernel::Point_3 Point;
typedef Kernel::Plane_3 Plane;
typedef Kernel::Vector_3 Vector;
typedef Kernel::Segment_3 Segment;
typedef Kernel::Triangle_3 Triangle;
#include <CGAL/Polyhedron_3.h>
typedef CGAL::Polyhedron_3<Kernel> Polyhedron;
#endif // AABB_DEMO_TYPES_H
//#include <CGAL/Exact_predicates_inexact_constructions_kernel.h>
//#include <CGAL/Cartesian.h>
#include <CGAL/Simple_cartesian.h>
typedef CGAL::Simple_cartesian<double> Kernel; // fastest in experiments
//typedef CGAL::Cartesian<double> Kernel;
//typedef CGAL::Exact_predicates_inexact_constructions_kernel Kernel;
typedef Kernel::FT FT;
typedef Kernel::Ray_3 Ray;
typedef Kernel::Line_3 Line;
typedef Kernel::Point_3 Point;
typedef Kernel::Plane_3 Plane;
typedef Kernel::Vector_3 Vector;
typedef Kernel::Segment_3 Segment;
typedef Kernel::Triangle_3 Triangle;
#include <CGAL/Polyhedron_3.h>
typedef CGAL::Polyhedron_3<Kernel> Polyhedron;
#endif // AABB_DEMO_TYPES_H

@ -1,11 +1,13 @@
\section{Performances}
\label{AABB_tree_section_performances}
We provide some performance numbers for the case where the AABB tree contains a set of polyhedron triangle facets. We measure both the tree construction time and the number of queries per second for a variety of intersection and distance queries. The machine used is a PC running Windows XP64 with an Intel CPU Core2 Extreme clocked at 3.06 GHz with 4GB of RAM. The kernel used is \ccc{Simple_cartesian<double>} (the fastest in our experiments). The program has been compiled with Visual C++ 2005 compiler with the O2 option (maximize speed).
We provide some performance numbers for the case where the AABB tree contains a set of polyhedron triangle facets. We measure the tree construction time, the memory occupancy, and the number of queries per second for a variety of intersection and distance queries. The machine used is a PC running Windows XP64 with an Intel CPU Core2 Extreme clocked at 3.06 GHz with 4GB of RAM. By default the kernel used is \ccc{Simple_cartesian<double>} (the fastest in our experiments). The program has been compiled with the Visual C++ 2005 compiler with the O2 option (maximize speed).
\subsection{Intersections}
\subsection{Tree Construction}
The surface triangle mesh chosen for benchmarking intersections is the knot model (14,400 triangles) available in the demo data folder. We measure the tree construction time for this model as well as for three denser versions subdivided through the Loop subdivision scheme which increases the number of triangles by a factor of four.
The surface triangle mesh chosen for benchmarking the tree construction is the knot model (14,400 triangles) available in the demo data folder. We measure the tree construction time for this model as well as for three denser versions subdivided through the Loop subdivision scheme which increases the number of triangles by a factor of four.
% TODO: add column with construction + internal KD-tree construction
\begin{tabular}{|l|c|}
\hline
@ -18,9 +20,33 @@ The surface triangle mesh chosen for benchmarking intersections is the knot mode
\hline
\end{tabular}
\subsection{Memory}
The following curve plots the AABB tree memory consumption (without constructing the internal KD-tree) against the number of triangles of a polyhedron triangle surface mesh, when using the polyhedron triangle primitive. As expected, the memory grows linearly and peaks at 128 Mbytes for 2.3M triangles in the example shown. The AABB tree occupies approximately 60 bytes per primitive. The memory consumption goes up to almost 100 bytes per primitive when constructing the internal KD-tree with one reference point per primitive (the default mode when calling the function \ccc{tree.accelerate_distance_queries()}). For large models we thus recommend specifying a smaller number of evenly distributed reference points to construct the internal KD-tree through the same function which takes an iterator range as input.
% memory
\begin{center}
\label{fig:AABB-tree-memory}
\begin{ccTexOnly}
\includegraphics[width=1.0\textwidth]{AABB_tree/figs/memory}
\end{ccTexOnly}
\begin{ccHtmlOnly}
<img width="99%" border=0 src="./figs/memory.png"><P>
\end{ccHtmlOnly}
\begin{figure}[h]
\caption{Memory consumption in Bytes against number of triangles, here
ranging from 100 to 2.3M triangles.}
\end{figure}
\end{center}
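As a rough sanity check of the figures quoted above, the following sketch multiplies a number of primitives by the approximate per-primitive costs given in this section (about 60 bytes for the tree alone, almost 100 bytes with the internal KD-tree built from one reference point per primitive). The helper \ccc{estimate_tree_bytes} and the two constants are hypothetical names introduced for illustration only; they are not part of CGAL, and the actual consumption depends on the kernel, the primitive type and the platform.

```cpp
#include <cstddef>

// Approximate per-primitive costs quoted in the text.
// These are assumptions for a rough estimate, not CGAL constants.
constexpr std::size_t kBytesPerPrimitiveTreeOnly   = 60;
constexpr std::size_t kBytesPerPrimitiveWithKdTree = 100;

// Hypothetical helper: linear memory estimate (in bytes) for a tree
// storing `primitive_count` primitives, with or without the internal
// KD-tree built from one reference point per primitive.
constexpr std::size_t estimate_tree_bytes(std::size_t primitive_count,
                                          bool with_kd_tree)
{
    return primitive_count * (with_kd_tree ? kBytesPerPrimitiveWithKdTree
                                           : kBytesPerPrimitiveTreeOnly);
}
```

For 2.3M triangles this estimate gives 138,000,000 bytes for the tree alone, which is of the same order as the measured peak reported above, and 230,000,000 bytes with the default internal KD-tree.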
\subsection{Intersections}
The following table measures the number of intersection queries per second on the 14,400 triangle version of the knot mesh model for ray, line, segment and plane queries. Each ray query is generated by choosing a random source point within the mesh bounding box and a random vector. A line or segment query is generated by choosing two random points inside the bounding box. A plane query is generated by picking a random point inside the bounding box and a random normal vector. Note that a plane query generally intersects many triangles of the input surface mesh. This explains the low performance numbers for the intersection functions which enumerate all intersections.
\begin{tabular}{|l|r|r|r|r|}
\label{table:AABB-tree-intersections}
\hline
Function & Segment & Ray & Line & Plane \\
\hline
@ -54,7 +80,6 @@ Curve \ref{fig:AABB-tree-bench} plots the number of queries per second (here the
The following table measures the number of \ccc{all_intersections()} queries per second on the 14,400 triangle version of the knot mesh model for random segment queries. The \ccc{Simple_cartesian} kernel is substantially faster than the \ccc{Cartesian} kernel.
\begin{tabular}{|l|c|}
\hline
Kernel & \#queries/s (all\_intersections() with segment queries)\\
@ -70,9 +95,9 @@ The following table measures the number of \ccc{all_intersections() queries per
\subsection{Distances}
The surface triangle mesh chosen for benchmarking distances is again the knot model in four increasing resolutions obtained through Loop subdivision. In the following table we first measure the tree construction time which includes the construction of the internal KD-tree data structure to accelerate the distance queries (note how the internal KD-tree construction is negligible compared to the AABB tree construction, while it brings an acceleration of up to one order of magnitude). We then measure the number of queries per second for the three types distance queries (\ccc{closest_point}, \ccc{squared_distance} and \ccc{closest_point_and_primitive}) from point queries randomly chosen inside the bounding box.
The surface triangle mesh chosen for benchmarking distances is again the knot model in four increasing resolutions obtained through Loop subdivision. In the following table we first measure the tree construction time, which includes the construction of the internal KD-tree data structure to accelerate the distance queries (note how the internal KD-tree construction time is negligible compared to the AABB tree construction time, while it brings an acceleration of up to one order of magnitude). We then measure the number of queries per second for the three types of distance queries (\ccc{closest_point}, \ccc{squared_distance} and \ccc{closest_point_and_primitive}) from point queries randomly chosen inside the bounding box.
% TODO: redo with CGAL KD-tree issue fixed.
% TODO: check CGAL KD-tree issue
\begin{tabular}{|l|c|c|c|c|}
\hline
@ -86,23 +111,23 @@ The surface triangle mesh chosen for benchmarking distances is again the knot mo
\end{tabular}
\subsection{Memory}
% TODO: against kernels
The following curve plots the AABB tree memory consumption (without constructing the internal KD-tree) against the number of triangles of a polyhedron triangle surface mesh, when using the polyhedron triangle primitive. As expected the memory grows linearly and peaks to 128Mbytes for 2.3M triangles in the example shown. The AABB tree occupies approximately 60 bytes per primitive.
% memory
\begin{center}
\label{fig:AABB-tree-memory}
\begin{ccTexOnly}
\includegraphics[width=1.0\textwidth]{AABB_tree/figs/memory}
\end{ccTexOnly}
\begin{ccHtmlOnly}
<img width="99%" border=0 src="./figs/memory.png"><P>
\end{ccHtmlOnly}
\begin{figure}[h]
\caption{Memory consumption in Bytes against number of triangles, here
ranging from 100 to 2.3M triangles.}
\end{figure}
\end{center}
\subsection{Summary}
The experiments described above are neither exhaustive nor conclusive, as we have chosen one specific case where the input primitive is a triangle facet of a surface polyhedron. Nevertheless, we provide the reader with some general observations and advice on how to put the AABB tree to use with satisfactory performance.
While the tree construction times and memory occupancy do not fluctuate much in our experiments depending on the input surface triangle mesh, the performance expressed in number of queries per second varies greatly depending on a complex combination of criteria: type of kernel, number of input primitives, distribution of primitives in space, type of function queried, type of query, and location of query in space.
The type of CGAL kernel turns out to dominate the final execution times, the best performance being obtained with the simple Cartesian kernel templated with the double precision number type. In applications where the intersection and distance execution times are crucial, it is possible to use this kernel for the AABB tree in combination with a more robust kernel for the main data structure.
Although the number of input primitives plays an obvious role in the final performance, their distribution in space is also very important in order to obtain a well-balanced AABB tree. Ideally, the primitives should be evenly distributed in space, and long primitives spanning the bounding box of the tree root node should be avoided as much as possible. When such long primitives are present, it is often beneficial to split them into smaller ones before constructing the tree, e.g., through recursive longest edge bisection for triangle surface meshes.
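The splitting step mentioned above can be sketched as follows. This is a minimal, standalone illustration that assumes triangles are handled as independent primitives (a triangle soup); the types \ccc{Point3} and \ccc{Triangle3} and the function \ccc{bisect_longest_edge} are hypothetical names, not CGAL types, and the threshold \ccc{max_edge_length} is a parameter the user would have to choose, for instance as a fraction of the root bounding box size.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration (not CGAL types).
struct Point3 { double x, y, z; };
using Triangle3 = std::array<Point3, 3>;

inline double squared_distance(const Point3& a, const Point3& b)
{
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

inline Point3 midpoint(const Point3& a, const Point3& b)
{
    return { 0.5 * (a.x + b.x), 0.5 * (a.y + b.y), 0.5 * (a.z + b.z) };
}

// Recursively bisects the longest edge of `t` until no edge is longer
// than `max_edge_length`, appending the resulting triangles to `out`.
// The two children keep the orientation of the parent triangle.
inline void bisect_longest_edge(const Triangle3& t,
                                double max_edge_length,
                                std::vector<Triangle3>& out)
{
    assert(max_edge_length > 0.0);

    // Find the longest edge (t[i], t[i+1]).
    std::size_t longest = 0;
    double longest_sq = 0.0;
    for (std::size_t i = 0; i < 3; ++i) {
        const double d = squared_distance(t[i], t[(i + 1) % 3]);
        if (d > longest_sq) { longest_sq = d; longest = i; }
    }

    // Short enough: keep the triangle as is.
    if (longest_sq <= max_edge_length * max_edge_length) {
        out.push_back(t);
        return;
    }

    const Point3& a = t[longest];
    const Point3& b = t[(longest + 1) % 3];
    const Point3& c = t[(longest + 2) % 3];
    const Point3 m = midpoint(a, b);

    // Split (a, b, c) into (a, m, c) and (m, b, c) and recurse.
    bisect_longest_edge(Triangle3{a, m, c}, max_edge_length, out);
    bisect_longest_edge(Triangle3{m, b, c}, max_edge_length, out);
}
```

Note that splitting each triangle independently does not keep a mesh conforming (neighboring triangles are not split consistently), so this sketch only suits the case where the split triangles are used as tree primitives and not as a replacement for the original mesh.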
As Table \ref{table:AABB-tree-intersections} shows, the type of function queried plays another important role. Obviously the ``exhaustive'' functions, which list all intersections, are slower than the ones stopping after the first intersection. Within each of these two groups, the functions which call only intersection tests (do\_intersect(), number\_of\_intersected\_primitives(), all\_intersected\_primitives()) are faster than the ones which explicitly construct the intersections (any\_intersection() and all\_intersections()).
The type of query (e.g., line, ray, segment or plane used above) plays another role, strongly correlated with the type of function (exhaustive or not, and whether or not it constructs the intersections). When all intersection constructions are needed, the final execution times depend strongly on the complexity of the general intersection object (e.g., a plane query generally intersects a surface triangle mesh in many segments, while a segment query generally intersects it in few points).
Finally, the location of the query in space also plays an obvious role in performance, especially for distance queries. Assuming the internal KD-tree has been constructed through the function \ccc{tree.accelerate_distance_queries()}, it is preferable to specify a query point already close to the surface triangle mesh so that the query traverses only a few AABBs of the tree. For a large number of primitives (greater than 2M faces in our experiments), however, we noticed that it is not necessary (and sometimes even slower) to use all reference points when constructing the KD-tree. In these cases we recommend specifying through the function \ccc{tree.accelerate_distance_queries(begin,end)} only 100K reference points evenly distributed over the input primitives.
The memory consumption goes up to almost 100 bytes per primitive when constructing the internal KD-tree with one reference point per primitive (the default mode when calling the function \ccc{tree.accelerate_distance_queries()}). For large models we thus recommend specifying a smaller number of reference points to construct the internal KD-tree through the same function which takes an iterator range as input.
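One simple way to obtain such a reduced set of reference points is to pick them at regular index intervals from a container holding one candidate point per primitive. The sketch below uses only the standard library; \ccc{subsample_evenly} is a hypothetical helper, not a CGAL function, and it assumes that the order of the candidates roughly follows the spatial layout of the primitives, which is not guaranteed for an arbitrary mesh. The resulting range would then be passed to \ccc{tree.accelerate_distance_queries(begin,end)} as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns at most `max_count` items of `items`,
// picked at regular index intervals. If `items` already holds no more
// than `max_count` elements, it is returned unchanged.
template <typename T>
std::vector<T> subsample_evenly(const std::vector<T>& items,
                                std::size_t max_count)
{
    assert(max_count > 0);
    if (items.size() <= max_count)
        return items;

    std::vector<T> picked;
    picked.reserve(max_count);
    const unsigned long long n = items.size();
    for (unsigned long long k = 0; k < max_count; ++k) {
        // Index floor(k * n / max_count); strictly increasing because
        // n > max_count. The product is computed in 64 bits.
        picked.push_back(items[static_cast<std::size_t>(k * n / max_count)]);
    }
    return picked;
}
```

With 2.3M candidate points and \ccc{max_count} set to 100,000, this keeps one point out of every 23 candidates. Index-based sampling is only a proxy for an even distribution in space; a spatially aware sampling may be preferable when the primitive order is arbitrary.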