tlseparation.classification package¶

Submodules¶

tlseparation.classification.classes_reference module¶

class tlseparation.classification.classes_reference.DefaultClass[source]¶: Defines a default reference class to be used in classification of tree point clouds.

tlseparation.classification.classify_wood module¶

tlseparation.classification.classify_wood.reference_classification(point_cloud, knn_list, n_classes=4, prob_threshold=0.95)[source]¶

Classifies wood material points from a point cloud. This function uses wlseparate_ref_voting to perform the basic classification and then applies class_filter to filter out potentially misclassified wood points.

Parameters:

point_cloud: numpy.ndarray: 2D (n x 3) array containing n points in 3D space (x, y, z).
knn_list: list: List of knn values to be used iteratively in the voting separation.
n_classes: int: Number of intermediate classes. At least 3 classes should be used; the default is 4 in order to accommodate noise/outlier classes.
prob_threshold: float: Classification probability threshold to filter classes. This aims to avoid selecting points that are not confidently enough assigned to any given class. Default is 0.95.

Returns:

wood_points: numpy.ndarray: 2D (nw x 3) array containing nw wood points in 3D space (x, y, z).
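
As a rough illustration of what prob_threshold does, the sketch below keeps only the points whose most likely class reaches the threshold. It is not the package's class_filter; the helper name `filter_by_probability` and the (n x c) layout of `probability` are assumptions made for this example.

```python
import numpy as np


def filter_by_probability(points, probability, prob_threshold=0.95):
    """Keep points whose most likely class has probability >= prob_threshold.

    Hypothetical helper: `probability` is assumed to be an (n x c) array of
    per-class probabilities, one row per point in `points`.
    """
    points = np.asarray(points, dtype=float)
    probability = np.asarray(probability, dtype=float)
    confident = probability.max(axis=1) >= prob_threshold
    return points[confident]


points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [0.2, 0.1, 1.0]])
probability = np.array([[0.99, 0.01], [0.60, 0.40], [0.03, 0.97]])
# The second point is dropped: its best class only reaches 0.60.
kept = filter_by_probability(points, probability, prob_threshold=0.95)
```

Raising the threshold trades completeness for confidence: fewer points survive, but those that do are less likely to be misassigned.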

tlseparation.classification.classify_wood.threshold_classification(point_cloud, knn, n_classes=3, prob_threshold=0.95)[source]¶

Classifies wood material points from a point cloud. This function uses wlseparate_abs to perform the basic classification and then applies class_filter to filter out potentially misclassified wood points.

Parameters:

point_cloud : numpy.ndarray: 2D (n x 3) array containing n points in 3D space (x, y, z).
knn : int: Number of neighbors to select around each point. Used to describe local point arrangement.
n_classes: int: Number of intermediate classes. Default is 3.
prob_threshold: float: Classification probability threshold to filter classes. This aims to avoid selecting points that are not confidently enough assigned to any given class. Default is 0.95.

Returns:

wood_points: numpy.ndarray: 2D (nw x 3) array containing nw wood points in 3D space (x, y, z).

tlseparation.classification.gmm module¶

tlseparation.classification.gmm.class_select_abs(classes, cm, nbrs_idx, feature=5, threshold=0.5)[source]¶

Selects, from GMM classification results, which classes are wood and which are leaf, based on an absolute value threshold applied to a single feature in the parameter space.

Parameters:

classes : list or array: Class labels for each observation from the input variables.
cm : array: N-dimensional array (c x n) of each class (c) parameter space mean values (n).
nbrs_idx : array: Nearest Neighbors indices relative to every point of the array that originated the classes labels.
feature : int: Column index of the feature to use as constraint.
threshold : float: Threshold value to mask classes. All classes with means >= threshold are masked as true.

Returns:

mask : list: List of booleans where True represents wood points and False represents leaf points.
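
The thresholding step described above can be sketched with plain NumPy indexing. This is a simplified stand-in, not the package's code: the helper name `select_by_feature_threshold` is hypothetical, and it takes no nbrs_idx argument, so it applies no neighborhood step.

```python
import numpy as np


def select_by_feature_threshold(classes, cm, feature=5, threshold=0.5):
    """Mask observations whose class mean for `feature` is >= threshold.

    Simplified, hypothetical stand-in: unlike class_select_abs it takes no
    nbrs_idx argument.
    """
    classes = np.asarray(classes, dtype=int)
    cm = np.asarray(cm, dtype=float)
    class_is_wood = cm[:, feature] >= threshold   # one boolean per class
    return class_is_wood[classes]                 # expand to observations


cm = np.array([
    [0.1, 0.2, 0.3, 0.1, 0.2, 0.8],   # class 0: feature 5 mean above 0.5
    [0.3, 0.1, 0.2, 0.4, 0.6, 0.2],   # class 1: below the threshold
    [0.2, 0.2, 0.1, 0.3, 0.1, 0.5],   # class 2: exactly at the threshold
])
classes = [0, 1, 2, 1, 0]
mask = select_by_feature_threshold(classes, cm)
```

Because the comparison is `>=`, a class whose mean sits exactly on the threshold is masked as True, matching the parameter description.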

tlseparation.classification.gmm.class_select_ref(classes, cm, classes_ref)[source]¶

Selects from the classification results which classes are wood and which are leaf.

Parameters:

classes : list: List of class labels for each observation from the input variables.
cm : array: N-dimensional array (c x n) of each class (c) parameter space mean values (n).
classes_ref : array: Reference class values.

Returns:

mask : array: Array of booleans where True represents wood points and False represents leaf points.
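
One plausible reading of reference-based selection is a nearest-reference lookup on the class means. The sketch below assumes Euclidean distance and a reference array whose first row stands for wood; both are assumptions for illustration, and `nearest_reference` is a hypothetical helper, not part of the package.

```python
import numpy as np


def nearest_reference(cm, classes_ref):
    """Return, for each class mean in `cm`, the row index of the closest
    reference in `classes_ref` (Euclidean distance; an assumption)."""
    cm = np.asarray(cm, dtype=float)
    classes_ref = np.asarray(classes_ref, dtype=float)
    # (c x r) matrix of distances between class means and references.
    dist = np.linalg.norm(cm[:, None, :] - classes_ref[None, :, :], axis=2)
    return dist.argmin(axis=1)


# Hypothetical two-feature example: reference row 0 is "wood", row 1 "leaf".
classes_ref = np.array([[0.9, 0.1], [0.2, 0.7]])
cm = np.array([[0.8, 0.2], [0.1, 0.6], [0.7, 0.0]])
classes = np.array([0, 1, 2, 1])

ref_idx = nearest_reference(cm, classes_ref)
mask = ref_idx[classes] == 0   # True where the point's class maps to "wood"
```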

tlseparation.classification.gmm.classify(variables, n_classes)[source]¶

Function to perform the classification of a dataset using sklearn’s Gaussian Mixture Models with Expectation Maximization.

Parameters:

variables : array: N-dimensional array (m x n) containing a set of parameters (n) over a set of observations (m).
n_classes : int: Number of classes to assign the input variables.

Returns:

classes : list: List of class labels for each observation from the input variables.
means : array: N-dimensional array (c x n) of each class (c) parameter space means (n).
probability : array: Probability of each sample belonging to every class in the classification. The probabilities of each sample should sum to 1.
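
For readers unfamiliar with the underlying scikit-learn API, a minimal sketch of a GMM classification with the same three outputs might look as follows. The wrapper name `gmm_classify`, the `seed` argument and the synthetic data are assumptions for this example; the package's own function may configure the mixture differently.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def gmm_classify(variables, n_classes, seed=0):
    """Fit a Gaussian mixture with EM; return labels, class means and
    per-class membership probabilities. `seed` is an added assumption."""
    gmm = GaussianMixture(n_components=n_classes, random_state=seed)
    gmm.fit(variables)
    classes = gmm.predict(variables)
    probability = gmm.predict_proba(variables)
    return classes, gmm.means_, probability


rng = np.random.default_rng(0)
# Two well-separated synthetic clusters in a 2-feature parameter space.
variables = np.vstack([
    rng.normal(0.0, 0.1, size=(50, 2)),
    rng.normal(5.0, 0.1, size=(50, 2)),
])
classes, means, probability = gmm_classify(variables, n_classes=2)
```

Each row of `probability` sums to 1, which is what makes a per-point probability threshold meaningful downstream.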

tlseparation.classification.path_detection module¶

tlseparation.classification.path_detection.detect_main_pathways(point_cloud, k_retrace, knn, nbrs_threshold, verbose=False, max_iter=100)[source]¶

Detects the main pathways of an unordered 3D point cloud. Sets as True all points that are part of detected pathways leading down to the base of the graph.

Parameters:

point_cloud : array: Three-dimensional point cloud of a single tree to perform the wood-leaf separation. This should be an n-dimensional array (m x n) containing a set of coordinates (n) over a set of points (m).
k_retrace : int: Number of steps in the graph to retrace back to the graph’s base. Every node in the graph will be moved k_retrace steps from the extremities towards the base.
knn : int: Number of neighbors to fill gaps in detected paths. Larger values fill gaps better but increase memory usage. Recommended value between 50 and 150.
nbrs_threshold : float: Maximum distance to valid neighboring points used to fill gaps in detected paths.
verbose : bool: Option to set verbose on/off.

Returns:

path_mask : array: Boolean mask where ‘True’ represents points detected as part of the main pathways and ‘False’ represents points not part of the pathways.

Raises:

AssertionError: point_cloud has the wrong shape or number of dimensions.

tlseparation.classification.path_detection.get_base(point_cloud, base_height)[source]¶

Get the base of a point cloud based on a certain height from the bottom.

Parameters:

point_cloud : array: Three-dimensional point cloud of a single tree to perform the wood-leaf separation. This should be an n-dimensional array (m x n) containing a set of coordinates (n) over a set of points (m).
base_height : float: Height of the base slice to mask.

Returns:

mask : array: Base slice masked as True.
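
A base slice of this kind can be expressed as a single comparison on the height coordinate. The sketch below assumes that the third column holds z and that the slice is measured from the lowest point; `base_mask` is a hypothetical name, not the package function.

```python
import numpy as np


def base_mask(point_cloud, base_height):
    """Mask points lying within `base_height` of the lowest point.
    Assumes the third column holds the z coordinate."""
    z = np.asarray(point_cloud, dtype=float)[:, 2]
    return z <= z.min() + base_height


point_cloud = np.array([
    [0.0, 0.0, 1.0],
    [0.1, 0.0, 1.2],
    [0.0, 0.1, 2.0],
    [0.2, 0.2, 5.0],
])
# Lowest z is 1.0, so the slice covers z <= 1.5.
mask = base_mask(point_cloud, base_height=0.5)
```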

tlseparation.classification.path_detection.path_detect_frequency(point_cloud, downsample_size, frequency_threshold)[source]¶

Detects points from major paths in a graph generated from a point cloud. The detection compares how frequently each node appears across all paths. Nodes with a frequency larger than the threshold are selected as detected. To fill pathway regions with low node density, neighboring points within downsample_size * 1.5 distance are also set as detected.

Parameters:

point_cloud : numpy.ndarray: 2D (n x 3) array containing n points in 3D space (x, y, z).
downsample_size : float: Distance threshold used to group (downsample) the input point cloud. Simplifying the cloud by downsampling improves the results and processing times.
frequency_threshold : float: Minimum path frequency for a node to be selected as part of major pathways.

Returns:

path_points : numpy.ndarray: 2D (np x 3) array containing np points in 3D space (x, y, z) that belong to major pathways in the point cloud.

tlseparation.classification.path_detection.voxel_path_detection(point_cloud, voxel_size, k_retrace, knn, nbrs_threshold, verbose=False)[source]¶

Applies detect_main_pathways but with a voxelization option to speed up processing.

Parameters:

point_cloud : array: Three-dimensional point cloud of a single tree to perform the wood-leaf separation. This should be an n-dimensional array (m x n) containing a set of coordinates (n) over a set of points (m).
voxel_size : float: Voxel dimensions’ size.
k_retrace : int: Number of steps in the graph to retrace back to the graph’s base. Every node in the graph will be moved k_retrace steps from the extremities towards the base.
knn : int: Number of neighbors to fill gaps in detected paths. Larger values fill gaps better but increase memory usage. Recommended value between 50 and 150.
nbrs_threshold : float: Maximum distance to valid neighboring points used to fill gaps in detected paths.
verbose : bool: Option to set verbose on/off.

Returns:

path_mask : array: Boolean mask where ‘True’ represents points detected as part of the main pathways and ‘False’ represents points not part of the pathways.

Raises:

AssertionError: point_cloud has the wrong shape or number of dimensions.
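
To show why voxelization can speed up processing, here is a minimal sketch of grouping points into voxels and mapping a per-voxel result back to the original points. It is a generic NumPy approach under stated assumptions (cubic voxels anchored at the cloud minimum), not the package's implementation; `voxelize` is a hypothetical helper.

```python
import numpy as np


def voxelize(point_cloud, voxel_size):
    """Group points into cubic voxels of edge `voxel_size`.

    Returns the centers of the occupied voxels and, for every input point,
    the index of the voxel it falls in.
    """
    point_cloud = np.asarray(point_cloud, dtype=float)
    origin = point_cloud.min(axis=0)
    cells = np.floor((point_cloud - origin) / voxel_size).astype(int)
    unique_cells, inverse = np.unique(cells, axis=0, return_inverse=True)
    centers = origin + (unique_cells + 0.5) * voxel_size
    return centers, inverse.ravel()


point_cloud = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.1, 0.1],
    [1.2, 0.0, 0.0],
    [1.3, 0.1, 0.2],
])
centers, voxel_of_point = voxelize(point_cloud, voxel_size=0.5)

# Any mask computed on the (fewer) voxel centers maps back by indexing.
voxel_mask = centers[:, 0] > 1.0   # stand-in for a per-voxel path mask
point_mask = voxel_mask[voxel_of_point]
```

The expensive step then runs on one point per occupied voxel, and the index array restores a mask with one entry per original point.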

tlseparation.classification.point_features module¶

tlseparation.classification.point_features.calc_features(e)[source]¶

Calculates the geometric features using a set of eigenvalues, based on Ma et al. [1] and Wang et al. [2].

Parameters:

e : array: N-dimensional array (m x 3) containing sets of 3 eigenvalues per row (m).

Returns:

features : array: N-dimensional array (m x 6) containing the calculated geometric features from ‘e’.

References

[1]	Ma et al., 2015. Improved Salient Feature-Based Approach for Automatically Separating Photosynthetic and Nonphotosynthetic Components Within Terrestrial Lidar Point Cloud Data of Forest Canopies.

[2]	Wang et al., 2015. A Multiscale and Hierarchical Feature Extraction Method for Terrestrial Laser Scanning Point Cloud Classification.
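
To give a feel for eigenvalue-based geometric features, the sketch below computes three widely used descriptors (linearity, planarity and sphericity). These are illustrative only and are not claimed to be the six features returned by calc_features; the function name `eigen_features` is hypothetical.

```python
import numpy as np


def eigen_features(e):
    """Linearity, planarity and sphericity from rows of eigenvalues.
    Illustrative only: not the six features returned by calc_features."""
    # Sort each row so that l1 >= l2 >= l3.
    e = np.sort(np.asarray(e, dtype=float), axis=1)[:, ::-1]
    l1, l2, l3 = e[:, 0], e[:, 1], e[:, 2]
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    sphericity = l3 / l1
    return np.column_stack([linearity, planarity, sphericity])


e = np.array([
    [1.0, 0.01, 0.01],   # elongated neighborhood, e.g. a thin branch
    [1.0, 1.0, 0.01],    # flat neighborhood, e.g. a leaf surface
    [1.0, 1.0, 1.0],     # isotropic scatter
])
features = eigen_features(e)
```

The three descriptors sum to 1 for every row, so each neighborhood is described by how its variance splits between line-like, plane-like and volumetric structure.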

tlseparation.classification.point_features.curvature(arr, nbrs_idx)[source]¶

Calculates pointwise curvature of a point cloud.

Parameters:

arr : array: Array (m x n) of a three-dimensional point cloud, where the coordinates are represented in the columns (n) and the points are represented in the rows (m).
nbrs_idx : array: N-dimensional array of indices from a nearest neighbors search of the point cloud in ‘arr’, where the rows (m) represent the points in ‘arr’ and the columns represent the indices of the nearest neighbors from ‘arr’.

Returns:

c : numpy.ndarray: 1D (m x 1) array containing the curvature of each point in ‘arr’.
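
A common way to estimate pointwise curvature from neighborhoods is the "surface variation": the smallest eigenvalue of the local covariance matrix divided by the sum of all three. The sketch below uses that definition as an assumption; the package may compute curvature differently, and `surface_variation` is a hypothetical name.

```python
import numpy as np


def surface_variation(arr, nbrs_idx):
    """Smallest covariance eigenvalue over the eigenvalue sum, per point.
    Assumes every neighborhood contains points that are not all identical."""
    arr = np.asarray(arr, dtype=float)
    nbrs = arr[nbrs_idx]                           # (m x k x 3) neighborhoods
    centered = nbrs - nbrs.mean(axis=1, keepdims=True)
    cov = np.einsum('mki,mkj->mij', centered, centered) / nbrs.shape[1]
    evals = np.linalg.eigvalsh(cov)                # ascending, per point
    return evals[:, 0] / evals.sum(axis=1)


# Nine points on a flat 3 x 3 grid; each neighborhood is the whole grid.
xs, ys = np.meshgrid(np.arange(3.0), np.arange(3.0))
arr = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(9)])
nbrs_idx = np.tile(np.arange(9), (9, 1))
c = surface_variation(arr, nbrs_idx)
```

On a perfectly flat patch the smallest eigenvalue vanishes, so the estimate is zero up to floating-point error; it grows as the neighborhood bends away from a plane.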

tlseparation.classification.point_features.knn_evals(arr_stack)[source]¶

Calculates eigenvalues of a stack of arrays.

Parameters:

arr_stack : array: N-dimensional array (l x m x n) containing a stack of data, where the rows (m) represent the point coordinates, the columns (n) represent the axis coordinates and the layers (l) represent the stacks of points.

Returns:

evals : array: N-dimensional array (l x n) of eigenvalues calculated from ‘arr_stack’. The rows (l) represent the stack layers of points in ‘arr_stack’ and the columns (n) represent the parameters in ‘arr_stack’.

tlseparation.classification.point_features.knn_features(arr, nbr_idx, block_size=200000)[source]¶

Calculates geometric descriptors (salient features and tensor features) from an array and a neighbor index with a fixed number of neighbors per point.

Parameters:

arr : array: Array (m x n) of a three-dimensional point cloud, where the coordinates are represented in the columns (n) and the points are represented in the rows (m).
nbr_idx : array: N-dimensional array of indices from a nearest neighbors search of the point cloud in ‘arr’, where the rows (m) represent the points in ‘arr’ and the columns represent the indices of the nearest neighbors from ‘arr’.

Returns:

features : array: N-dimensional array (m x 6) of the calculated geometric descriptors, where the rows (m) represent the points from ‘arr’ and the columns represent the features.

tlseparation.classification.point_features.svd_evals(arr)[source]¶

Calculates eigenvalues of an array using SVD.

Parameters:

arr : array: n x m numpy.ndarray where n is the number of samples and m is the number of dimensions.

Returns:

evals : array: 1 x m numpy.ndarray containing the calculated eigenvalues in descending order.
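
The link between SVD and covariance eigenvalues can be sketched as follows: for mean-centered data, the squared singular values divided by (n - 1) equal the eigenvalues of the sample covariance matrix. Whether the package centers or normalizes in the same way is an assumption here, and `svd_eigenvalues` is a hypothetical name.

```python
import numpy as np


def svd_eigenvalues(arr):
    """Covariance eigenvalues of an (n x m) array via SVD, in descending
    order. Centering and the (n - 1) normalization are assumptions."""
    arr = np.asarray(arr, dtype=float)
    centered = arr - arr.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)   # descending order
    return s ** 2 / (arr.shape[0] - 1)


rng = np.random.default_rng(1)
# Synthetic points stretched differently along each axis.
arr = rng.normal(size=(100, 3)) * [3.0, 1.0, 0.2]
evals = svd_eigenvalues(arr)

# Cross-check against an eigendecomposition of the covariance matrix.
reference = np.sort(np.linalg.eigvalsh(np.cov(arr, rowvar=False)))[::-1]
```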

tlseparation.classification.point_features.vectorized_app(arr_stack)[source]¶

Function to calculate the covariance of a stack of arrays. This function uses Einstein summation to make the covariance calculation more efficient. Based on a reply from the user Divakar [3] at Stack Overflow.

Parameters:

arr_stack : array: N-dimensional array (l x m x n) containing a stack of data, where the rows (m) represent the point coordinates, the columns (n) represent the axis coordinates and the layers (l) represent the stacks of points.

Returns:

cov : array: N-dimensional array (l x n x n) of covariance values calculated from ‘arr_stack’. Each layer (l) contains an (n x n) covariance matrix calculated from the corresponding layer (l) in ‘arr_stack’.

References

[3]	Divakar, 2016. http://stackoverflow.com/questions/35756952/quickly-compute-eigenvectors-for-each-element-of-an-array-in-python.
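
The idea of computing many covariance matrices at once with Einstein summation can be sketched in a few lines. This is a generic NumPy version, not a copy of the package's code: the name `stacked_covariance` and the (m - 1) normalization (chosen to match numpy.cov) are assumptions.

```python
import numpy as np


def stacked_covariance(arr_stack):
    """Covariance matrix of every (m x n) layer in an (l x m x n) stack.
    Uses the (m - 1) normalization, as numpy.cov does by default."""
    arr_stack = np.asarray(arr_stack, dtype=float)
    m = arr_stack.shape[1]
    centered = arr_stack - arr_stack.mean(axis=1, keepdims=True)
    # Sum over the points axis (m) for every pair of coordinate axes (i, j).
    return np.einsum('lmi,lmj->lij', centered, centered) / (m - 1)


rng = np.random.default_rng(2)
arr_stack = rng.normal(size=(4, 20, 3))   # 4 neighborhoods of 20 points
cov = stacked_covariance(arr_stack)
```

A single einsum call replaces a Python loop over layers, which matters when there is one layer per point in a large cloud.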

tlseparation.classification.wlseparation module¶

tlseparation.classification.wlseparation.fill_class(arr1, arr2, noclass, k)[source]¶

Assigns noclass entries to either arr1 or arr2, depending on neighborhood majority analysis.

Parameters:

arr1 : array: Point coordinates for entries of the first class.
arr2 : array: Point coordinates for entries of the second class.
noclass : array: Point coordinates for noclass entries.
k : int: Number of neighbors to use in the neighborhood majority analysis.

Returns:

arr1 : array: Point coordinates for entries of the first class.
arr2 : array: Point coordinates for entries of the second class.
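
A neighborhood majority assignment can be sketched with a brute-force nearest neighbor search. The version below is an illustration under stated assumptions (Euclidean distance, ties assigned to the second class, small inputs only); `fill_by_majority` is a hypothetical name and not the package function.

```python
import numpy as np


def fill_by_majority(arr1, arr2, noclass, k):
    """Append each noclass point to the class that holds the majority of
    its k nearest labeled neighbors. Ties go to the second class."""
    arr1 = np.asarray(arr1, dtype=float)
    arr2 = np.asarray(arr2, dtype=float)
    noclass = np.asarray(noclass, dtype=float)
    labeled = np.vstack([arr1, arr2])
    is_first = np.arange(labeled.shape[0]) < arr1.shape[0]
    # Brute-force (noclass x labeled) distances; fine for small inputs only.
    dist = np.linalg.norm(noclass[:, None, :] - labeled[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    to_first = is_first[nearest].sum(axis=1) > k / 2
    new1 = np.vstack([arr1, noclass[to_first]])
    new2 = np.vstack([arr2, noclass[~to_first]])
    return new1, new2


arr1 = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0]])
arr2 = np.array([[5.0, 0.0, 0.0], [5.1, 0.0, 0.0], [5.0, 0.1, 0.0]])
noclass = np.array([[0.05, 0.05, 0.0], [4.9, 0.0, 0.0]])
new1, new2 = fill_by_majority(arr1, arr2, noclass, k=3)
```

For real point clouds a spatial index (for example a k-d tree) would replace the dense distance matrix, which grows with the product of the two point counts.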

tlseparation.classification.wlseparation.wlseparate_abs(arr, knn, knn_downsample=1, n_classes=3)[source]¶

Classifies a point cloud (arr) into three main classes, wood, leaf and noclass.

The final class selection is based on the absolute value of the last geometric feature (see point_features module). Points are classified as wood or leaf only if their classification probability is higher than prob_threshold. Otherwise, points are assigned to noclass.

Class selection will mask points with feature value larger than a given threshold as wood and the remaining points as leaf.

Parameters:

arr : array: Three-dimensional point cloud of a single tree to perform the wood-leaf separation. This should be an n-dimensional array (m x n) containing a set of coordinates (n) over a set of points (m).
knn : int: Number of nearest neighbors to search to constitute the local subset of points around each point in ‘arr’.
knn_downsample : float: Downsample factor (0, 1) for the knn parameter. If less than 1, a sample of size (knn * knn_downsample) will be selected from the nearest neighbors indices. This option aims to maintain the spatial representation of the local subsets of points while reducing overhead in memory and processing time.
n_classes : int: Number of classes to use in the Gaussian Mixture Classification.

Returns:

class_indices : dict: Dictionary containing indices for wood and leaf classes.
class_probability : dict: Dictionary containing probabilities for wood and leaf classes.

tlseparation.classification.wlseparation.wlseparate_ref_voting(arr, knn_lst, class_file, n_classes=3)[source]¶

Classifies a point cloud (arr) into two main classes, wood and leaf. Although this function does not output a noclass category, it still filters out results based on classification confidence in the voting process (if lower than prob_threshold, the vote is not used for the current point and knn value).

The final class selection is based on a voting scheme applied to an approach similar to that of wlseparate_ref. In this case, the function iterates over a series of knn values and applies the reference distance criteria to select wood and leaf classes.

The class result for each knn value is accumulated in a list and, at the end, a vote is applied. For each point, if the number of times it was classified as wood is larger than the threshold, the final class is set to wood. Otherwise it is set to leaf.

Class selection will mask points according to their class mean distance to reference classes. The closest reference class is assigned to each intermediate class.

Parameters:

arr : array: Three-dimensional point cloud of a single tree to perform the wood-leaf separation. This should be an n-dimensional array (m x n) containing a set of coordinates (n) over a set of points (m).
knn_lst : list: List of knn values to use in the search to constitute local subsets of points around each point in ‘arr’. It can be a single knn value, as long as it has list data type.
class_file : pandas dataframe or str: Dataframe or path to reference classes file.
n_classes : int: Number of classes to use in the Gaussian Mixture Classification.

Returns:

class_dict : dict: Dictionary containing indices for all classes in class_ref. Classes are labeled according to class names in class_file.
count_dict : dict: Dictionary containing vote counts for all classes in class_ref. Classes are labeled according to class names in class_file.
prob_dict : dict: Dictionary containing probabilities for all classes in class_ref. Classes are labeled according to class names in class_file.
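
The voting step described above reduces to counting, per point, how many knn runs labeled it as wood. The sketch below assumes the per-run results are already available as a boolean array; the helper name `vote_wood`, the `vote_threshold` argument and the knn values in the comments are hypothetical.

```python
import numpy as np


def vote_wood(wood_votes, vote_threshold):
    """Combine per-knn wood/leaf results into a final wood mask.

    `wood_votes` is a (number of knn values x number of points) boolean
    array; a point is wood if its wood count exceeds `vote_threshold`.
    """
    wood_votes = np.asarray(wood_votes, dtype=bool)
    counts = wood_votes.sum(axis=0)
    return counts > vote_threshold, counts


wood_votes = np.array([
    [True,  False, True,  False],   # result for a first knn value
    [True,  False, False, False],   # result for a second knn value
    [True,  True,  True,  False],   # result for a third knn value
])
is_wood, counts = vote_wood(wood_votes, vote_threshold=1)
```

Using several knn values makes the final label less sensitive to any single neighborhood size, at the cost of repeating the classification once per value.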
