% +--- CHAPTER --- \begin{comment} ./texfix.py --fpaths chapter6-conclusion.tex --outline --asmarkdown --numlines=99 -w fixtex --fpaths chapter6-conclusion.tex --outline --asmarkdown --numlines=999 --shortcite \end{comment} \chapter{CONCLUSION}\label{chap:conclusion} In this \thesis{} we have addressed the problem of identifying individual animals from images. We have demonstrated that our approach is effective for identifying plains zebras, Grévy's zebras, Masai giraffes, and humpback whales. Our approach consists of three main components: (1) the ranking algorithm from \cref{chap:ranking} that uses a bounding box annotation around an animal to search a labeled database of annotations for likely matches, (2) the classification algorithm from \cref{chap:pairclf} that probabilistically verifies if a pair of annotations is positive, negative, or incomparable, and (3) the graph framework from \cref{chap:graphid} that harnesses the previous algorithms in a principled way to dynamically determine the identity of all animals in a dataset. Each of these algorithms was designed to build on the previous one(s), improving the overall accuracy and efficiency of the counting process. In \cref{sec:graphexpt} we demonstrated that this was indeed the case. By combining these algorithms we have made several meaningful contributions to the problem of animal identification. In \cref{sec:introgzc} we discussed the Great Zebra Count (\GZC{}), where the ranking algorithm was used in combination with the effort of citizen scientists to provide an estimate of the number of plains zebras and Masai giraffes in Nairobi National Park. In \cref{sec:rankexpt} we investigated several parameters and factors that can impact the performance of the ranking algorithm. We discovered that having multiple photos of each individual significantly improves the accuracy of the ranking algorithm and we designed a novel name scoring mechanism with this in mind. In \cref{sec:pairexpt} we demonstrated that a classification algorithm can be used to improve the separation of positive results from negative and incomparable results in a ranked list. In \cref{sec:graphexpt} we simulated the \GZC{} and demonstrated that our improvements to the ranking algorithm --- made by the classification and graph algorithm --- enable us to perform identification using less than $25\percent$ of the number of manual reviews required by the original event. \section{DISCUSSION}\label{sec:discuss} The research that resulted in this \thesis{} began in $2010$ and was completed in $2017$. During that time, many significant developments were made in the fields of computer vision and machine learning, most notably the explosion of deep learning~\cite{lecun_deep_2015}. While some steps in our approach (\eg{} the foregroundness weights) do make use of deep convolutional neural networks (DCNNs), most do not. In some sense this is an advantage because the algorithms can be applied to different species without any need for pre-training, but this also means they do not obtain the level of accuracy shown to be achievable by these networks. Yet, the contributions of this \thesis{} are still relevant and complementary to DCNNs. This is trivially true in the case of the ranking and classification algorithms, in part due to the aforementioned reasons. However, the contribution of the graph algorithm is relevant, even in the era of deep learning. %\section{Discussion of the graph algorithm} The graph identification algorithm models the abstract constraints of the identification problem and provides a framework that can efficiently harness the power of any ranking or verification algorithm, whether it be deep or shallow. The framework dynamically manages the relationships between annotations. In most cases this means deciding if two annotations are the same (positive) or different (negative), but this also means handling cases like when the annotations are incomparable or when there is some other interesting connection between two annotations like scenery matches and photobombs. As new relationships are added, errors are discovered and corrected, and the identifications are updated. The framework also provides a means of prioritizing which edges need to be reviewed based on (1) the underlying computer vision algorithms, (2) the edge-augmentation needed to ensure minimum redundancy, and (3) the minimum cut needed to correct an error and split an inconsistent individual. Edge prioritization works in conjunction with a convergence criteria that determines when identification has been completed. A signal is emitted whenever manual interaction is needed, and the algorithm continues after the user returns with a response. The only time a user interacts with the algorithm after it begins is to label an edge as positive, negative, or incomparable. All other decisions are made internally. The algorithm stops once there is a high probability that the vast majority of identifications have been made correctly and consistently. This means that the graph algorithm requires little expertise to use and can be thought of as an ``identification wizard'' that simply guides the user through a set of simple questions. This design allows the graph algorithm to be run on a web server, where requests for manual interactions can be sent to remote users and quickly done in a web browser. %\section{Discussion of the ranking and verification algorithm} %pass \section{CONTRIBUTIONS}\label{sec:contributions} A summary of the contributions made in this \thesis{} are as follows: \begin{enumln} \item {The ranking algorithm}: \begin{enumln} \item We have adapted LNBNN~\cite{mccann_local_2012} to the problem of individual animal identification. We have performed experiments that demonstrate the effect of several parameters at multiple database sizes. We have shown that tripling the number of annotations in a database can reduce the ranking accuracy at rank $1$ by $2\percent$. \item We have evaluated the effect of various levels of feature invariance in our experiments. We have introduced a heuristic that augments the orientation of query keypoints to account for pose variations. For plains zebras, this can improve the ranking accuracy at rank $1$ by $7\percent$. \item We have accounted for the influence of background features using a learned a foregroundness measure to weight the LNBNN scores of feature correspondences. We have empirically shown that this procedure can increase the ranking accuracy at rank $1$ by $5\percent$. \item We have demonstrated the impact of image redundancy and the importance of collecting more than one annotation in each encounter. Our experiments show that multiple exemplars per name can significantly increase the ranking accuracy at rank $1$ by $20\percent$. \item We have developed a \name{} scoring mechanism to take advantage of information in database names with multiple exemplars. We have shown that this can increase the ranking accuracy at rank $1$ by $1\percent$ when there are multiple exemplars per name. \end{enumln} \item {The pairwise classification algorithm}: \begin{enumln} \item We have developed a novel feature vector that represents local and global matching information between two annotations. Our experiments have shown that both the local and global feature dimensions are important for predicting if two annotations match. \item We have used this feature vector to learn a random forest that can predict the probability that two annotations are either positive, negative, or incomparable. We have shown that this learned pairwise classifier is a strong predictor of match-state by measuring an MCC of $0.83$ for plains zebras and $0.91$ for Grévy's zebras. \item We have compared the learned probabilities to LNBNN scores and demonstrated that re-ranking using the positive probabilities can improve the ranking accuracy at rank $1$ by $9\percent$ for plains zebras and $2\percent$ for Grévy's zebras. Additionally, the probabilities significantly improve the separation of positive and non-positive annotation pairs. For both species, an ROC AUC of less than $0.9$ is improved to an AUC greater than $0.97$. \end{enumln} \item {The graph identification algorithm}: \begin{enumln} \item We have demonstrated that combining the graph algorithm with existing ranking and verification algorithms improves the accuracy and efficiency of semi-automatic animal identification. We have designed the framework to be agnostic to the specific ranking and verification algorithms so future DCNN-based algorithms can be swapped in. \item We have proposed a measure of redundancy based on edge-connectivity used to increase accuracy and reduce the number of reviews needed. \item We have developed an algorithm for fixing errors whenever inconsistencies in the graph are been discovered. \item We have developed a probabilistic termination criteria that determines when to stop identification. \end{enumln} \end{enumln} \section{FUTURE WORK}\label{sec:futurework} We have shown that our ranking and match-state classification algorithms are both accurate and work well for identifying animals. However, the clearest direction for future research is to replace these algorithms with ones based on DCNNs. To replace the ranking algorithm, we believe that the approach in~\cite{arandjelovic_netvlad_2016} is a good starting point. We had briefly investigated replacing the pairwise classifier using the techniques in~\cite{taigman_deepface_2014}, but the results were poor because we did not have as much training data or an alignment procedure. Research into the geometric matching technique described in~\cite{rocco_convolutional_2017} may help address both of these issues. There are also improvements that can be made to the graph algorithm. First it would be useful to parallelize the algorithm so reviews could be distributed across multiple users. This can be obtained by popping multiple edges from the queue at a time, but this could add extraneous redundancy if one edge in the popped set would have been filtered by another. Second, the current prioritization of edges is based completely on the output of the pairwise classifier. In the best case, the ordering would first construct each PCC as a chain, and then only $1$ redundant review would be needed. In the worst case, this order would connect one annotation of an individual to all others causing a star shaped PCC. Then to make the PCC $2$-positive-redundant, it would take $n - 2$ reviews, where $n$ is the number of annotations in the PCC. Determining the best order in which to review edges depending on the specified level of redundancy is an interesting question, which is perhaps made more challenging if considered in a distributed setting. % L___ CHAPTER ___