% +--- CHAPTER --- 
\begin{comment}
    ./texfix.py --fpaths chapter6-conclusion.tex --outline --asmarkdown --numlines=99 -w
    fixtex --fpaths chapter6-conclusion.tex --outline --asmarkdown --numlines=999 --shortcite
\end{comment}

\chapter{CONCLUSION}\label{chap:conclusion} 

    In this \thesis{} we have addressed the problem of identifying individual animals from images.
    We have demonstrated that our approach is effective for identifying plains zebras, Grévy's zebras, Masai
      giraffes, and humpback whales.
    Our approach consists of three main components:
    (1) the ranking algorithm from \cref{chap:ranking} that uses a bounding box annotation around an animal to
      search a labeled database of annotations for likely matches,
    (2) the classification algorithm from \cref{chap:pairclf} that probabilistically verifies if a pair of
      annotations is positive, negative, or incomparable, and
    (3) the graph framework from \cref{chap:graphid} that harnesses the previous algorithms in a principled way
      to dynamically determine the identity of all animals in a dataset.
    Each of these algorithms was designed to build on the previous one(s), improving the overall accuracy and
      efficiency of the counting process.
    In \cref{sec:graphexpt} we demonstrated that this was indeed the case.

    By combining these algorithms we have made several meaningful contributions to the problem of animal
      identification.
    In \cref{sec:introgzc} we discussed the Great Zebra Count (\GZC{}), where the ranking algorithm was used in
      combination with the effort of citizen scientists to provide an estimate of the number of plains zebras and
      Masai giraffes in Nairobi National Park.
    In \cref{sec:rankexpt} we investigated several parameters and factors that can impact the performance of the
      ranking algorithm.
    We discovered that having multiple photos of each individual significantly improves the accuracy of the
      ranking algorithm and we designed a novel name scoring mechanism with this in mind.
    In \cref{sec:pairexpt} we demonstrated that a classification algorithm can be used to improve the separation
      of positive results from negative and incomparable results in a ranked list.
    In \cref{sec:graphexpt} we simulated the \GZC{} and demonstrated that our improvements to the ranking
      algorithm --- made by the classification and graph algorithm --- enable us to perform identification using
      less than $25\percent$ of the number of manual reviews required by the original event.

    \section{DISCUSSION}\label{sec:discuss}

    The research that resulted in this \thesis{} began in $2010$ and was completed in $2017$.
    During that time, many significant developments were made in the fields of computer vision and machine
      learning, most notably the explosion of deep learning~\cite{lecun_deep_2015}.
    While some steps in our approach (\eg{} the foregroundness weights) do make use of deep convolutional neural
      networks (DCNNs), most do not.
    In some sense this is an advantage because the algorithms can be applied to different species without any
      need for pre-training, but this also means they do not obtain the level of accuracy shown to be achievable by
      these networks.
    Yet, the contributions of this \thesis{} are still relevant and complementary to DCNNs.
    This is trivially true in the case of the ranking and classification algorithms, in part due to the
      aforementioned reasons.
    However, the contribution of the graph algorithm is relevant, even in the era of deep learning.
    
    %\section{Discussion of the graph algorithm}
    The graph identification algorithm models the abstract constraints of the identification problem and provides
      a framework that can efficiently harness the power of any ranking or verification algorithm, whether it be
      deep or shallow.
    The framework dynamically manages the relationships between annotations.
    In most cases this means deciding if two annotations are the same (positive) or different (negative), but
      this also means handling cases like when the annotations are incomparable or when there is some other
      interesting connection between two annotations like scenery matches and photobombs.
    As new relationships are added, errors are discovered and corrected, and the identifications are updated.

    The framework also provides a means of prioritizing which edges need to be reviewed based on
    (1) the underlying computer vision algorithms,
    (2) the edge-augmentation needed to ensure minimum redundancy, and
    (3) the minimum cut needed to correct an error and split an inconsistent individual.
    Edge prioritization works in conjunction with a convergence criteria that determines when identification has
      been completed.
    A signal is emitted whenever manual interaction is needed, and the algorithm continues after the user returns
      with a response.
    The only time a user interacts with the algorithm after it begins is to label an edge as positive, negative,
      or incomparable.
    All other decisions are made internally.
    The algorithm stops once there is a high probability that the vast majority of identifications have been made
      correctly and consistently.
    This means that the graph algorithm requires little expertise to use and can be thought of as an
      ``identification wizard'' that simply guides the user through a set of simple questions.
    This design allows the graph algorithm to be run on a web server, where requests for manual interactions can
      be sent to remote users and quickly done in a web browser.

    %\section{Discussion of the ranking and verification algorithm}
    %pass
    \section{CONTRIBUTIONS}\label{sec:contributions}

    A summary of the contributions made in this \thesis{} are as follows:

    \begin{enumln}
    \item {The ranking algorithm}:
        \begin{enumln}
        \item
        We have adapted LNBNN~\cite{mccann_local_2012} to the problem of individual animal identification.
        We have performed experiments that demonstrate the effect of several parameters at multiple database
          sizes.
        We have shown that tripling the number of annotations in a database can reduce the ranking accuracy at
          rank $1$ by $2\percent$.

        \item
        We have evaluated the effect of various levels of feature invariance in our experiments.
        We have introduced a heuristic that augments the orientation of query keypoints to account for pose
          variations.
        For plains zebras, this can improve the ranking accuracy at rank $1$ by $7\percent$.

        \item
        We have accounted for the influence of background features using a learned a foregroundness measure to
          weight the LNBNN scores of feature correspondences.
        We have empirically shown that this procedure can increase the ranking accuracy at rank $1$ by
          $5\percent$.


        \item We have demonstrated the impact of image redundancy and the importance of collecting more than one
          annotation in each encounter.
        Our experiments show that multiple exemplars per name can significantly increase the ranking accuracy at
          rank $1$ by $20\percent$.

        \item We have developed a \name{} scoring mechanism to take advantage of information in database names
          with multiple exemplars.
        We have shown that this can increase the ranking accuracy at rank $1$ by $1\percent$ when there are
          multiple exemplars per name.
        \end{enumln}

    \item {The pairwise classification algorithm}:
        \begin{enumln}

        \item We have developed a novel feature vector that represents local and global matching information
          between two annotations.
        Our experiments have shown that both the local and global feature dimensions are important for predicting
          if two annotations match.

        \item We have used this feature vector to learn a random forest that can predict the probability that two
          annotations are either positive, negative, or incomparable.
        We have shown that this learned pairwise classifier is a strong predictor of match-state by measuring an
          MCC of $0.83$ for plains zebras and $0.91$ for Grévy's zebras.

        \item We have compared the learned probabilities to LNBNN scores and demonstrated that re-ranking using
          the positive probabilities can improve the ranking accuracy at rank $1$ by $9\percent$ for plains zebras
          and $2\percent$ for Grévy's zebras.
        Additionally, the probabilities significantly improve the separation of positive and non-positive
          annotation pairs.
        For both species, an ROC AUC of less than $0.9$ is improved to an AUC greater than $0.97$.
          
        \end{enumln}

    \item {The graph identification algorithm}:
        \begin{enumln}

        \item 
            We have demonstrated that combining the graph algorithm with existing ranking and verification
              algorithms improves the accuracy and efficiency of semi-automatic animal identification.
            We have designed the framework to be agnostic to the specific ranking and verification algorithms so
              future DCNN-based algorithms can be swapped in.

        \item We have proposed a measure of redundancy based on edge-connectivity used to increase accuracy and
          reduce the number of reviews needed.

        \item We have developed an algorithm for fixing errors whenever inconsistencies in the graph are been
          discovered.

        \item We have developed a probabilistic termination criteria that determines when to stop identification.
        \end{enumln}
    \end{enumln}

    \section{FUTURE WORK}\label{sec:futurework}

    We have shown that our ranking and match-state classification algorithms are both accurate and work well for
      identifying animals.
    However, the clearest direction for future research is to replace these algorithms with ones based on DCNNs.
    To replace the ranking algorithm, we believe that the approach in~\cite{arandjelovic_netvlad_2016} is a good
      starting point.
    We had briefly investigated replacing the pairwise classifier using the techniques
      in~\cite{taigman_deepface_2014}, but the results were poor because we did not have as much training data or
      an alignment procedure.
    Research into the geometric matching technique described in~\cite{rocco_convolutional_2017} may help address
      both of these issues.

    There are also improvements that can be made to the graph algorithm.
    First it would be useful to parallelize the algorithm so reviews could be distributed across multiple users.
    This can be obtained by popping multiple edges from the queue at a time, but this could add extraneous
      redundancy if one edge in the popped set would have been filtered by another.
    Second, the current prioritization of edges is based completely on the output of the pairwise classifier.
    In the best case, the ordering would first construct each PCC as a chain, and then only $1$ redundant review
      would be needed.
    In the worst case, this order would connect one annotation of an individual to all others causing a star
      shaped PCC.
    Then to make the PCC $2$-positive-redundant, it would take $n - 2$ reviews, where $n$ is the number of
      annotations in the PCC.
    Determining the best order in which to review edges depending on the specified level of redundancy is an
      interesting question, which is perhaps made more challenging if considered in a distributed setting.

% L___ CHAPTER ___