k-Nearest Neighbors Classification (k-NN)
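Operation | Computational methods | Programming Interface
Training | Brute-force, k-d tree | train(...), train_input, train_result
Inference | Brute-force, k-d tree | infer(...), infer_input, infer_result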
Mathematical formulation#
Training#
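Let $X = \{ x_1, \ldots, x_n \}$ be the training set of $p$-dimensional feature vectors, and let $Y = \{ y_1, \ldots, y_n \}$ be the set of class labels, where $y_i \in \{ 0, \ldots, c-1 \}$, $1 \leq i \leq n$. Given $X$, $Y$, and the number of nearest neighbors $k$, the problem is to build a model that allows distance computation between the feature vectors in the training and inference sets at the inference stage.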
Training method: brute-force#
The training operation produces the model that stores all the feature vectors from the initial training set $X$.
Training method: k-d tree#
The training operation builds a $k$-$d$ tree that partitions the training set $X$.
Inference#
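Let $X' = \{ x'_1, \ldots, x'_m \}$ be the inference set of $p$-dimensional feature vectors. Given $X'$, the model produced at the training stage, and the number of nearest neighbors $k$, the problem is to predict the label $y'_j$ for each $x'_j$, $1 \leq j \leq m$, by performing the following steps:
1. Identify the set $N(x'_j) \subseteq X$ of the $k$ feature vectors in the training set $X$ that are nearest to $x'_j$ with respect to the Euclidean distance.
2. Estimate the conditional probability for the $l$-th class as the fraction of vectors in $N(x'_j)$ whose labels are equal to $l$:
$$P_{jl} = \frac{1}{|N(x'_j)|} \Big| \big\{ x_r \in N(x'_j) : y_r = l \big\} \Big|, \quad 1 \leq j \leq m, \; 0 \leq l < c, \tag{1}$$
where $y_r$ is the label of the training vector $x_r$ and $c$ is the number of classes.
3. Predict the class that has the highest probability for the feature vector $x'_j$:
$$y'_j = \arg\max_{0 \leq l < c} P_{jl}, \quad 1 \leq j \leq m. \tag{2}$$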
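As a worked instance of equations (1) and (2), assume (purely for illustration, these values do not come from the specification) that $k = 5$, that there are $c = 3$ classes, and that the five nearest neighbors of some inference vector carry the labels 2, 2, 0, 2, 1. The estimated class probabilities and the predicted label are then:

```latex
\begin{aligned}
P_{j0} &= \tfrac{1}{5}\,\bigl|\{\text{neighbors with label } 0\}\bigr| = \tfrac{1}{5}, \\
P_{j1} &= \tfrac{1}{5}\,\bigl|\{\text{neighbors with label } 1\}\bigr| = \tfrac{1}{5}, \\
P_{j2} &= \tfrac{1}{5}\,\bigl|\{\text{neighbors with label } 2\}\bigr| = \tfrac{3}{5}, \\
y'_j &= \arg\max_{0 \leq l < 3} P_{jl} = 2.
\end{aligned}
```

The three probabilities sum to 1 because every neighbor contributes to exactly one class.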
Inference method: brute-force#
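The brute-force inference method determines the set $N(x'_j)$ of the $k$ training vectors nearest to each inference vector $x'_j$ by iterating over all pairs $(x'_j, x_i)$, $1 \leq i \leq n$, $1 \leq j \leq m$, in an implementation-defined order. The final prediction is computed according to equations (1) and (2).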
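The brute-force procedure can be sketched in standard C++ without any oneDAL types. This is a minimal illustration under stated assumptions, not the library's implementation: the names `feature_vector`, `squared_distance`, and `knn_predict` are hypothetical, ties are broken in favor of the lowest label, and the caller is assumed to pass `1 <= k <= train_x.size()` with labels in `[0, class_count)`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using feature_vector = std::vector<double>;

// Squared Euclidean distance. It orders neighbors the same way as the
// Euclidean distance itself, so the square root can be skipped.
double squared_distance(const feature_vector& a, const feature_vector& b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper: predicts the label of `query` from the training set.
// Assumes 1 <= k <= train_x.size() and labels in [0, class_count).
std::int64_t knn_predict(const std::vector<feature_vector>& train_x,
                         const std::vector<std::int64_t>& train_y,
                         const feature_vector& query,
                         std::size_t k,
                         std::int64_t class_count) {
    // Pair the distance to every training vector with that vector's index.
    std::vector<std::pair<double, std::size_t>> dist;
    dist.reserve(train_x.size());
    for (std::size_t i = 0; i < train_x.size(); ++i) {
        dist.emplace_back(squared_distance(query, train_x[i]), i);
    }

    // Move the k smallest distances to the front: the neighbor set.
    std::partial_sort(dist.begin(),
                      dist.begin() + static_cast<std::ptrdiff_t>(k),
                      dist.end());

    // Equation (1): count labels among the k neighbors. Dividing by k does
    // not change the arg max, so raw counts are compared.
    std::vector<std::size_t> votes(static_cast<std::size_t>(class_count), 0);
    for (std::size_t r = 0; r < k; ++r) {
        ++votes[static_cast<std::size_t>(train_y[dist[r].second])];
    }

    // Equation (2): the class with the most votes (lowest label on ties).
    return std::max_element(votes.begin(), votes.end()) - votes.begin();
}
```

Each query costs one distance computation per training vector, which is why this method needs no preprocessing beyond storing the training set.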
Inference method: k-d tree#
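The k-d tree inference method traverses the $k$-$d$ tree to find the feature vectors associated with a leaf node that are closest to $x'_j$, $1 \leq j \leq m$. The set of currently known nearest neighbors is updated progressively during the traversal, and the search skips nodes whose region of the feature space cannot be closer to $x'_j$ than the most distant vector in that set. Once the traversal finishes, this set equals $N(x'_j)$. The final prediction is computed according to equations (1) and (2).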
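The following sketch shows one common way to build a k-d tree and run a pruned k-nearest-neighbor search over it in standard C++. It is an illustration only: the node layout, the median split by cycling axes, and the names `kd_node`, `build_tree`, and `k_nearest` are assumptions made for this example, and the specification does not prescribe them. The caller is assumed to pass `k >= 1`.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

using point = std::vector<double>;

struct kd_node {
    std::size_t index = 0;  // index of the splitting point in the training set
    std::size_t axis = 0;   // coordinate used to split at this node
    std::unique_ptr<kd_node> left, right;
};

// Recursively builds a subtree over ids[lo, hi), splitting at the median.
std::unique_ptr<kd_node> build(const std::vector<point>& pts, std::vector<std::size_t>& ids,
                               std::size_t lo, std::size_t hi, std::size_t depth) {
    if (lo >= hi) return nullptr;
    const std::size_t axis = depth % pts[0].size();
    const std::size_t mid = lo + (hi - lo) / 2;
    const auto first = ids.begin();
    std::nth_element(first + lo, first + mid, first + hi,
                     [&](std::size_t a, std::size_t b) { return pts[a][axis] < pts[b][axis]; });
    auto node = std::make_unique<kd_node>();
    node->index = ids[mid];
    node->axis = axis;
    node->left = build(pts, ids, lo, mid, depth + 1);
    node->right = build(pts, ids, mid + 1, hi, depth + 1);
    return node;
}

std::unique_ptr<kd_node> build_tree(const std::vector<point>& pts) {
    std::vector<std::size_t> ids(pts.size());
    std::iota(ids.begin(), ids.end(), std::size_t{0});
    return build(pts, ids, 0, ids.size(), 0);
}

double sq_dist(const point& a, const point& b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) sum += (a[d] - b[d]) * (a[d] - b[d]);
    return sum;
}

// Max-heap keyed on squared distance: top() is the farthest current neighbor.
using neighbor_heap = std::priority_queue<std::pair<double, std::size_t>>;

void search(const kd_node* node, const std::vector<point>& pts,
            const point& q, std::size_t k, neighbor_heap& best) {
    if (node == nullptr) return;
    const double d = sq_dist(q, pts[node->index]);
    if (best.size() < k) {
        best.emplace(d, node->index);
    } else if (d < best.top().first) {
        best.pop();
        best.emplace(d, node->index);
    }
    const double delta = q[node->axis] - pts[node->index][node->axis];
    const kd_node* near_side = delta < 0 ? node->left.get() : node->right.get();
    const kd_node* far_side = delta < 0 ? node->right.get() : node->left.get();
    search(near_side, pts, q, k, best);
    // Visit the far side only if it can still contain a closer neighbor.
    if (best.size() < k || delta * delta < best.top().first) {
        search(far_side, pts, q, k, best);
    }
}

// Returns the indices of the k nearest training points, nearest first.
std::vector<std::size_t> k_nearest(const kd_node* root, const std::vector<point>& pts,
                                   const point& q, std::size_t k) {
    neighbor_heap best;
    search(root, pts, q, k, best);
    std::vector<std::size_t> result;
    while (!best.empty()) {
        result.push_back(best.top().second);
        best.pop();
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

In this sketch the tree is built once with `build_tree` and reused for every query, which mirrors the split between the training and inference stages; the labels of the returned indices would then be fed into the voting step of equations (1) and (2).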
Usage example#
Training#
knn::model<> run_training(const table& data,
                          const table& labels) {
    // Number of distinct classes and number of neighbors used for voting.
    const std::int64_t class_count = 10;
    const std::int64_t neighbor_count = 5;
    const auto knn_desc = knn::descriptor<float>{class_count, neighbor_count};
    // Run training and extract the model from the result object.
    const auto result = train(knn_desc, data, labels);
    return result.get_model();
}
Inference#
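table run_inference(const knn::model<>& model,
                    const table& new_data) {
    // Same parameter values as in the training example.
    const std::int64_t class_count = 10;
    const std::int64_t neighbor_count = 5;
    const auto knn_desc = knn::descriptor<float>{class_count, neighbor_count};
    const auto result = infer(knn_desc, model, new_data);
    // print_table is a helper assumed to be defined elsewhere.
    print_table("labels", result.get_labels());
    // Return the predicted labels, as the declared return type requires.
    return result.get_labels();
}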
Programming Interface#
All types and functions in this section shall be declared in the oneapi::dal::knn namespace and be available via inclusion of the oneapi/dal/algo/knn.hpp header file.
Descriptor#
template <typename Float = float,
          typename Method = method::by_default,
          typename Task = task::by_default>
class descriptor {
public:
    explicit descriptor(std::int64_t class_count,
                        std::int64_t neighbor_count);
    std::int64_t get_class_count() const;
    descriptor& set_class_count(std::int64_t);
    std::int64_t get_neighbor_count() const;
    descriptor& set_neighbor_count(std::int64_t);
};
template<typename Float = float, typename Method = method::by_default, typename Task = task::by_default>
class descriptor#
- Template Parameters
Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
Method – Tag type that specifies the implementation of the algorithm. Can be method::bruteforce or method::kd_tree.
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
Constructors
descriptor(std::int64_t class_count, std::int64_t neighbor_count)#
Creates a new instance of the class with the given class_count and neighbor_count property values.
Properties
std::int64_t class_count#
The number of classes $c$.
- Getter & Setter
std::int64_t get_class_count() const
descriptor & set_class_count(std::int64_t)
- Invariants
- class_count > 1
std::int64_t neighbor_count#
The number of neighbors $k$.
- Getter & Setter
std::int64_t get_neighbor_count() const
descriptor & set_neighbor_count(std::int64_t)
- Invariants
- neighbor_count > 0
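To make the getter, chained-setter, and invariant conventions above concrete, here is a minimal stand-alone sketch. It is not the oneDAL class: the `sketch` namespace is hypothetical, the template parameters are omitted, and throwing `std::domain_error` on a violated invariant is an assumption of this example, since the text above states the invariants but not how a violation is reported.

```cpp
#include <cstdint>
#include <stdexcept>

namespace sketch {

// Hypothetical stand-in for a descriptor with two validated properties.
class descriptor {
public:
    explicit descriptor(std::int64_t class_count, std::int64_t neighbor_count) {
        set_class_count(class_count);
        set_neighbor_count(neighbor_count);
    }

    std::int64_t get_class_count() const { return class_count_; }

    // Returns *this so that setter calls can be chained.
    descriptor& set_class_count(std::int64_t value) {
        if (value <= 1) {
            throw std::domain_error("class_count must be > 1");
        }
        class_count_ = value;
        return *this;
    }

    std::int64_t get_neighbor_count() const { return neighbor_count_; }

    descriptor& set_neighbor_count(std::int64_t value) {
        if (value <= 0) {
            throw std::domain_error("neighbor_count must be > 0");
        }
        neighbor_count_ = value;
        return *this;
    }

private:
    // Defaults satisfy both invariants until the constructor overwrites them.
    std::int64_t class_count_ = 2;
    std::int64_t neighbor_count_ = 1;
};

} // namespace sketch
```

With this shape, a call such as `sketch::descriptor{10, 5}.set_neighbor_count(7)` reads as a single expression, and a rejected value leaves the previously stored one untouched.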
Model#
template <typename Task = task::by_default>
class model {
public:
    model();
};
template<typename Task = task::by_default>
class model#
- Template Parameters
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
Constructors
model()#
Creates a new instance of the class with the default property values.
Training train(...)#
Input#
template <typename Task = task::by_default>
class train_input {
public:
    train_input(const table& data = table{},
                const table& labels = table{});
    const table& get_data() const;
    train_input& set_data(const table&);
    const table& get_labels() const;
    train_input& set_labels(const table&);
};
template<typename Task = task::by_default>
class train_input#
- Template Parameters
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
Constructors
train_input(const table &data = table{}, const table &labels = table{})#
Creates a new instance of the class with the given data and labels property values.
Properties
const table &data#
The training set.
- Getter & Setter
const table & get_data() const
train_input & set_data(const table &)
const table &labels#
The class labels for the training set.
- Getter & Setter
const table & get_labels() const
train_input & set_labels(const table &)
Result#
template <typename Task = task::by_default>
class train_result {
public:
    train_result();
    const model<Task>& get_model() const;
};
template<typename Task = task::by_default>
class train_result#
- Template Parameters
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
Constructors
train_result()#
Creates a new instance of the class with the default property values.
Public Methods
const model<Task> & get_model() const
Returns the trained $k$-NN model.
Operation#
template <typename Float, typename Method, typename Task>
train_result<Task> train(const descriptor<Float, Method, Task>& desc,
                         const train_input<Task>& input);
template<typename Float, typename Method, typename Task>
train_result<Task> train(const descriptor<Float, Method, Task> &desc, const train_input<Task> &input)#
Runs the training operation for the $k$-NN classifier. For more details, see oneapi::dal::train.
- Template Parameters
Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
Method – Tag type that specifies the implementation of the algorithm. Can be method::bruteforce or method::kd_tree.
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
- Parameters
desc – Descriptor of the algorithm.
input – Input data for the training operation.
- Preconditions
Inference infer(...)#
Input#
template <typename Task = task::by_default>
class infer_input {
public:
    infer_input(const model<Task>& m = model<Task>{},
                const table& data = table{});
    const model<Task>& get_model() const;
    infer_input& set_model(const model<Task>&);
    const table& get_data() const;
    infer_input& set_data(const table&);
};
template<typename Task = task::by_default>
class infer_input#
- Template Parameters
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
Constructors
infer_input(const model<Task> &m = model<Task>{}, const table &data = table{})#
Creates a new instance of the class with the given model and data property values.
Properties
const model<Task> &model#
The trained $k$-NN model.
- Getter & Setter
const model<Task> & get_model() const
infer_input & set_model(const model<Task> &)
const table &data#
The dataset for inference.
- Getter & Setter
const table & get_data() const
infer_input & set_data(const table &)
Result#
template <typename Task = task::by_default>
class infer_result {
public:
    infer_result();
    const table& get_labels() const;
};
template<typename Task = task::by_default>
class infer_result#
- Template Parameters
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
Constructors
infer_result()#
Creates a new instance of the class with the default property values.
Public Methods
const table & get_labels() const
Returns the predicted labels.
Operation#
template <typename Float, typename Method, typename Task>
infer_result<Task> infer(const descriptor<Float, Method, Task>& desc,
                         const infer_input<Task>& input);
template<typename Float, typename Method, typename Task>
infer_result<Task> infer(const descriptor<Float, Method, Task> &desc, const infer_input<Task> &input)#
Runs the inference operation for the $k$-NN classifier. For more details, see oneapi::dal::infer.
- Template Parameters
Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
Method – Tag type that specifies the implementation of the algorithm. Can be method::bruteforce or method::kd_tree.
Task – Tag type that specifies the type of problem to solve. Can be task::classification.
- Parameters
desc – Descriptor of the algorithm.
input – Input data for the inference operation.
- Preconditions
- input.data.has_data == true
- Postconditions