The basics of scikit-query
This page explains how to install the library and use it in Python.
Installation
The library can be installed using pip:
pip install scikit-query
Using the library
The library has one module for each kind of constraint (pairwise or triplet), plus another containing the oracles. They can be imported as follows:
from skquery.pairwise import *
from skquery.triplet import *
from skquery.oracle import MLCLOracle
This makes the constraint selection algorithms available, along with the MLCLOracle, which answers queries about pairwise constraints.
Making queries
All algorithms have a fit method that takes two arguments: a matrix of n points with m features, and an oracle (typically from the skquery.oracle module).
The oracle must have a query method returning a boolean.
qs = AIPC()
oracle = MLCLOracle()
constraints = qs.fit(dataset.data, oracle)
The oracle’s truth attribute can hold a ground-truth labeling of the data, which is then used to answer queries automatically.
If no labeling is provided, the oracle asks the user to answer each query through the CLI.
oracle = MLCLOracle(truth=labels)
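To illustrate how a ground-truth labeling can answer pairwise queries automatically, here is a minimal, self-contained sketch. GroundTruthOracle is a hypothetical stand-in, not the library's MLCLOracle. It assumes that query receives the indices of two points and returns True when they share a label (must-link) and False otherwise (cannot-link). The actual signature in skquery may differ.

```python
class GroundTruthOracle:
    """Hypothetical oracle for illustration only; not part of skquery."""

    def __init__(self, truth):
        # One ground-truth label per data point.
        self.truth = list(truth)

    def query(self, i, j):
        # Assumption: a query is a pair of point indices.
        # True means "same cluster" (must-link), False means cannot-link.
        return self.truth[i] == self.truth[j]


labels = [0, 0, 1, 1, 2]
oracle = GroundTruthOracle(truth=labels)

same = oracle.query(0, 1)       # points 0 and 1 share label 0
different = oracle.query(1, 2)  # labels 0 and 1 differ
```

Any object exposing a query method that returns a boolean follows the same pattern, so a custom oracle (for example, one backed by a human annotator) could be substituted under the same assumption.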
The constraints are returned as a dictionary that maps each constraint type to a list of the selected constraints. The table below describes how the constraint dictionary is structured.
Type | Key | Constraint format |
---|---|---|
Must-link | ml | (int, int) |
Cannot-link | cl | (int, int) |
Triplet | triplet | (int, int, int) |
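The structure in the table can be consumed with ordinary dictionary iteration. The sketch below uses a hand-written dictionary in place of a real fit result and assumes that the keys are the strings "ml", "cl" and "triplet", and that a key may be absent when no constraint of that type was selected. The summarize helper is hypothetical and not part of the library.

```python
# Hand-written example data standing in for the return value of fit.
constraints = {
    "ml": [(0, 1), (2, 3)],
    "cl": [(0, 2)],
    "triplet": [(0, 1, 2)],
}

# Number of point indices expected in each constraint, per the table.
EXPECTED_ARITY = {"ml": 2, "cl": 2, "triplet": 3}


def summarize(constraints):
    """Count the constraints of each type, checking their tuple length."""
    counts = {}
    for key, arity in EXPECTED_ARITY.items():
        items = constraints.get(key, [])  # tolerate missing keys
        for item in items:
            if len(item) != arity:
                raise ValueError(f"{key} constraint {item!r} should have {arity} indices")
        counts[key] = len(items)
    return counts


counts = summarize(constraints)
```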