crowdkit.aggregation.embeddings.closest_to_average.ClosestToAverage
ClosestToAverage(self, distance: Callable[[..., ...], float])
Closest to Average - chooses the output with the embedding closest to the average embedding.
This method takes a DataFrame containing four columns: `task`, `worker`, `output`, and `embedding`. Here the `embedding` is a vector containing a representation of the `output`. The `output` might be any type of data, such as text, images, NumPy arrays, etc. As a result, the method returns the output whose embedding is the closest one to the average embedding of the task's responses.
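The selection rule described above can be illustrated with a minimal, library-independent sketch. This is not Crowd-Kit's implementation: the helper names `closest_to_average` and `euclidean` are hypothetical, the input is a plain list of records instead of a DataFrame, and Euclidean distance is only one possible choice for the `distance` callable.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    # Hypothetical distance callable: takes two vectors, returns a float.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_to_average(rows, distance):
    # Group the responses by task.
    by_task = defaultdict(list)
    for row in rows:
        by_task[row["task"]].append(row)

    result = {}
    for task, responses in by_task.items():
        dim = len(responses[0]["embedding"])
        # Component-wise average of the embeddings submitted for this task.
        avg = [
            sum(r["embedding"][i] for r in responses) / len(responses)
            for i in range(dim)
        ]
        # Keep the output whose embedding lies closest to that average.
        best = min(responses, key=lambda r: distance(r["embedding"], avg))
        result[task] = best["output"]
    return result


# Toy data with the four columns named in the description.
rows = [
    {"task": "t1", "worker": "w1", "output": "cat", "embedding": [0.0, 0.0]},
    {"task": "t1", "worker": "w2", "output": "kitten", "embedding": [1.0, 1.0]},
    {"task": "t1", "worker": "w3", "output": "dog", "embedding": [5.0, 5.0]},
    {"task": "t2", "worker": "w1", "output": "red", "embedding": [1.0, 0.0]},
]
aggregated = closest_to_average(rows, euclidean)
```

For task `t1` the average embedding is `[2.0, 2.0]`, so the response from `w2` is selected because its embedding is nearest to that point. A task with a single response simply returns that response.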
Parameters | Type | Description |
---|---|---|
distance | Callable[[..., ...], float] | A callable that takes two NumPy arrays and returns a single |
embeddings_and_outputs_ | DataFrame | Tasks' embeddings and outputs. A pandas.DataFrame indexed by |
scores_ | DataFrame | Tasks' label scores. A pandas.DataFrame indexed by |
Method | Description |
---|---|
fit | Fit the model. |
fit_predict | Fit the model and return the aggregated results. |
fit_predict_scores | Fit the model and return the estimated scores. |