DiSMEC++
dismec::DatasetBase Class Reference abstract

#include <data.h>

Inherited by dismec::BinaryData and dismec::MultiLabelData.

Public Member Functions

virtual ~DatasetBase ()=default
 
 DatasetBase (const DatasetBase &)=default
 
 DatasetBase (DatasetBase &&)=default
 
DatasetBase & operator= (DatasetBase &&)=default
 
DatasetBase & operator= (const DatasetBase &)=default
 
std::shared_ptr< const GenericFeatureMatrix > get_features () const
 Get a shared pointer to the (immutable) feature data.
 
std::shared_ptr< GenericFeatureMatrix > edit_features ()
 Get a shared pointer to mutable feature data. Use with care.
 
long num_features () const noexcept
 Get the total number of features, i.e. the number of columns in the feature matrix.
 
long num_examples () const noexcept
 Get the total number of instances, i.e. the number of rows in the feature matrix.
 
virtual long num_labels () const noexcept=0
 
virtual long num_positives (label_id_t id) const
 
virtual long num_negatives (label_id_t id) const
 
std::shared_ptr< const BinaryLabelVector > get_labels (label_id_t id) const
 
virtual void get_labels (label_id_t id, Eigen::Ref< BinaryLabelVector > target) const =0
 

Protected Member Functions

 DatasetBase (SparseFeatures x)
 
 DatasetBase (DenseFeatures x)
 

Protected Attributes

std::shared_ptr< GenericFeatureMatrix > m_Features
 

Detailed Description

Definition at line 15 of file data.h.

Constructor & Destructor Documentation

◆ ~DatasetBase()

virtual dismec::DatasetBase::~DatasetBase ( )
virtual default

◆ DatasetBase() [1/4]

dismec::DatasetBase::DatasetBase ( const DatasetBase &  )
default

◆ DatasetBase() [2/4]

dismec::DatasetBase::DatasetBase ( DatasetBase &&  )
default

◆ DatasetBase() [3/4]

DatasetBase::DatasetBase ( SparseFeatures  x)
explicit protected

Definition at line 56 of file data.cpp.

◆ DatasetBase() [4/4]

DatasetBase::DatasetBase ( DenseFeatures  x)
explicit protected

Definition at line 57 of file data.cpp.

Member Function Documentation

◆ edit_features()

std::shared_ptr< GenericFeatureMatrix > DatasetBase::edit_features ( )

Get a shared pointer to mutable feature data. Use with care.

◆ get_features()

std::shared_ptr< const GenericFeatureMatrix > DatasetBase::get_features ( ) const

Get a shared pointer to the (immutable) feature data.

Definition at line 39 of file data.cpp.

References m_Features.

Referenced by anonymous_namespace{py_data.cpp}::get_features(), join_data(), and dismec::io::save_xmc_dataset().
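
The relationship between the read-only and the mutable accessor can be illustrated with a small stand-in. This is a sketch under stated assumptions, not DiSMEC's implementation: FeatureStore and Matrix are hypothetical names, and std::vector<double> takes the place of GenericFeatureMatrix. It relies only on the standard behavior that a std::shared_ptr<T> converts implicitly to std::shared_ptr<const T>, so both handles refer to the same storage.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for GenericFeatureMatrix (assumption, for illustration only).
using Matrix = std::vector<double>;

// Hypothetical class that mirrors only the accessor pair described above.
class FeatureStore {
public:
    explicit FeatureStore(Matrix x)
        : m_Features(std::make_shared<Matrix>(std::move(x))) {}

    // Read-only handle: shared_ptr<Matrix> converts implicitly to shared_ptr<const Matrix>.
    std::shared_ptr<const Matrix> get_features() const { return m_Features; }

    // Mutable handle to the same underlying storage.
    std::shared_ptr<Matrix> edit_features() { return m_Features; }

private:
    std::shared_ptr<Matrix> m_Features;
};
```

Because both handles share one object, a write through the mutable handle is visible through any read-only handle obtained earlier, which is one reason the mutable accessor carries the "use with care" note.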

◆ get_labels() [1/2]

std::shared_ptr< const BinaryLabelVector > DatasetBase::get_labels ( label_id_t  id) const

Gets the label vector (encoded as a dense vector with elements from {-1, 1}) for the id'th class. Throws std::out_of_bounds if id is not in [0, num_labels()).

Definition at line 21 of file data.cpp.

References num_examples().

Referenced by anonymous_namespace{py_data.cpp}::get_labels(), num_positives(), dismec::CascadeTraining::update_minimizer(), dismec::CascadeTraining::update_objective(), and dismec::DiSMECTraining::update_objective().

◆ get_labels() [2/2]

virtual void dismec::DatasetBase::get_labels ( label_id_t  id,
Eigen::Ref< BinaryLabelVector > target 
) const
pure virtual

Gets the label vector (encoded as a dense vector with elements from {-1, 1}) for the id'th class. The labels are written into the given target buffer. Throws std::out_of_bounds if id is not in [0, num_labels()).

Implemented in dismec::MultiLabelData, and dismec::BinaryData.
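
The two get_labels() overloads follow a common pattern: a pure virtual function fills a caller-provided buffer, and a non-virtual convenience function allocates a buffer and delegates to it. The sketch below shows that pattern with standard-library types only. All names (LabelSource, SingleLabel, LabelVector) are hypothetical, a std::vector<int> replaces Eigen::Ref<BinaryLabelVector>, and the range check throws std::out_of_range, which is an assumption about the exception type rather than something the documentation above confirms.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for BinaryLabelVector; entries are -1 or +1.
using LabelVector = std::vector<int>;

class LabelSource {
public:
    virtual ~LabelSource() = default;
    virtual long num_examples() const = 0;
    virtual long num_labels() const = 0;

    // Buffer-filling overload: derived classes write -1/+1 into `target`.
    virtual void get_labels(long id, LabelVector& target) const = 0;

    // Convenience overload: allocates num_examples() entries and delegates.
    std::shared_ptr<const LabelVector> get_labels(long id) const {
        auto result = std::make_shared<LabelVector>(
            static_cast<std::size_t>(num_examples()));
        get_labels(id, *result);
        return result;
    }
};

// Toy implementation with one label, stored as the row indices of its positives.
class SingleLabel : public LabelSource {
public:
    SingleLabel(long examples, std::vector<long> positives)
        : m_Examples(examples), m_Positives(std::move(positives)) {}

    long num_examples() const override { return m_Examples; }
    long num_labels() const override { return 1; }

    using LabelSource::get_labels;  // keep the convenience overload visible

    void get_labels(long id, LabelVector& target) const override {
        if (id < 0 || id >= num_labels()) {
            throw std::out_of_range("label id out of range");
        }
        // Start with every example negative, then mark the positives.
        target.assign(static_cast<std::size_t>(m_Examples), -1);
        for (long row : m_Positives) {
            target.at(static_cast<std::size_t>(row)) = 1;
        }
    }

private:
    long m_Examples;
    std::vector<long> m_Positives;
};
```

With this split, a derived class implements only the buffer-filling overload, and callers that already own a buffer can reuse it across labels instead of allocating a new vector per call.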

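◆ num_examples()

long DatasetBase::num_examples ( ) const
noexcept

Get the total number of instances, i.e. the number of rows in the feature matrix.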

◆ num_features()

long DatasetBase::num_features ( ) const
noexcept

Get the total number of features, i.e. the number of columns in the feature matrix.

Definition at line 48 of file data.cpp.

References m_Features.

Referenced by dismec::TrainingSpec::num_features(), dismec::prediction::PredictionBase::PredictionBase(), and dismec::io::save_xmc_dataset().

◆ num_labels()

virtual long dismec::DatasetBase::num_labels ( ) const
pure virtual noexcept

◆ num_negatives()

long DatasetBase::num_negatives ( label_id_t  id) const
virtual

Gets the number of instances where label id is absent (=-1). Throws std::out_of_bounds if id is not in [0, num_labels()).

Reimplemented in dismec::MultiLabelData.

Definition at line 17 of file data.cpp.

References num_examples(), and num_positives().

Referenced by anonymous_namespace{py_data.cpp}::num_negatives().

◆ num_positives()

long DatasetBase::num_positives ( label_id_t  id) const
virtual

Gets the number of instances where label id is present (=+1). Throws std::out_of_bounds if id is not in [0, num_labels()).

Reimplemented in dismec::MultiLabelData.

Definition at line 13 of file data.cpp.

References get_labels().

Referenced by dismec::PropensityModel::get_propensity(), num_negatives(), anonymous_namespace{py_data.cpp}::num_positives(), dismec::CascadeTraining::update_minimizer(), and dismec::DiSMECTraining::update_minimizer().
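
The reference lists above suggest that the base-class num_positives() is computed from get_labels() and that num_negatives() is derived from num_examples() and num_positives(). The following sketch shows one way such counts could be computed on a {-1, +1} label vector. The helper names count_positives and count_negatives are hypothetical, and the actual implementation in data.cpp may differ.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: count the entries equal to +1 in a {-1, +1} label vector.
long count_positives(const std::vector<int>& labels) {
    return static_cast<long>(std::count(labels.begin(), labels.end(), 1));
}

// Every example is either positive or negative, so the negatives are the remainder.
long count_negatives(const std::vector<int>& labels) {
    return static_cast<long>(labels.size()) - count_positives(labels);
}
```

A derived class that stores labels sparsely, for example as a list of positive row indices per label, can answer both questions without materializing the dense vector, which may be why dismec::MultiLabelData reimplements these functions.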

◆ operator=() [1/2]

DatasetBase& dismec::DatasetBase::operator= ( const DatasetBase &  )
default

◆ operator=() [2/2]

DatasetBase& dismec::DatasetBase::operator= ( DatasetBase &&  )
default

Member Data Documentation

◆ m_Features

std::shared_ptr<GenericFeatureMatrix> dismec::DatasetBase::m_Features
protected

Definition at line 60 of file data.h.

Referenced by edit_features(), get_features(), num_examples(), and num_features().


The documentation for this class was generated from the following files: