Alien-XGBoost

 view release on metacpan or  search on metacpan

xgboost/R-package/src/xgboost_R.h  view on Meta::CPAN

 * \param missing which value to represent missing value
 * \return created dmatrix
 */
XGB_DLL SEXP XGDMatrixCreateFromMat_R(SEXP mat,
                                      SEXP missing);
/*!
 * \brief create a matrix content from CSC format
 * \param indptr pointer to column headers
 * \param indices row indices
 * \param data content of the data
 * \param num_row numer of rows (when it's set to 0, then guess from data)
 * \return created dmatrix
 */
XGB_DLL SEXP XGDMatrixCreateFromCSC_R(SEXP indptr,
                                      SEXP indices,
                                      SEXP data,
                                      SEXP num_row);

/*!
 * \brief create a new dmatrix from sliced content of existing matrix
 * \param handle instance of data matrix to be sliced

xgboost/demo/kaggle-otto/understandingXGBoostModel.Rmd  view on Meta::CPAN

* remove `1` to the new value

```{r classToIntegers}
# Convert from classes to numbers
y <- train[, nameLastCol, with = F][[1]] %>% gsub('Class_','',.) %>% {as.integer(.) -1}

# Display the first 5 levels
y[1:5]
```

We remove label column from training dataset, otherwise **XGBoost** would use it to guess the labels!

```{r deleteCols, results='hide'}
train[, nameLastCol:=NULL, with = F]
```

`data.table` is an awesome implementation of data.frame, unfortunately it is not a format supported natively by **XGBoost**. We need to convert both datasets (training and test) in `numeric` Matrix format.

```{r convertToNumericMatrix}
trainMatrix <- train[,lapply(.SD,as.numeric)] %>% as.matrix
testMatrix <- test[,lapply(.SD,as.numeric)] %>% as.matrix

xgboost/demo/regression/machine.names  view on Meta::CPAN

    - Predicted attribute: cpu relative performance (numeric)

4. Relevant Information:
   -- The estimated relative performance values were estimated by the authors
      using a linear regression method.  See their article (pp 308-313) for
      more details on how the relative performance values were set.

5. Number of Instances: 209 

6. Number of Attributes: 10 (6 predictive attributes, 2 non-predictive, 
                             1 goal field, and the linear regression's guess)

7. Attribute Information:
   1. vendor name: 30 
      (adviser, amdahl,apollo, basf, bti, burroughs, c.r.d, cambex, cdc, dec, 
       dg, formation, four-phase, gould, honeywell, hp, ibm, ipl, magnuson, 
       microdata, nas, ncr, nixdorf, perkin-elmer, prime, siemens, sperry, 
       sratus, wang)
   2. Model Name: many unique symbols
   3. MYCT: machine cycle time in nanoseconds (integer)
   4. MMIN: minimum main memory in kilobytes (integer)

xgboost/dmlc-core/tracker/dmlc_tracker/opts.py  view on Meta::CPAN

    parser.add_argument('--queue', default='default', type=str,
                        help='The submission queue the job should goes to.')
    parser.add_argument('--log-level', default='INFO', type=str,
                        choices=['INFO', 'DEBUG'],
                        help='Logging level of the logger.')
    parser.add_argument('--log-file', default=None, type=str,
                        help=('Output log to the specific log file, ' +
                              'the log is still printed on stderr.'))
    parser.add_argument('--host-ip', default=None, type=str,
                        help=('Host IP addressed, this is only needed ' +
                              'if the host IP cannot be automatically guessed.'))
    parser.add_argument('--hdfs-tempdir', default='/tmp', type=str,
                        help=('Temporary directory in HDFS, ' +
                              ' only needed in YARN mode.'))
    parser.add_argument('--host-file', default=None, type=str,
                        help=('The file contains the list of hostnames, needed for MPI and ssh.'))
    parser.add_argument('--sge-log-dir', default=None, type=str,
                        help=('Log directory of SGD jobs, only needed in SGE mode.'))
    parser.add_argument(
        '--auto-file-cache', default=True, type=bool,
        help=('Automatically cache files appeared in the command line' +

xgboost/dmlc-core/tracker/dmlc_tracker/tracker.py  view on Meta::CPAN


def main():
    """Main function if tracker is executed in standalone mode."""
    parser = argparse.ArgumentParser(description='Rabit Tracker start.')
    parser.add_argument('--num-workers', required=True, type=int,
                        help='Number of worker proccess to be launched.')
    parser.add_argument('--num-servers', default=0, type=int,
                        help='Number of server process to be launched. Only used in PS jobs.')
    parser.add_argument('--host-ip', default=None, type=str,
                        help=('Host IP addressed, this is only needed ' +
                              'if the host IP cannot be automatically guessed.'))
    parser.add_argument('--log-level', default='INFO', type=str,
                        choices=['INFO', 'DEBUG'],
                        help='Logging level of the logger.')
    args = parser.parse_args()

    fmt = '%(asctime)s %(levelname)s %(message)s'
    if args.log_level == 'INFO':
        level = logging.INFO
    elif args.log_level == 'DEBUG':
        level = logging.DEBUG

xgboost/doc/Doxyfile  view on Meta::CPAN


OPTIMIZE_OUTPUT_VHDL   = NO

# Doxygen selects the parser to use depending on the extension of the files it
# parses. With this tag you can assign which parser to use for a given
# extension. Doxygen has a built-in mapping, but you can override or extend it
# using this tag. The format is ext=language, where ext is a file extension, and
# language is one of the parsers supported by doxygen: IDL, Java, Javascript,
# C#, C, C++, D, PHP, Objective-C, Python, Fortran (fixed format Fortran:
# FortranFixed, free formatted Fortran: FortranFree, unknown formatted Fortran:
# Fortran. In the later case the parser tries to guess whether the code is fixed
# or free formatted code, this is the default for Fortran type files), VHDL. For
# instance to make doxygen treat .inc files as Fortran files (default is PHP),
# and .f files as C (default is Fortran), use: inc=Fortran f=C.
#
# Note For files without extension you can use no_extension as a placeholder.
#
# Note that for custom extensions you also need to set FILE_PATTERNS otherwise
# the files are not read by doxygen.

EXTENSION_MAPPING      =

xgboost/include/xgboost/c_api.h  view on Meta::CPAN

    const char* cache_info,
    DMatrixHandle *out);

/*!
 * \brief create a matrix content from CSR format
 * \param indptr pointer to row headers
 * \param indices findex
 * \param data fvalue
 * \param nindptr number of rows in the matrix + 1
 * \param nelem number of nonzero elements in the matrix
 * \param num_col number of columns; when it's set to 0, then guess from data
 * \param out created dmatrix
 * \return 0 when success, -1 when failure happens
 */
XGB_DLL int XGDMatrixCreateFromCSREx(const size_t* indptr,
                                     const unsigned* indices,
                                     const float* data,
                                     size_t nindptr,
                                     size_t nelem,
                                     size_t num_col,
                                     DMatrixHandle* out);

xgboost/include/xgboost/c_api.h  view on Meta::CPAN

                                   bst_ulong nindptr,
                                   bst_ulong nelem,
                                   DMatrixHandle *out);
/*!
 * \brief create a matrix content from CSC format
 * \param col_ptr pointer to col headers
 * \param indices findex
 * \param data fvalue
 * \param nindptr number of rows in the matrix + 1
 * \param nelem number of nonzero elements in the matrix
 * \param num_row number of rows; when it's set to 0, then guess from data
 * \param out created dmatrix
 * \return 0 when success, -1 when failure happens
 */
XGB_DLL int XGDMatrixCreateFromCSCEx(const size_t* col_ptr,
                                     const unsigned* indices,
                                     const float* data,
                                     size_t nindptr,
                                     size_t nelem,
                                     size_t num_row,
                                     DMatrixHandle* out);



( run in 1.376 second using v1.01-cache-2.11-cpan-702932259ff )