Alien-XGBoost


xgboost/cub/cub/thread/thread_load.cuh

//-----------------------------------------------------------------------------

/**
 * \brief Enumeration of cache modifiers for memory load operations.
 */
enum CacheLoadModifier
{
    LOAD_DEFAULT,       ///< Default (no modifier)
    LOAD_CA,            ///< Cache at all levels
    LOAD_CG,            ///< Cache at global level
    LOAD_CS,            ///< Cache streaming (likely to be accessed once)
    LOAD_CV,            ///< Cache as volatile (including cached system lines)
    LOAD_LDG,           ///< Cache as texture
    LOAD_VOLATILE,      ///< Volatile (any memory space)
};


/**
 * \name Thread I/O (cache modified)
 * @{
 */
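
To show how these modifiers are consumed, here is a rough, self-contained sketch (not part of the header) that uses cub's ThreadLoad function template from this same file; the kernel name, launch configuration, and buffer size are made up for the example. LOAD_CS is picked because each element is read exactly once, matching the "cache streaming" description above.

#include <cuda_runtime.h>
#include <cub/thread/thread_load.cuh>

// Toy kernel: each thread copies one element, requesting a
// cache-streaming load (ld.global.cs where the hardware supports it).
__global__ void copy_streaming(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        out[i] = cub::ThreadLoad<cub::LOAD_CS>(in + i);
    }
}

int main()
{
    const int n = 1 << 20;
    float *d_in = NULL, *d_out = NULL;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemset(d_in, 0, n * sizeof(float));

    copy_streaming<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    cudaDeviceSynchronize();

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}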

xgboost/dmlc-core/include/dmlc/data.h

        CHECK(index[i] < size) << "feature index exceeds bound";
        sum += weight[index[i]] * value[i];
      }
    }
    return sum;
  }
};

/*!
 * \brief a block of data, containing several rows in sparse matrix
 *  This is useful for (streaming-style) algorithms that scan through rows of data
 *  examples include: SGD, GD, L-BFGS, kmeans
 *
 *  The size of a batch is usually large enough that parallelizing over the rows
 *  can give a significant speedup
 * \tparam IndexType type to store the index used in row batch
 */
template<typename IndexType>
struct RowBlock {
  /*! \brief batch size */
  size_t size;
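
To make the doc comment concrete, below is a minimal sketch of the streaming pattern it describes: one SGD-style pass over a block of sparse rows. The struct is a simplified, hypothetical stand-in whose field names only mirror the CSR layout implied above (it is not the actual dmlc::RowBlock definition), and the squared-loss update is purely illustrative.

#include <cstddef>
#include <cstdio>
#include <vector>

// Simplified stand-in for a CSR-style row block (not the real dmlc::RowBlock):
// offset[i]..offset[i+1] delimits the nonzeros of row i.
struct SimpleRowBlock {
  size_t size;                  // number of rows in the block
  const size_t   *offset;       // row boundaries, length size + 1
  const float    *label;        // one label per row
  const unsigned *index;        // column indices of nonzeros
  const float    *value;        // values of nonzeros
};

// One SGD-style pass over the block: for each row, form the sparse dot
// product w.x (the same loop as the SDot snippet above) and nudge the weights.
void sgd_pass(const SimpleRowBlock &batch, std::vector<float> &w, float lr) {
  for (size_t i = 0; i < batch.size; ++i) {
    float pred = 0.0f;
    for (size_t j = batch.offset[i]; j < batch.offset[i + 1]; ++j) {
      pred += w[batch.index[j]] * batch.value[j];
    }
    float grad = pred - batch.label[i];   // squared-loss gradient
    for (size_t j = batch.offset[i]; j < batch.offset[i + 1]; ++j) {
      w[batch.index[j]] -= lr * grad * batch.value[j];
    }
  }
}

int main() {
  // Two rows: {x0=1, x2=2} with label 1, and {x1=3} with label 0.
  const size_t   offset[] = {0, 2, 3};
  const float    label[]  = {1.0f, 0.0f};
  const unsigned index[]  = {0, 2, 1};
  const float    value[]  = {1.0f, 2.0f, 3.0f};
  SimpleRowBlock batch = {2, offset, label, index, value};

  std::vector<float> w(3, 0.0f);
  sgd_pass(batch, w, 0.1f);
  std::printf("w = [%f, %f, %f]\n", w[0], w[1], w[2]);
  return 0;
}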

xgboost/doc/jvm/xgboost4j_full_integration.md

The integrations with Spark/Flink, a.k.a. <b>XGBoost4J-Spark</b> and <b>XGBoost-Flink</b>, have received tremendously positive feedback from the community. They enable users to build a unified pipeline, embedding XGBoost into the data processing system ...

![XGBoost4J Architecture](https://raw.githubusercontent.com/dmlc/web-data/master/xgboost/unified_pipeline.png)

In the last few months, we have had a lot of communication with users and gained a deeper understanding of their latest usage scenarios and requirements:

* XGBoost keeps gaining more deployments in production environments and more adoption in machine learning competitions [Link](http://datascience.la/xgboost-workshop-and-meetup-talk-with-tianqi-chen/).

* While Spark is still the mainstream data processing tool in most scenarios, more and more users are porting their RDD-based Spark programs to [DataFrame/Dataset APIs](http://spark.apache.org/docs/latest/sql-programming-guide.html) for the well-d...

* Spark itself has presented a clear roadmap in which DataFrame/Dataset is the foundation of the latest and future features, e.g. the latest version of [ML pipeline](http://spark.apache.org/docs/latest/ml-guide.html) and [Structured Streaming](http://spark.a...

Based on this feedback, we observed a gap between the original RDD-based XGBoost4J-Spark and users' latest usage scenarios as well as the future direction of the Spark ecosystem. To fill this gap, we started working on the <b><i>integrat...


## A Full Integration of XGBoost and DataFrame/Dataset

The following figure illustrates the new pipeline architecture with the latest XGBoost4J-Spark.

![XGBoost4J New Architecture](https://raw.githubusercontent.com/dmlc/web-data/master/xgboost/unified_pipeline_new.png)
