public abstract class FrameTask<T extends FrameTask<T>>
extends water.MRTask<T>
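The self-bounded type parameter `T extends FrameTask<T>` lets methods inherited from the parent task return the concrete subclass type. The following is a minimal, self-contained sketch of that pattern and does not use the H2O classes: `TaskSketch` and `CountPositives` are hypothetical names, and the signatures of `self` and `doAll` here are simplified assumptions that only borrow the method names from the inherited-method list.

```java
/** Hypothetical stand-in for a self-typed task; not the H2O MRTask/FrameTask API. */
abstract class TaskSketch<T extends TaskSketch<T>> {
    /** Returns this object typed as the concrete subclass. */
    @SuppressWarnings("unchecked")
    final T self() { return (T) this; }

    /** Visits every value, then returns the concrete task so callers can read its fields. */
    final T doAll(int[] data) {
        for (int v : data) visit(v);
        return self();
    }

    protected abstract void visit(int v);
}

/** A concrete task passes itself as the type argument. */
final class CountPositives extends TaskSketch<CountPositives> {
    int count;

    @Override protected void visit(int v) {
        if (v > 0) count++;
    }
}
```

Because `doAll` is declared to return `T`, an expression such as `new CountPositives().doAll(new int[]{-1, 2, 3}).count` compiles without a cast and evaluates to 2.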
Modifier and Type | Class and Description |
---|---|
static class | FrameTask.ExtractDenseRow |
Modifier and Type | Field and Description |
---|---|
protected DataInfo | _dinfo |
protected water.Key<water.Job> | _jobKey |
protected boolean | _shuffle |
protected boolean | _sparse |
protected float | _useFraction |
Constructor and Description |
---|
FrameTask(water.Key<water.Job> jobKey, DataInfo dinfo) |
FrameTask(water.Key<water.Job> jobKey, DataInfo dinfo, long seed, int iteration, boolean sparse) |
FrameTask(water.Key<water.Job> jobKey, DataInfo dinfo, long seed, int iteration, boolean sparse, water.H2O.H2OCountedCompleter cmp) |
Modifier and Type | Method and Description |
---|---|
protected void | chunkDone(long n) Override this to do post-chunk processing work. |
protected boolean | chunkInit() Override this to initialize at the beginning of chunk processing. |
protected void | closeLocal() |
DataInfo | dinfo() |
protected int | getMiniBatchSize() Note: if this is overridden, then processMiniBatch must be overridden as well to perform the model/weight mini-batch update. |
void | map(water.fvec.Chunk[] chunks, water.fvec.NewChunk[] outputs) Extracts the values, applies regularization to numerics, adds appropriate offsets to categoricals, and adapts response according to the CaseMode/CaseValue if set. |
protected void | processMiniBatch(long seed, double[] responses, double[] offsets, int n) Mini-batch update of model parameters. |
protected void | processRow(long gid, DataInfo.Row r) Method to process one row of the data. |
protected void | processRow(long gid, DataInfo.Row r, int mb) |
protected void | processRow(long gid, DataInfo.Row r, water.fvec.NewChunk[] outputs) |
protected void | setupLocal() |
protected boolean | skipRow(long gid) |
appendables, asyncExecOnAllNodes, block, compute2, dfork, dfork, dfork, dfork, dfork, dinvoke, doAll, doAll, doAll, doAll, doAll, doAll, doAll, doAll, doAll, doAll, doAll, doAll, doAllNodes, getResult, getResult, isReleasable, map, map, map, map, map, map, map, map, map, map, map, modifiesVolatileVecs, onCompletion, onExceptionalCompletion, outputFrame, outputFrame, outputFrame, postGlobal, profile, profString, reduce, self, withPostMapAction
copyOver, getDException, hasException, logVerbose, onAck, onAckAck, setException
asBytes, clone, compute, compute1, currThrPriority, frozenType, icer, priority, read, readJSON, reloadFromBytes, write, writeJSON
__tryComplete, addToPendingCount, compareAndSetPendingCount, complete, exec, getCompleter, getPendingCount, getRawResult, setCompleter, setPendingCount, setRawResult, tryComplete
adapt, adapt, adapt, cancel, compareAndSetForkJoinTaskTag, completeExceptionally, fork, get, get, get, getException, getForkJoinTaskTag, getPool, getQueuedTaskCount, getSurplusQueuedTaskCount, helpQuiesce, inForkJoinPool, invoke, invokeAll, invokeAll, invokeAll, isCancelled, isCompletedAbnormally, isCompletedNormally, isDone, join, peekNextLocalTask, pollNextLocalTask, pollTask, quietlyComplete, quietlyInvoke, quietlyJoin, reinitialize, setForkJoinTaskTag, tryUnfork
protected boolean _sparse
protected transient DataInfo _dinfo
protected final water.Key<water.Job> _jobKey
protected float _useFraction
protected boolean _shuffle
public FrameTask(water.Key<water.Job> jobKey, DataInfo dinfo)
public FrameTask(water.Key<water.Job> jobKey, DataInfo dinfo, long seed, int iteration, boolean sparse)
public FrameTask(water.Key<water.Job> jobKey, DataInfo dinfo, long seed, int iteration, boolean sparse, water.H2O.H2OCountedCompleter cmp)
public DataInfo dinfo()
protected void setupLocal()
protected void closeLocal()
protected void processRow(long gid, DataInfo.Row r)
gid - global id of this row, in [0, _adaptedFrame.numRows())
protected void processRow(long gid, DataInfo.Row r, water.fvec.NewChunk[] outputs)
protected void processRow(long gid, DataInfo.Row r, int mb)
protected boolean skipRow(long gid)
protected void processMiniBatch(long seed, double[] responses, double[] offsets, int n)
seed
responses
offsets
n - actual number of rows in this minibatch
protected int getMiniBatchSize()
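As a rough illustration of how `getMiniBatchSize`, the three-argument `processRow`, and `processMiniBatch` could fit together, here is a self-contained sketch that does not use the H2O classes. `MiniBatchSketch`, `BatchLogSketch`, and the `run` driver are hypothetical; the flushing of a final partial batch, the reuse of one seed for every batch, and the plain `double[]` row type are assumptions made for the example. Only the parameter meaning "n is the actual number of rows in this minibatch" is taken from the documentation above.

```java
/** Hypothetical stand-in that borrows FrameTask's mini-batch hook names. */
abstract class MiniBatchSketch {
    protected abstract int getMiniBatchSize();
    protected abstract void processRow(long gid, double[] row, int mb);
    protected abstract void processMiniBatch(long seed, double[] responses, double[] offsets, int n);

    /** Assumed driver: feed rows one by one and flush whenever a batch fills up. */
    final void run(long seed, double[][] rows, double[] responses) {
        int size = Math.max(1, getMiniBatchSize());
        double[] batchResponses = new double[size];
        double[] batchOffsets = new double[size]; // offsets stay 0.0 in this sketch
        int mb = 0; // position of the current row inside the mini-batch
        for (int i = 0; i < rows.length; i++) {
            processRow(i, rows[i], mb);
            batchResponses[mb] = responses[i];
            mb++;
            if (mb == size) {
                processMiniBatch(seed, batchResponses, batchOffsets, mb);
                mb = 0;
            }
        }
        // Flush the trailing partial batch; n is smaller than the batch size here.
        if (mb > 0) processMiniBatch(seed, batchResponses, batchOffsets, mb);
    }
}

/** Overrides both hooks together and records the size of every batch it receives. */
final class BatchLogSketch extends MiniBatchSketch {
    final StringBuilder log = new StringBuilder();
    double responseTotal;

    @Override protected int getMiniBatchSize() { return 2; }

    @Override protected void processRow(long gid, double[] row, int mb) { }

    @Override protected void processMiniBatch(long seed, double[] responses, double[] offsets, int n) {
        // Only the first n entries belong to this batch.
        for (int i = 0; i < n; i++) responseTotal += responses[i];
        if (log.length() > 0) log.append(',');
        log.append(n);
    }
}
```

With five rows and a batch size of 2, the sketch calls `processMiniBatch` three times with `n` equal to 2, 2, and 1, which is why an implementation should read only the first `n` entries of `responses` and `offsets`.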
protected boolean chunkInit()
protected void chunkDone(long n)
n - Number of processed rows
public void map(water.fvec.Chunk[] chunks, water.fvec.NewChunk[] outputs)
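The per-chunk hooks can be pictured with a small, self-contained sketch that does not use the H2O classes. `ChunkHooksSketch`, `RowSumSketch`, and `mapChunk` are hypothetical; the call order shown, the treatment of a `false` return from `chunkInit` as "skip this chunk", and the `double[]` row type are assumptions for illustration. The hook names and the meaning of `n` (number of processed rows) follow the listing above.

```java
/** Hypothetical stand-in for FrameTask; only the hook names come from the listing. */
abstract class ChunkHooksSketch {
    protected boolean chunkInit() { return true; }
    protected boolean skipRow(long gid) { return false; }
    protected abstract void processRow(long gid, double[] row);
    protected void chunkDone(long n) { }

    /** Simplified analogue of map(): init, visit the rows that are not skipped, finish. */
    final void mapChunk(long firstGid, double[][] rows) {
        if (!chunkInit()) return; // assumption: false means there is no work for this chunk
        long processed = 0;
        for (int i = 0; i < rows.length; i++) {
            long gid = firstGid + i; // global row id
            if (skipRow(gid)) continue;
            processRow(gid, rows[i]);
            processed++;
        }
        chunkDone(processed);
    }
}

/** Sums the values of every row with an even global id. */
final class RowSumSketch extends ChunkHooksSketch {
    double sum;
    long rowsProcessed;

    @Override protected boolean skipRow(long gid) { return gid % 2 == 1; }

    @Override protected void processRow(long gid, double[] row) {
        for (double v : row) sum += v;
    }

    @Override protected void chunkDone(long n) { rowsProcessed += n; }
}
```

Calling `mapChunk(0, new double[][]{{1, 2}, {3, 4}, {5, 6}})` on a `RowSumSketch` skips the row with global id 1, leaving `sum` at 14.0 and `rowsProcessed` at 2.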