public class ANOVAGLMUtils
extends java.lang.Object
Constructor and Description |
---|
ANOVAGLMUtils() |
Modifier and Type | Method and Description |
---|---|
static void |
addIndividualPred(java.lang.String[] predNames,
java.util.List<java.lang.String[]> predCombo) |
static GLM[] |
buildGLMBuilders(GLMModel.GLMParameters[] glmParams) |
static GLMModel.GLMParameters[] |
buildGLMParameters(water.fvec.Frame[] trainingFrames,
ANOVAGLMModel.ANOVAGLMParameters parms) |
static water.fvec.Frame |
buildSpecificFrame(int[] predNums,
water.fvec.Frame allCols,
java.lang.String[][] transformedColNames,
ANOVAGLMModel.ANOVAGLMParameters parms)
This method attaches the weight/offset columns (if they exist), the response column, and the specified
transformed columns to a training frame.
|
static water.fvec.Frame[] |
buildTrainingFrames(water.Key<water.fvec.Frame> transformedCols,
int numberOfModels,
java.lang.String[][] transformedColNames,
ANOVAGLMModel.ANOVAGLMParameters parms)
This method takes the frame that contains the transformed columns of predictor A, predictor B, and the
interaction of predictors A and B, and generates new training frames that contain the following columns:
- transformed columns of predictor B, interaction of predictor A and B, response
- transformed columns of predictor A, interaction of predictor A and B, response
- transformed columns of predictor A, predictor B, response
- transformed columns of predictor A, predictor B, interaction of predictor A and B, response
The same logic applies if there are more than two individual predictors.
|
static int |
calculatePredComboNumber(int numPred,
int highestInteractionTerms)
Given the number of individual predictors and the highest order of interaction terms allowed, this method
calculates the total number of predictors that will be used to build the full model.
|
static java.lang.String[] |
combineAndFlat(java.lang.String[][] predictComboNames) |
static GLMModel[] |
extractGLMModels(GLM[] glmResults)
Simple method to extract GLM Models from GLM ModelBuilders.
|
static java.lang.String[] |
extractPredNames(DataInfo dinfo,
int numOfPredictors)
This method will extract the individual predictor names that will be used to build the GLM models.
|
static void |
fillModelMetrics(ANOVAGLMModel aModel,
GLMModel glmModel,
water.fvec.Frame trainingFrame)
This method, copied from Zuzana Olajcova, adds the model metrics of the full GLM model as the ANOVAModel
model metrics.
|
static int |
findComboMatch(java.lang.String[][] predComboNames,
int currIndex) |
static double[] |
generateGLMSS(GLMModel[] glmModels,
GLMModel.GLMParameters.Family family)
This method is used to generate Model SS for all models built except the full model.
|
static java.lang.String[] |
generateModelNames(java.lang.String[][] predictComboNames) |
static void |
generateOneCombo(java.lang.String[] predNames,
int numInteract,
java.util.List<java.lang.String[]> predCombo) |
static java.lang.String[][] |
generatePredictorCombos(java.lang.String[] predNamesIndividual,
int maxPredInt)
In order to calculate Type III SS, we need the individual predictors and their interactions.
|
static void |
generatePredictorNames(java.lang.String[][] predComboNames,
java.lang.String[][] predictorNames,
int[] predColumnStart,
int[] degreeOfFreedom,
DataInfo dinfo)
This method aims to generate the column names of the final transformed frames.
|
static int[] |
oneIndexOut(int currIndex,
int indexRange) |
static java.lang.String[] |
predCombo(java.lang.String[] predNames,
int[] predInd) |
static void |
removeFromDKV(water.fvec.Frame[] trainingFrames,
int numFrame2Delete) |
static java.lang.String[] |
transformMultipleCols(water.fvec.Frame vec2Transform,
java.lang.String[][] predComboNames,
int currIndex,
java.lang.String[][] predNames) |
static java.lang.String[] |
transformOneCol(water.fvec.Frame vec2Transform,
java.lang.String vecName)
Performs the data transformation described in section III.II of the AnovaGLMTutorial
(https://github.com/h2oai/h2o-3/issues/7561) on one predictor.
|
static java.lang.String[] |
transformTwoCols(water.fvec.Frame vec2Transform,
java.lang.String[] vecNames,
java.lang.String[] lastComboNames)
Generate frame transformation on two interacting columns.
|
static int |
updateDOFColInfo(int predInd,
java.lang.String[] predComboNames,
int[] dof,
int[] predCS,
int offset) |
static void |
updateLaterBits(int[] predInd,
int[] bounds,
int index,
int predNum) |
static boolean |
updatePredCombo(int[] predInd,
int[] bounds) |
public static java.lang.String[] extractPredNames(DataInfo dinfo, int numOfPredictors)
dinfo:
- DataInfo generated from dataset with all predictors
numOfPredictors:
- number of individual predictors
public static java.lang.String[][] generatePredictorCombos(java.lang.String[] predNamesIndividual, int maxPredInt)
predNamesIndividual:
- string containing individual predictor names
maxPredInt:
- maximum number of predictors allowed in interaction term generation
public static void addIndividualPred(java.lang.String[] predNames, java.util.List<java.lang.String[]> predCombo)
public static void generateOneCombo(java.lang.String[] predNames, int numInteract, java.util.List<java.lang.String[]> predCombo)
public static boolean updatePredCombo(int[] predInd, int[] bounds)
public static void updateLaterBits(int[] predInd, int[] bounds, int index, int predNum)
public static java.lang.String[] predCombo(java.lang.String[] predNames, int[] predInd)
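The combination helpers above are listed without descriptions, so the following is only a minimal sketch of the idea stated for generatePredictorCombos: enumerate the individual predictors plus every interaction of up to maxPredInt predictors. The class `PredictorComboSketch` and its methods `combos` and `comboCount` are hypothetical names introduced for illustration; they are not part of ANOVAGLMUtils, and the ordering shown is an assumption rather than the library's documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the ANOVAGLMUtils implementation.
public class PredictorComboSketch {

    // Enumerate every k-subset of names for k = 1..maxInteraction, keeping input order.
    static List<String[]> combos(String[] names, int maxInteraction) {
        List<String[]> out = new ArrayList<>();
        int limit = Math.min(maxInteraction, names.length);
        for (int k = 1; k <= limit; k++) {
            collect(names, k, 0, new ArrayList<String>(), out);
        }
        return out;
    }

    // Recursive helper: extend the current partial subset with names[start..].
    private static void collect(String[] names, int k, int start,
                                List<String> current, List<String[]> out) {
        if (current.size() == k) {
            out.add(current.toArray(new String[0]));
            return;
        }
        for (int i = start; i < names.length; i++) {
            current.add(names[i]);
            collect(names, k, i + 1, current, out);
            current.remove(current.size() - 1); // backtrack (removes by index)
        }
    }

    // Number of subsets: C(n, 1) + C(n, 2) + ... + C(n, maxInteraction).
    static int comboCount(int n, int maxInteraction) {
        int total = 0;
        long binom = 1; // C(n, 0)
        int limit = Math.min(maxInteraction, n);
        for (int k = 1; k <= limit; k++) {
            binom = binom * (n - k + 1) / k; // C(n, k) derived from C(n, k - 1)
            total += (int) binom;
        }
        return total;
    }

    public static void main(String[] args) {
        // Three predictors, interactions of at most two predictors.
        for (String[] combo : combos(new String[]{"A", "B", "C"}, 2)) {
            System.out.println(String.join(":", combo));
        }
        System.out.println(comboCount(3, 2));
    }
}
```

For two predictors with interactions of up to two predictors, this sketch yields three combos (A, B, A:B), which agrees with the two-factor case described for buildTrainingFrames.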
public static int calculatePredComboNumber(int numPred, int highestInteractionTerms)
numPred:
- number of individual predictors
highestInteractionTerms:
- highest number of predictors allowed in generating interactions
public static water.fvec.Frame[] buildTrainingFrames(water.Key<water.fvec.Frame> transformedCols, int numberOfModels, java.lang.String[][] transformedColNames, ANOVAGLMModel.ANOVAGLMParameters parms)
transformedCols:
- contains frame key of frame containing transformed columns of predictor A, predictor B,
interaction of predictor A and B
numberOfModels:
- number of models to build. For 2 factors, this should be 4.
public static int[] oneIndexOut(int currIndex, int indexRange)
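oneIndexOut has no description here, so the following is a sketch under an assumption: that it returns the indices 0..indexRange-1 with currIndex removed, which is what the leave-one-combo-out scheme of buildTrainingFrames would need. The class `LeaveOneOutSketch` and its methods are hypothetical names for illustration only. With two factors there are three predictor combos (A, B, A:B), giving three reduced models plus the full model, which matches the documented count of 4.

```java
import java.util.Arrays;

// Illustrative only: not the ANOVAGLMUtils implementation.
public class LeaveOneOutSketch {

    // Assumed behavior: all indices in [0, indexRange) except currIndex.
    static int[] leaveOneOut(int currIndex, int indexRange) {
        if (currIndex < 0 || currIndex >= indexRange) {
            throw new IllegalArgumentException("currIndex out of range");
        }
        int[] kept = new int[indexRange - 1];
        int pos = 0;
        for (int i = 0; i < indexRange; i++) {
            if (i != currIndex) {
                kept[pos++] = i;
            }
        }
        return kept;
    }

    // One index set per model: each combo left out once, then the full model last.
    static int[][] modelIndexSets(int numCombos) {
        int[][] sets = new int[numCombos + 1][];
        for (int m = 0; m < numCombos; m++) {
            sets[m] = leaveOneOut(m, numCombos);
        }
        int[] all = new int[numCombos];
        for (int i = 0; i < numCombos; i++) {
            all[i] = i;
        }
        sets[numCombos] = all;
        return sets;
    }

    public static void main(String[] args) {
        // Combos indexed as 0 = A, 1 = B, 2 = A:B.
        for (int[] set : modelIndexSets(3)) {
            System.out.println(Arrays.toString(set));
        }
    }
}
```

Each printed index set corresponds to one training frame in the order the summary lists them: B and A:B, then A and A:B, then A and B, then all three.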
public static void fillModelMetrics(ANOVAGLMModel aModel, GLMModel glmModel, water.fvec.Frame trainingFrame)
aModel:
-
glmModel:
-
trainingFrame:
-
public static GLMModel[] extractGLMModels(GLM[] glmResults)
glmResults:
- array of GLM ModelBuilders
public static void removeFromDKV(water.fvec.Frame[] trainingFrames, int numFrame2Delete)
public static water.fvec.Frame buildSpecificFrame(int[] predNums, water.fvec.Frame allCols, java.lang.String[][] transformedColNames, ANOVAGLMModel.ANOVAGLMParameters parms)
predNums:
- number of all predictor combos
allCols:
- Frame containing all transformed columns
transformedColNames:
- transformed predictor combo arrays containing only predictor combos for a specific
training dataset. Recall that models are built with one predictor combo left out. This
is to generate that training frame with a specific predictor combo left out.
parms:
- AnovaGLMParameters
public static GLMModel.GLMParameters[] buildGLMParameters(water.fvec.Frame[] trainingFrames, ANOVAGLMModel.ANOVAGLMParameters parms)
public static double[] generateGLMSS(GLMModel[] glmModels, GLMModel.GLMParameters.Family family)
glmModels:
-
family:
-
public static GLM[] buildGLMBuilders(GLMModel.GLMParameters[] glmParams)
public static void generatePredictorNames(java.lang.String[][] predComboNames, java.lang.String[][] predictorNames, int[] predColumnStart, int[] degreeOfFreedom, DataInfo dinfo)
predComboNames:
- string array containing all predictor combos and for each combo, all the predictor names
involved in generating the interactions.
predictorNames:
- string array containing all predictor names and for each combo, all the predictor names
involved in generating the interactions.
predColumnStart:
- column of each predictor combo after the frame transformation.
degreeOfFreedom:
- degree of freedom for each predictor combo
dinfo:
-
public static int updateDOFColInfo(int predInd, java.lang.String[] predComboNames, int[] dof, int[] predCS, int offset)
public static int findComboMatch(java.lang.String[][] predComboNames, int currIndex)
public static java.lang.String[] combineAndFlat(java.lang.String[][] predictComboNames)
public static java.lang.String[] transformMultipleCols(water.fvec.Frame vec2Transform, java.lang.String[][] predComboNames, int currIndex, java.lang.String[][] predNames)
public static java.lang.String[] transformTwoCols(water.fvec.Frame vec2Transform, java.lang.String[] vecNames, java.lang.String[] lastComboNames)
vec2Transform:
- frame containing the two predictors to transform
vecNames:
- name of the predictors
lastComboNames:
- predictor combo names of the second vector if applicable. This is used to transform
more than two predictors
public static java.lang.String[] transformOneCol(water.fvec.Frame vec2Transform, java.lang.String vecName)
vec2Transform:
- frame containing that one predictor to transform.
vecName:
- name of predictor
public static java.lang.String[] generateModelNames(java.lang.String[][] predictComboNames)