The “h2o.splitFrame” Function in R

  • Package: h2o

  • Purpose: Splits an H2OFrame object into two or more subsets (for example, train and test sets) according to the given ratios.

  • General class: Data manipulation

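  • Required argument(s):

    • data: The H2OFrame object to be split.

  • Notable optional arguments:

    • ratios: A numeric vector of split ratios, each between 0 and 1 and summing to less than 1. The function returns one more frame than there are ratios, and the last frame receives the remaining rows. Defaults to 0.75.

    • destination_frames: A character vector of names (keys) for the resulting frames, one per returned frame (the number of ratios plus one).

    • seed: An integer random seed, set to make the split reproducible.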

  • Example:

    # Load the h2o library
    library(h2o)

    # Initialize h2o
    h2o.init()

    # Import data
    data <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/iris/iris_wheader.csv")

    # Split the data into train and test sets
    split <- h2o.splitFrame(data, ratios = 0.8, seed = 123)
    train <- h2o.assign(split[[1]], "train")
    test <- h2o.assign(split[[2]], "test")

    # View the dimensions of the train and test sets
    h2o.dim(train)
    h2o.dim(test)

  • This example splits an H2OFrame object into train and test sets using the h2o.splitFrame function from the h2o package. With ratios = 0.8, roughly 80% of the rows go to the first frame and the remainder to the second; the split is probabilistic, so the row counts are approximate rather than exact. Setting seed makes the split reproducible, and destination_frames could be supplied to name the resulting frames directly instead of calling h2o.assign afterwards. Finally, h2o.dim reports the dimensions of the train and test sets.
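
  • As a further sketch of the optional arguments described above, the same data can be split three ways (train, validation, and test) by passing two ratios and naming the results with destination_frames. The frame names "iris_train", "iris_valid", and "iris_test" are illustrative choices, not names required by h2o, and the code assumes a local H2O cluster can be started.

```r
# Load and initialize h2o
library(h2o)
h2o.init()

# Import the same iris data used above
data <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/iris/iris_wheader.csv")

# Two ratios produce three frames: about 70% train, about 15% validation,
# and the remaining rows as test. The frame names are hypothetical.
splits <- h2o.splitFrame(
  data,
  ratios = c(0.7, 0.15),
  destination_frames = c("iris_train", "iris_valid", "iris_test"),
  seed = 123
)

train <- splits[[1]]
valid <- splits[[2]]
test  <- splits[[3]]

# Row counts per frame are approximate, but together they cover every row
sapply(splits, nrow)
nrow(data)
```

    Because the frames are named at creation, no separate h2o.assign call is needed in this variant.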
