Interface Word2VecBase

All Superinterfaces:
HasInputCol, HasMaxIter, HasOutputCol, HasSeed, HasStepSize, Identifiable, Params, Serializable
All Known Implementing Classes:
Word2Vec, Word2VecModel

public interface Word2VecBase extends Params, HasInputCol, HasOutputCol, HasMaxIter, HasStepSize, HasSeed
Params for Word2Vec and Word2VecModel.
  • Method Details

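    • getMaxSentenceLength

      int getMaxSentenceLength()
      Returns the current value of the maxSentenceLength param.
    • getMinCount

      int getMinCount()
      Returns the current value of the minCount param.
    • getNumPartitions

      int getNumPartitions()
      Returns the current value of the numPartitions param.
    • getVectorSize

      int getVectorSize()
      Returns the current value of the vectorSize param.
    • getWindowSize

      int getWindowSize()
      Returns the current value of the windowSize param.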
    • maxSentenceLength

      IntParam maxSentenceLength()
      Param for the maximum length (in words) of each sentence in the input data. Any sentence longer than this threshold is divided into chunks of at most maxSentenceLength words. Default: 1000
      Returns:
      the maxSentenceLength param
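The chunking rule described for maxSentenceLength can be sketched in plain Java as follows. This is a minimal illustration, not Spark's implementation: the class SentenceChunkingSketch and the method chunk are hypothetical names introduced here, and the sketch assumes the sentence is already tokenized into a list of words.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: it only illustrates the chunking rule.
final class SentenceChunkingSketch {

    // Splits a tokenized sentence into consecutive chunks of at most maxSentenceLength words.
    static List<List<String>> chunk(List<String> sentence, int maxSentenceLength) {
        if (maxSentenceLength <= 0) {
            throw new IllegalArgumentException("maxSentenceLength must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < sentence.size(); start += maxSentenceLength) {
            int end = Math.min(start + maxSentenceLength, sentence.size());
            chunks.add(new ArrayList<>(sentence.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("a", "b", "c", "d", "e");
        // With a limit of 2, the five words fall into chunks of sizes 2, 2 and 1.
        System.out.println(chunk(sentence, 2)); // prints [[a, b], [c, d], [e]]
    }
}
```

A sentence no longer than the limit comes back as a single chunk, so under this reading the default of 1000 leaves most ordinary sentences untouched.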
    • minCount

      IntParam minCount()
      The minimum number of times a token must appear to be included in the word2vec model's vocabulary. Default: 5
      Returns:
      the minCount param
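As a rough illustration of the minCount threshold, the sketch below counts tokens across tokenized sentences and keeps only those that appear at least minCount times. The class MinCountVocabularySketch and the method vocabulary are hypothetical names, and the code is an assumption about the described behavior rather than Spark's own vocabulary-building code.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Spark: it only illustrates the minCount filter.
final class MinCountVocabularySketch {

    // Counts every token and keeps those that occur at least minCount times.
    static Map<String, Integer> vocabulary(List<List<String>> sentences, int minCount) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> sentence : sentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        // Drop tokens whose frequency falls below the threshold.
        counts.values().removeIf(count -> count < minCount);
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = List.of(
                List.of("spark", "ml", "spark"),
                List.of("ml", "spark", "rare"));
        // "rare" occurs once, so it is excluded when minCount is 2.
        System.out.println(vocabulary(sentences, 2)); // prints {ml=2, spark=3}
    }
}
```

On small corpora the default of 5 can exclude most tokens, so a lower value may be needed for toy examples.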
    • numPartitions

      IntParam numPartitions()
      Number of partitions used to split the input sentences (sequences of words) for training. Default: 1
      Returns:
      the numPartitions param
    • validateAndTransformSchema

      StructType validateAndTransformSchema(StructType schema)
      Validates the input schema and returns the transformed schema.
      Parameters:
      schema - the input schema
      Returns:
      the transformed (output) schema
    • vectorSize

      IntParam vectorSize()
      The dimension of the vector that each word is transformed into. Default: 100
      Returns:
      the vectorSize param
    • windowSize

      IntParam windowSize()
      The window size: context words are taken from positions [-window, window] relative to the current word. Default: 5
      Returns:
      the windowSize param
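The [-window, window] range can be made concrete with a small plain-Java sketch that collects the words within windowSize positions of a given word, clipped to the sentence bounds. The class ContextWindowSketch and the method contextWords are hypothetical names. The sketch shows only the maximum span; word2vec training implementations often sample a smaller effective window per position, which is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: it only illustrates the context window span.
final class ContextWindowSketch {

    // Returns the words at offsets -windowSize..windowSize around position, excluding the word itself.
    static List<String> contextWords(List<String> sentence, int position, int windowSize) {
        if (position < 0 || position >= sentence.size()) {
            throw new IndexOutOfBoundsException("position: " + position);
        }
        if (windowSize < 0) {
            throw new IllegalArgumentException("windowSize must be non-negative");
        }
        int from = Math.max(0, position - windowSize);
        int to = Math.min(sentence.size() - 1, position + windowSize);
        List<String> context = new ArrayList<>();
        for (int i = from; i <= to; i++) {
            if (i != position) {
                context.add(sentence.get(i));
            }
        }
        return context;
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("the", "quick", "brown", "fox", "jumps");
        // Window of 1 around "brown" (index 2) covers its immediate neighbours.
        System.out.println(contextWords(sentence, 2, 1)); // prints [quick, fox]
    }
}
```

Near the start or end of a sentence the window is truncated, so the first word with a window of 2 sees only the two words that follow it.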