Runtime configuration interface for this SparkSession.
The RuntimeConfig instance for getting/setting Spark configuration
Returns the session ID for this SparkSession.
The unique session identifier
Executes a SQL query and returns the result as a DataFrame.
The SQL query string to execute
A promise that resolves to a DataFrame containing the query results
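A minimal usage sketch (assumes an already-created `spark` session; the query itself is illustrative):

```typescript
// `spark` is assumed to be an existing SparkSession.
const df = await spark.sql("SELECT id, id * 2 AS doubled FROM range(5)");
await df.show();
```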
Creates a DataFrame from an array of Row objects with a specified schema.
An array of Row objects
The schema defining the structure of the DataFrame
A new DataFrame containing the provided data
const schema = new StructType([
  new StructField("name", DataTypes.StringType),
  new StructField("age", DataTypes.IntegerType)
]);
const data = [
  new Row(schema, { name: "Alice", age: 30 }),
  new Row(schema, { name: "Bob", age: 25 })
];
const df = spark.createDataFrame(data, schema);
await df.show();
Creates a DataFrame from an Apache Arrow Table.
An Apache Arrow Table object
The Spark schema defining the structure of the DataFrame
A new DataFrame containing the Arrow table data
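A sketch of building an Arrow table and passing it in, assuming the `apache-arrow` package's `tableFromArrays` helper and that this entry point is an overload of `createDataFrame` (the actual method name may differ):

```typescript
import { tableFromArrays } from "apache-arrow";

// Build a two-column Arrow table in memory.
const arrowTable = tableFromArrays({
  name: ["Alice", "Bob"],
  age: Int32Array.from([30, 25]),
});

// `schema` is the matching Spark StructType; the overload shown is assumed.
const df = spark.createDataFrame(arrowTable, schema);
```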
Creates a DataFrame with a single column of Long values from 0 (inclusive) to end (exclusive).
The end value (exclusive)
A DataFrame with a single column named "id" containing values from 0 to end-1
Creates a DataFrame with a single column of Long values.
The start value (inclusive)
The end value (exclusive)
A DataFrame with a single column named "id" containing values from start to end-1
Creates a DataFrame with a single column of Long values with a custom step.
The start value (inclusive)
The end value (exclusive)
The increment between consecutive values
A DataFrame with a single column named "id"
Creates a DataFrame with a single column of Long values with custom step and partitions.
The start value (inclusive)
The end value (exclusive)
The increment between consecutive values
The number of partitions for the resulting DataFrame
A DataFrame with a single column named "id"
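All four range overloads share the same semantics: the "id" column holds every value v with start <= v < end, stepping by step. That can be modeled in plain TypeScript (`rangeIds` is an illustrative helper, not part of the API):

```typescript
// Illustrative helper modeling the values in the "id" column produced by
// range(); not part of the SparkSession API.
function rangeIds(start: number, end: number, step = 1): number[] {
  const ids: number[] = [];
  for (let v = start; v < end; v += step) ids.push(v);
  return ids;
}

rangeIds(0, 5);      // "id" column of range(5): 0, 1, 2, 3, 4
rangeIds(2, 10, 3);  // "id" column of range(2, 10, 3): 2, 5, 8
```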
Returns a DataFrameReader for reading data from external sources.
A DataFrameReader instance
Returns the specified table as a DataFrame.
The name of the table to load
A DataFrame representing the table
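A sketch of both read paths, assuming an existing `spark` session; the file path, format string, and table name are illustrative, and whether these calls return promises is an assumption:

```typescript
// Read an external file via the DataFrameReader.
const events = await spark.read.format("parquet").load("/data/events.parquet");

// Load a table registered in the catalog by name.
const users = await spark.table("users");
```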
The entry point to programming Spark with the Dataset and DataFrame API.
Remarks
A SparkSession can be used to create DataFrames, register DataFrames as tables, execute SQL over tables, cache tables, and read data from various sources. A SparkSession is created using the SparkSession.builder method.
The SparkSession provides access to runtime configuration, SQL execution, DataFrame creation, and data reading from external sources.
Example
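A minimal end-to-end sketch; the builder method names (`remote`, `getOrCreate`) and the connection string are assumptions based on typical Spark Connect clients:

```typescript
// Builder API shown here is assumed; consult the client's builder docs.
const spark = await SparkSession.builder
  .remote("sc://localhost:15002")
  .getOrCreate();

const df = await spark.sql("SELECT 1 AS one");
await df.show();
```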
Since
1.0.0