ArrayType

class pyspark.sql.types.ArrayType(elementType, containsNull=True)

Array data type.

Parameters
elementType : DataType

DataType of each element in the array.

containsNull : bool, optional

Whether the array can contain null (None) values. Defaults to True.

Examples

>>> from pyspark.sql.types import ArrayType, StringType, StructField, StructType

The example below demonstrates how to create an ArrayType:

>>> arr = ArrayType(StringType())

The array can contain null (None) values by default:

>>> ArrayType(StringType()) == ArrayType(StringType(), True)
True
>>> ArrayType(StringType(), False) == ArrayType(StringType())
False

Methods

fromDDL(ddl)

Creates DataType for a given DDL-formatted string.

fromInternal(obj)

Converts an internal SQL object into a native Python object.

fromJson(json[, fieldPath, collationsMap])

json()

jsonValue()

needConversion()

Whether this type needs conversion between a Python object and an internal SQL object.

simpleString()

toInternal(obj)

Converts a Python object into an internal SQL object.

toNullable()

Returns the same data type with all nullability fields set to true (StructField.nullable, ArrayType.containsNull, and MapType.valueContainsNull).

typeName()

Methods Documentation

classmethod fromDDL(ddl)

Creates DataType for a given DDL-formatted string.

New in version 4.0.0.

Parameters
ddl : str

DDL-formatted string representation of types, e.g. as returned by pyspark.sql.types.DataType.simpleString, except that a top-level struct type can omit the struct<> wrapper for compatibility with spark.createDataFrame and Python UDFs.

Returns
DataType

Examples

Create a StructType from the corresponding DDL-formatted string.

>>> from pyspark.sql.types import DataType
>>> DataType.fromDDL("b string, a int")
StructType([StructField('b', StringType(), True), StructField('a', IntegerType(), True)])

Create a single DataType from the corresponding DDL-formatted string.

>>> DataType.fromDDL("decimal(10,10)")
DecimalType(10,10)

Create a StructType from the legacy string format.

>>> DataType.fromDDL("b: string, a: int")
StructType([StructField('b', StringType(), True), StructField('a', IntegerType(), True)])

fromInternal(obj)

Converts an internal SQL object into a native Python object.

classmethod fromJson(json, fieldPath='', collationsMap=None)

json()

jsonValue()
needConversion()

Whether this type needs conversion between a Python object and an internal SQL object.

This is used to avoid the unnecessary conversion for ArrayType/MapType/StructType.

simpleString()
toInternal(obj)

Converts a Python object into an internal SQL object.

toNullable()

Returns the same data type with all nullability fields set to true (StructField.nullable, ArrayType.containsNull, and MapType.valueContainsNull).

New in version 4.0.0.

Returns
ArrayType

Examples

Example 1: Simple nullability conversion

>>> from pyspark.sql.types import ArrayType, IntegerType, StructField, StructType
>>> ArrayType(IntegerType(), containsNull=False).toNullable()
ArrayType(IntegerType(), True)

Example 2: Nested nullability conversion

>>> ArrayType(
...     StructType([
...         StructField("b", IntegerType(), nullable=False),
...         StructField("c", ArrayType(IntegerType(), containsNull=False))
...     ]),
...     containsNull=False
... ).toNullable()
ArrayType(StructType([StructField('b', IntegerType(), True),
StructField('c', ArrayType(IntegerType(), True), True)]), True)

classmethod typeName()