import org.apache.spark.sql.types.StructField
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.types.StringType
import org.apache.spark.sql.types.IntegerType
import org.apache.spark.sql.types.NumericType
import org.apache.spark.sql.types.BooleanType
....
....
val TableSchema = Array(
  ("ID", IntegerType),
  ("Name", StringType),
  ("TNum", IntegerType),
  ("Handled", BooleanType),
  ("Value", StringType)
)
I have an array describing a table's schema, and I am trying to map it to an array of StructField that can be used when creating a Spark DataFrame. After the transformation, the array should look like this:
val struct = Array(
  StructField("ID", IntegerType),
  StructField("Name", StringType),
  StructField("TNum", IntegerType),
  StructField("Handled", BooleanType),
  StructField("Value", StringType))
So I am trying to write a method that converts each element to a StructField. This is my attempt:
def mapToStruct(arr: Array[(String, String, Object)]) = {
  val newArr = arr.map(ele => StructField(ele._1, ele._2))
  newArr
}
In this situation, I cannot get the class of StringType, BooleanType or IntegerType from the third parameter of the method mapToStruct. The error I get is: type mismatch; found : Object required: org.apache.spark.sql.types.DataType. But if I change the parameter type to Array[(String, String, DataType)], it no longer matches the type of the variable.
My question is: what data type should I choose for the third parameter of mapToStruct so that I can get the class of this object at run time?
Thanks in advance.
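For reference, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the schema is held as two-element tuples, as in TableSchema, and that the array is explicitly annotated as Array[(String, DataType)]. Without the annotation, the compiler infers a narrower element type, and since Scala arrays are invariant, that is likely why the variable did not match the DataType signature. The names SchemaSketch, tableSchema, schema and describe are hypothetical and exist only for illustration.

```scala
import org.apache.spark.sql.types.{BooleanType, DataType, IntegerType, StringType, StructField, StructType}

object SchemaSketch {
  // Annotating the element type as DataType keeps the Spark type information
  // and makes the array match the parameter type of mapToStruct below.
  val tableSchema: Array[(String, DataType)] = Array(
    ("ID", IntegerType),
    ("Name", StringType),
    ("TNum", IntegerType),
    ("Handled", BooleanType),
    ("Value", StringType)
  )

  // Convert each (name, type) pair into a StructField (nullable by default).
  def mapToStruct(arr: Array[(String, DataType)]): Array[StructField] =
    arr.map { case (name, dataType) => StructField(name, dataType) }

  // The resulting fields can be wrapped in a StructType for DataFrame creation.
  val schema: StructType = StructType(mapToStruct(tableSchema))

  // Because each element is a DataType, the concrete type can be matched at run time.
  def describe(dataType: DataType): String = dataType match {
    case IntegerType => "integer"
    case StringType  => "string"
    case BooleanType => "boolean"
    case other       => other.typeName
  }
}
```

If the schema really does need three-element tuples, the same idea should apply with Array[(String, String, DataType)], provided the variable is annotated with that type as well.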