Sunday, 30 October 2016

Get the class of an Object at run time in Scala

import org.apache.spark.sql.types.StructField
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.types.StringType
import org.apache.spark.sql.types.IntegerType
import org.apache.spark.sql.types.BooleanType
   ....
   ....
val TableSchema = Array(
      ("ID", IntegerType),
      ("Name", StringType),
      ("TNum", IntegerType),
      ("Handled", BooleanType),
      ("Value", StringType)
      )

I have an array describing a table's schema, and I want to map it to an array of StructField values that can be used to create a Spark DataFrame. After the transformation, the array should look like this:

val struct = Array(
 StructField("ID", IntegerType),
 StructField("Name", StringType),
 StructField("TNum", IntegerType),
 StructField("Handled", BooleanType),
 StructField("Value", StringType))

So I am trying to write a method that converts each element to a StructField. This is my attempt:

    def mapToStruct(arr: Array[(String, String, Object)]) = {
      val newArr = arr.map(ele => StructField(ele._1, ele._3))
      newArr
    }

In this situation, I cannot get the class of StringType, BooleanType or IntegerType from the third element of each tuple passed to mapToStruct. The compiler error I get is: type mismatch; found : Object required: org.apache.spark.sql.types.DataType. But if I change the parameter type to Array[(String, String, DataType)], it no longer matches the type of the variable I pass in.
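A likely reason the DataType signature does not match is that Scala arrays are invariant, and without an annotation the compiler infers a narrower element type for the array literal than DataType. A minimal sketch, assuming the two-element tuples shown in the TableSchema listing (the names SchemaSketch and tableSchema are illustrative, not from the original code), annotates the array's type explicitly:

```scala
import org.apache.spark.sql.types.{BooleanType, DataType, IntegerType, StringType, StructField}

object SchemaSketch {
  // Annotating the element type makes the second element of every tuple a DataType,
  // instead of whatever narrower type the compiler would infer on its own.
  val tableSchema: Array[(String, DataType)] = Array(
    ("ID", IntegerType),
    ("Name", StringType),
    ("TNum", IntegerType),
    ("Handled", BooleanType),
    ("Value", StringType)
  )

  // Each (name, type) pair becomes a StructField; nullable keeps its default of true.
  def mapToStruct(arr: Array[(String, DataType)]): Array[StructField] =
    arr.map { case (name, dataType) => StructField(name, dataType) }

  val struct: Array[StructField] = mapToStruct(tableSchema)
}
```

With the annotation in place, the variable and the method parameter have the same array type, so no run-time class lookup should be needed.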

My question is: what type should I choose for the third tuple element in the parameter of mapToStruct, so that I can get the class of this object at run time?
Thanks in advance.
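If the third element really has to stay typed as Object, one option is to check its class at run time with a pattern match. This is only a sketch under assumptions: the middle strings ("Integer", "String", "Boolean") and the names RuntimeMatchSketch and rows are hypothetical placeholders, since the listing above does not show them.

```scala
import org.apache.spark.sql.types.{BooleanType, DataType, IntegerType, StringType, StructField}

object RuntimeMatchSketch {
  // Hypothetical input: the middle strings are placeholders, not taken from the original listing.
  val rows: Array[(String, String, Object)] = Array(
    ("ID", "Integer", IntegerType),
    ("Name", "String", StringType),
    ("Handled", "Boolean", BooleanType)
  )

  // The typed pattern checks at run time whether the third element is a DataType.
  // Entries that are not a DataType are silently dropped by collect.
  def mapToStruct(arr: Array[(String, String, Object)]): Array[StructField] =
    arr.collect { case (name, _, dt: DataType) => StructField(name, dt) }

  val struct: Array[StructField] = mapToStruct(rows)
}
```

Dropping non-matching entries is a design choice of this sketch; replacing collect with map and adding a fallback case that throws would surface bad input instead of hiding it.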




