I get this error only when running through spark-submit.sh with Apache Spark; the same code runs fine in IntelliJ with a normal run configuration, so I'm fairly convinced this is something about how Spark wants to access the constructor, which we want to keep private:
Class org.apache.spark.deploy.SparkSubmit$ can not access a member of class jpsgcs.thold.AnyOldClass with modifiers "public static"
Here's the MVCE:
import java.io.IOException;
import java.io.Serializable;

class AnyOldClass implements Serializable {
    public String anyOldString = null;

    private AnyOldClass() throws IOException {
        anyOldString = "hello dere";
    }

    public static void main(String[] args) throws Exception {
        AnyOldClass anyOldInstance = new AnyOldClass();
        anyOldInstance.go();
    }

    private void go() {
        System.out.println("Visualize ");
    }
}
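For context, spark-submit invokes the application's main method reflectively, so what matters is what a reflective caller in another package can see. A small self-contained sketch of inspecting those modifiers (DemoClass and VisibilityCheck are hypothetical stand-ins, not Spark code; DemoClass mirrors the MVCE's package-private class with a public static main):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for the question's class:
// package-private class, public static main.
class DemoClass {
    public static void main(String[] args) {}
}

public class VisibilityCheck {
    public static void main(String[] args) throws Exception {
        Class<?> cls = Class.forName("DemoClass");
        Method mainMethod = cls.getMethod("main", String[].class);
        // The method itself carries the "public static" modifiers
        // quoted in the error message...
        System.out.println("main modifiers: "
                + Modifier.toString(mainMethod.getModifiers()));
        // ...but the declaring class is package-private, which is part
        // of what a reflective caller's access check looks at.
        System.out.println("class is public: "
                + Modifier.isPublic(cls.getModifiers()));
    }
}
```

Running this prints "main modifiers: public static" and "class is public: false".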
Happy to post the full version of the error if needed. This MVCE started as a fully functioning Spark-based program that worked fine until we made the constructor private, so the absence of SparkContext and SparkConf is not the issue.
We have a class that inherits through a few levels. To make an RDD of that class, we had to go about five levels up the inheritance chain and make those levels Serializable. Serializing such a deep stack is going to get ugly, right? (And that's before we even try Kryo.)
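On the deep-hierarchy point: with plain java.io serialization, a subclass inherits Serializable from its ancestors, so marking the topmost level is normally enough. A minimal sketch of a round trip through a three-level chain, with Level1 through Level3 as hypothetical stand-ins for the real classes:

```java
import java.io.*;

// Hypothetical stand-ins for the real hierarchy: only the topmost
// ancestor implements Serializable; the subclasses inherit it.
class Level1 implements Serializable { int a = 1; }
class Level2 extends Level1 { int b = 2; }
class Level3 extends Level2 { int c = 3; }

public class DeepSerialize {
    // Serialize to bytes and back, roughly what happens when an
    // object is shipped to an executor.
    static Level3 roundTrip(Level3 in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(in);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return (Level3) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Level3 original = new Level3();
        original.c = 42;
        Level3 copy = roundTrip(original);
        // Fields from every level of the chain survive the round trip.
        System.out.println(copy.a + " " + copy.b + " " + copy.c); // prints "1 2 42"
    }
}
```

The caveat is any non-serializable ancestor above the Serializable mark: its fields are not serialized, and it must have an accessible no-arg constructor for deserialization to work.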
We think a better approach is to run one worker per core, giving one JVM per core. In each JVM we want a single instance of this class. Then we'll parallelize another class into a JavaRDD, one partition per core/JVM, and have each element of each partition modify that per-JVM instance in place.
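The one-instance-per-JVM piece can be sketched without any Spark dependencies. PerJvmState and processPartition are hypothetical names: on a real cluster, processPartition would be the body of the function passed to mapPartitions, and each executor JVM would hold its own copy of the static singleton (locally, in a single JVM, all partitions share one instance):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical per-JVM holder: static state is JVM-local, so each
// executor JVM on a cluster would get its own instance.
class PerJvmState {
    static final PerJvmState INSTANCE = new PerJvmState();
    final List<String> contents = new ArrayList<>();
    private PerJvmState() {}
}

public class PartitionSketch {
    // Stand-in for the function handed to mapPartitions: each element
    // of the partition mutates the JVM-local singleton in place.
    static void processPartition(List<Integer> partition) {
        for (int element : partition) {
            PerJvmState.INSTANCE.contents.add("seen " + element);
        }
    }

    public static void main(String[] args) {
        processPartition(Arrays.asList(1, 2, 3));
        System.out.println(PerJvmState.INSTANCE.contents.size()); // prints "3"
    }
}
```

One design note: mutations to such a singleton stay in the executor JVM; they are not visible to the driver or other executors, so results still have to flow back through the RDD itself (or an accumulator).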