I am trying to design metadata-driven data pipelines, so I define things like the following in external textual metadata (JSON, YAML):
dataFunctionSequence = DataFunctionSequence(
    functions=[
        Function(
            functionClass="Function1",
            functionPackage="pipelines.experiments.meta_driven.function_1",
            parameters=[
                Parameter(name="param1", dataType="str", value="this is my function str value")
            ]
        )
    ]
)
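For illustration, the metadata above could be loaded from JSON roughly as follows. This is only a sketch: the dataclass definitions, the `load_sequence` helper and the JSON layout are assumptions, since the post shows only how the objects are constructed, not how they are defined or stored.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


# Hypothetical definitions of the metadata classes used in the post.
@dataclass
class Parameter:
    name: str
    dataType: str
    value: Any


@dataclass
class Function:
    functionClass: str
    functionPackage: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class DataFunctionSequence:
    functions: List[Function] = field(default_factory=list)


def load_sequence(text: str) -> DataFunctionSequence:
    """Build a DataFunctionSequence from a JSON string (assumed layout)."""
    raw = json.loads(text)
    return DataFunctionSequence(
        functions=[
            Function(
                functionClass=f["functionClass"],
                functionPackage=f["functionPackage"],
                parameters=[Parameter(**p) for p in f.get("parameters", [])],
            )
            for f in raw["functions"]
        ]
    )


EXAMPLE_JSON = """
{
  "functions": [
    {
      "functionClass": "Function1",
      "functionPackage": "pipelines.experiments.meta_driven.function_1",
      "parameters": [
        {"name": "param1", "dataType": "str", "value": "this is my function str value"}
      ]
    }
  ]
}
"""

sequence = load_sequence(EXAMPLE_JSON)
```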
Now, with the help of importlib, I can get the class:
functionMainClass = getattr(
    importlib.import_module(functionMeta.functionPackage),
    functionMeta.functionClass)
But I want a real instance, so I got it working with this piece of code, which uses the eval builtin function on top of importlib:
import importlib


def create_instance(class_str: str):
    """
    Create a class instance from a full path to a class constructor
    :param class_str: module name plus '.' plus class name and optional parens with arguments for the class's
        __init__() method. For example, "a.b.ClassB.ClassB('World')"
    :return: an instance of the class specified.
    """
    try:
        if "(" in class_str:
            # Split at the first '(' so parentheses inside the arguments are kept intact
            full_class_name, args = class_str.split('(', 1)
            args = '(' + args
        else:
            full_class_name = class_str
            # Must be a string, because it is concatenated with the alias below
            args = '()'
        # Get the class object
        module_path, _, class_name = full_class_name.rpartition('.')
        mod = importlib.import_module(module_path)
        klazz = getattr(mod, class_name)
        # Alias the class so its constructor can be called, see the following link.
        # See https://www.programiz.com/python-programming/methods/built-in/eval
        alias = class_name + "Alias"
        instance = eval(alias + args, {alias: klazz})
        return instance
    except (ImportError, AttributeError) as e:
        # Keep the original exception as the cause for easier debugging
        raise ImportError(class_str) from e
With this I can build a string that constructs the class, and it works like a charm.
Now my problem is that the class requires another parameter: a complex Spark DataFrame object, which is not loaded from the metadata but from a database or an S3 bucket, for example. This is where I am stuck: I cannot dynamically create the instance when one of the arguments is not a string.
I am failing here:
instance = eval(alias + args, { alias: klazz})
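One possible reason this line fails is that the expression refers to a name that is not in the namespace dict handed to eval. A minimal sketch of that idea, assuming a hypothetical `Function1` class with a `dataFrame` constructor parameter and a plain `object()` standing in for the Spark DataFrame:

```python
class Function1:
    """Hypothetical stand-in for the real pipeline function class."""

    def __init__(self, param1, dataFrame):
        self.param1 = param1
        self.dataFrame = dataFrame


# Stand-in for a Spark DataFrame loaded from a database or an S3 bucket.
data_frame = object()

alias = "Function1Alias"
# The expression refers to the object by name instead of embedding a literal.
args = "('this is my function str value', dataFrame=dataFrame)"

# Every name used in the expression has to be present in the namespace dict.
namespace = {alias: Function1, "dataFrame": data_frame}
instance = eval(alias + args, namespace)
```

This keeps the eval-based approach unchanged apart from the extra namespace entry. Note that eval executes whatever expression the metadata contains, so this only makes sense when the metadata is trusted.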
I tried to extend the create_instance() function with **kwargs, so that I can look up a parameter dynamically by name, e.g. kwargs["dataFrame"], but how do I pass it dynamically to __init__?
Is eval simply not the right tool here, or is my expression incorrect?
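Since a class object is itself callable, one option is to skip eval and call the class with unpacked keyword arguments. The following is a sketch under assumptions: the `Function1` class, the `demo_function_1` module name and this variant of `create_instance` are hypothetical, and a module is registered in `sys.modules` only so that the example runs on its own.

```python
import importlib
import sys
import types


class Function1:
    """Hypothetical stand-in for the real pipeline function class."""

    def __init__(self, param1, dataFrame=None):
        self.param1 = param1
        self.dataFrame = dataFrame


# Register a throwaway module so importlib can resolve it in this example.
demo_module = types.ModuleType("demo_function_1")
demo_module.Function1 = Function1
sys.modules["demo_function_1"] = demo_module


def create_instance(module_path, class_name, *args, **kwargs):
    """Import the module, look up the class, and call it with the given arguments."""
    klazz = getattr(importlib.import_module(module_path), class_name)
    return klazz(*args, **kwargs)


# Simple values would come from the metadata; the DataFrame comes from elsewhere.
metadata_params = {"param1": "this is my function str value"}
data_frame = object()  # stand-in for a Spark DataFrame

instance = create_instance(
    "demo_function_1", "Function1", **metadata_params, dataFrame=data_frame
)
```

With this shape, the metadata only needs to supply the module path, the class name and the simple parameters, while runtime objects are passed in as ordinary keyword arguments.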
NOTE: Another possible approach would be to somehow iterate over the __init__ method to get all the constructor parameters, but I still don't know which Python reflection function to use to create a real instance.
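The standard library's `inspect.signature` can list constructor parameters, which may be what this note is looking for. A minimal sketch, assuming a hypothetical `Function1` class and a hypothetical `instantiate` helper that picks matching values by name from a dict of available objects (positional-only parameters are not handled here):

```python
import inspect


class Function1:
    """Hypothetical stand-in for the real pipeline function class."""

    def __init__(self, param1, dataFrame):
        self.param1 = param1
        self.dataFrame = dataFrame


def instantiate(klazz, available):
    """Call klazz with only those entries of `available` that its constructor accepts."""
    # For a plain class, the signature reflects __init__ without `self`.
    signature = inspect.signature(klazz)
    kwargs = {}
    for name, param in signature.parameters.items():
        # Skip *args / **kwargs style parameters; they have no single value to look up.
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        if name in available:
            kwargs[name] = available[name]
        elif param.default is inspect.Parameter.empty:
            raise TypeError(f"missing value for constructor parameter {name!r}")
    return klazz(**kwargs)


data_frame = object()  # stand-in for a Spark DataFrame
available = {
    "param1": "this is my function str value",
    "dataFrame": data_frame,
    "unused": 123,  # ignored, because the constructor has no such parameter
}
instance = instantiate(Function1, available)
```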
Workaround: I can simply remove the dataFrame from the class constructor, create the instance from simple parameters only (e.g. strings), and then call a setter: instance.setDataFrame(dataFrame)
That works, but I would prefer a reflection-based approach if possible.
Thank you very much.
Ladislav