# daft-dev
d
Pretty stumped on this one... any pythonistas know what's the difference between `getattr()` in Python 3.8 vs 3.10? With the Python 3.8 CI, I'm seeing this line fail:
if isinstance(array.type, getattr(pa, "FixedShapeTensorType", ())):
E   TypeError: isinstance() arg 2 must be a type or tuple of types
where `pa` here is an object `pa = LazyImport("pyarrow")`, with the `LazyImport` class having the following definition:
import importlib
from typing import Any


class LazyImport:
    """Lazy importer
    There are certain large imports (e.g. Ray, daft.unity_catalog.UnityCatalogTable, etc.) that
    do not need to be top-level imports. For example, Ray should only be imported when the ray
    runner is used, or specific ray data extension types are needed. We can lazily import these
    modules as needed.
    """

    def __init__(self, module_name: str):
        self._module_name = module_name
        self._module = None

    def module_available(self):
        return self._load_module() is not None

    def _load_module(self):
        if self._module is None:
            try:
                self._module = importlib.import_module(self._module_name)
            except ImportError:
                assert False
                pass
        return self._module

    def __getattr__(self, name: str) -> Any:
        # Attempt to access the attribute, if it fails, assume it's a submodule and lazily import it
        try:
            if name in self.__dict__:
                return self.__dict__[name]
            return getattr(self._load_module(), name)
        except AttributeError:
            # Dynamically create a new LazyImport instance for the submodule
            submodule_name = f"{self._module_name}.{name}"
            lazy_submodule = LazyImport(submodule_name)
            setattr(self, name, lazy_submodule)
            return lazy_submodule
This works fine in python 3.10...
s
Can you print the version of pyarrow for both?
💡 1
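One way to print what the interpreter actually imports in each CI job, as a rough sketch. `describe_module` is a made-up helper name, not something from the daft codebase; it only relies on `importlib.import_module` and the module's `__version__` and `__file__` attributes, either of which may be missing on some modules.

```python
import importlib
from typing import Any, Dict, Optional


def describe_module(name: str) -> Optional[Dict[str, Any]]:
    """Report the version and location of the module that really gets imported.

    Returns None if the module cannot be imported at all.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "name": name,
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        # __file__ shows which environment the module was loaded from.
        "file": getattr(module, "__file__", None),
    }


# Run this inside both the 3.8 and the 3.10 jobs and compare the output.
print(describe_module("pyarrow"))
```

The `file` entry is the useful part when more than one environment might be in play, since it shows where the import was resolved from.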
c
kinda unrelated, but isn't 3.8 EOL in a few weeks?
😬 2
s
Yeah, but some of our biggest users are on 3.8 iirc
j
FWIW works for me locally (Python 3.8.19):
Accessing a module that doesn’t exist gives a weird assertion error though, rather than correctly defaulting to empty tuple and running the `isinstance`:
d
Ahhh I think that can be fixed. `LazyImport` currently assumes missing attributes are submodules, but this is only true if the parent module was successfully imported. Fixing that
👍 1
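A minimal sketch of one way such a fix could look; this is an assumption, not necessarily the change that was actually made. The idea is to raise `AttributeError` unless the name really resolves to an importable submodule, so that `getattr(obj, name, default)` can fall back to its default. It uses `importlib.util.find_spec` to check for the submodule.

```python
import importlib
import importlib.util
from typing import Any


class LazyImport:
    """Hypothetical variant: missing attributes are only treated as submodules
    when the parent imported fine and the submodule can actually be found."""

    def __init__(self, module_name: str):
        self._module_name = module_name
        self._module = None

    def _load_module(self):
        if self._module is None:
            # Let ImportError propagate; __getattr__ translates it.
            self._module = importlib.import_module(self._module_name)
        return self._module

    def __getattr__(self, name: str) -> Any:
        try:
            module = self._load_module()
        except ImportError:
            # Parent is not installed, so nothing below it can exist either.
            raise AttributeError(
                f"module {self._module_name!r} is not available"
            ) from None
        try:
            return getattr(module, name)
        except AttributeError:
            submodule_name = f"{self._module_name}.{name}"
            try:
                spec = importlib.util.find_spec(submodule_name)
            except (ImportError, ValueError):
                # Raised e.g. when the parent is not a package.
                spec = None
            if spec is None:
                raise AttributeError(
                    f"module {self._module_name!r} has no attribute {name!r}"
                ) from None
            lazy_submodule = LazyImport(submodule_name)
            setattr(self, name, lazy_submodule)
            return lazy_submodule


# `math` stands in for a module that lacks the attribute.
print(getattr(LazyImport("math"), "FixedShapeTensorType", ()))  # prints ()
```

With this shape, `hasattr()` and the three-argument `getattr()` behave the way callers expect, while real submodules are still imported lazily.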
j
Also you can use something like this to get into the CI machines: https://github.com/marketplace/actions/debugging-with-ssh Might make debugging this easier if you can just hop in
💪 1
d
Ah, turns out it really was a pyarrow version issue. I trusted pip freeze, which told me that pyarrow 17 was being used, but didn't realize uv was overriding it and using pyarrow 7 instead
🤯 1