As of pyarrow 2.0.0, pyarrow.hdfs.connect is deprecated in favor of pyarrow.fs.HadoopFileSystem, as seen in this GHA log:
home/runner/work/dvc/dvc/dvc/tree/pool.py:53: DeprecationWarning: pyarrow.hdfs.connect is deprecated as of 2.0.0, please use pyarrow.fs.HadoopFileSystem instead.
return self._conn_func(*self._conn_args, **self._conn_kwargs)
Here are the usages:
https://github.com/iterative/dvc/blob/e1b82c5222930c55886ca16a48c3a223d05b4af0/dvc/tree/hdfs.py#L51-L56
https://github.com/iterative/dvc/blob/ca2e6a004cb6268c7b62f79ba22b55c5b81d8984/tests/remotes/hdfs.py#L20
https://github.com/iterative/dvc/blob/ca2e6a004cb6268c7b62f79ba22b55c5b81d8984/tests/remotes/hdfs.py#L128
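For anyone picking this up, here is a rough sketch of what the swap might look like. The helper names (legacy_to_new_kwargs, connect) are made up for illustration and are not DVC code. It assumes the call sites only pass host, port, user and kerb_ticket, and that HadoopFileSystem accepts those same keywords, which is how I read the pyarrow docs.

```python
def legacy_to_new_kwargs(host="default", port=0, user=None, kerb_ticket=None):
    """Map pyarrow.hdfs.connect-style arguments to HadoopFileSystem keywords.

    Hypothetical helper: optional values are only forwarded when set,
    so the new constructor's own defaults apply otherwise.
    """
    kwargs = {"host": host, "port": port}
    if user is not None:
        kwargs["user"] = user
    if kerb_ticket is not None:
        kwargs["kerb_ticket"] = kerb_ticket
    return kwargs


def connect(**legacy_args):
    """Hypothetical drop-in for the deprecated pyarrow.hdfs.connect call."""
    # Imported lazily so this module still loads without pyarrow/libhdfs.
    from pyarrow.fs import HadoopFileSystem

    return HadoopFileSystem(**legacy_to_new_kwargs(**legacy_args))
```

Note that the new filesystem object exposes a different interface from the legacy one (for example open_input_stream / open_output_stream / get_file_info), so the code that uses the connection would probably need updating as well, not just the connect call.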
If no one's working on it, I'd like to try doing this issue.
@jwokaty Sure, please feel free to give it a shot. Thanks for looking into it! :pray:
@efiop fixed at #4936
@kgritesh, it would have been better if you'd commented here before creating a PR, given that @jwokaty was interested in working on this.
@skshetry / @jwokaty sorry for this. I somehow missed the comment that someone else has picked it up.