Pandas: Initializing from file failed

Created on 9 Jul 2018 · 12 Comments · Source: pandas-dev/pandas

Kindly share your insights regarding the following error:

  File "source\matrix_gen.py", line 68, in <module>
    main()
  File "source\matrix_gen.py", line 22, in main
    data = pd.read_csv(sys.argv[1], parse_dates=True, dayfirst=True)# argv[1]: stock_symbol.txt
  File "C:\Users\aims\Anaconda3\envs\py35\lib\site-packages\pandas\io\parsers.py", line 678, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\aims\Anaconda3\envs\py35\lib\site-packages\pandas\io\parsers.py", line 440, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\aims\Anaconda3\envs\py35\lib\site-packages\pandas\io\parsers.py", line 787, in __init__
    self._make_engine(self.engine)
  File "C:\Users\aims\Anaconda3\envs\py35\lib\site-packages\pandas\io\parsers.py", line 1014, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\aims\Anaconda3\envs\py35\lib\site-packages\pandas\io\parsers.py", line 1708, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 384, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 697, in pandas._libs.parsers.TextReader._setup_parser_source
OSError: Initializing from file failed

Duplicate

JafferWilson

Most helpful comment

I ran into a similar issue with a Jupyter notebook downloaded from the internet with an accompanying CSV. It turned out the CSV had no read permissions (sigh), and pandas gave no hint as to that underlying cause. I'm not sure whether the same thing is happening to you, @JafferWilson (I forget how permissions work on Windows), but I figured I'd post this in case anyone else hits a similar issue.

breckuh on 11 Sep 2018

👍3
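
To make the permissions point concrete, here is a minimal sketch of probing a file before handing it to pandas, so that the operating system's own error (for example a permission problem) is surfaced. The helper name `probe_open` is hypothetical and not part of pandas; it only uses the built-in `open`.

```python
import sys


def probe_open(path):
    """Try to open `path` for reading.

    Returns None if the file opens, otherwise a short string naming the
    OSError subclass (e.g. FileNotFoundError, PermissionError) and message.
    """
    try:
        with open(path, "rb"):
            return None
    except OSError as exc:
        return "{}: {}".format(type(exc).__name__, exc)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: check the same argument the script passes to read_csv.
    reason = probe_open(sys.argv[1])
    print("OK" if reason is None else reason)
```

If the probe reports an error, the cause is likely at the file-system level rather than in the CSV contents; if it succeeds, the problem lies elsewhere.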

All 12 comments

Could you provide a minimally reproducible example?

http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

mroeschke on 9 Jul 2018

Sure. I am trying to run this repository: https://github.com/hardyqr/CNN-for-Stock-Market-Prediction-PyTorch
From it, I tried running this file: https://github.com/hardyqr/CNN-for-Stock-Market-Prediction-PyTorch/blob/master/source/matrix_gen.py
I downloaded the CSV samples from the Kaggle sources mentioned in the repository's README file.
I hope this provides the required information.
Please let me know if anything more is needed.

JafferWilson on 10 Jul 2018

From your traceback, the issue appears to occur when reading in the file:

data = pd.read_csv(sys.argv[1], parse_dates=True, dayfirst=True)# argv[1]: stock_symbol.txt

You may want to make sure that the path to the file you are passing in is fully specified.

mroeschke on 10 Jul 2018
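
As a rough illustration of the "fully specified path" advice, the sketch below reports how a relative argument resolves from the current working directory. The helper name `describe_path` is hypothetical; it relies only on `os.path` functions from the standard library.

```python
import os
import sys


def describe_path(raw_path):
    """Return basic facts about how `raw_path` resolves on this machine."""
    absolute = os.path.abspath(raw_path)
    return {
        "cwd": os.getcwd(),                    # directory the script was started from
        "absolute": absolute,                  # path the reader would actually see
        "exists": os.path.exists(absolute),
        "is_file": os.path.isfile(absolute),   # False for directories
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    for key, value in describe_path(sys.argv[1]).items():
        print(key, "=", value)
```

Printing this before the `pd.read_csv` call shows whether the argument points at an existing regular file or at something else, such as a directory.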

Yes, that's true. I have specified everything properly and the path looks correct, but the issue persists.

JafferWilson on 10 Jul 2018

Can you provide the details of pd.show_versions()?

This _may_ be a duplicate of #15086, but it's hard to tell since your example is not an MCVE.

mroeschke on 10 Jul 2018

This is the output I received after running the command:

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.0.final.0
python-bits: 64
OS: Windows
OS-release:
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: None
pip: 10.0.1
setuptools: 39.2.0
Cython: 0.28.2
numpy: 1.14.5
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

JafferWilson on 11 Jul 2018

Based on your configuration and similar traceback, this may be an issue reading files with Windows as described in #15086. Tentatively closing as a duplicate of that open issue.

Looks like there are some suggestions to resolve that issue; a partial workaround for now seems to be here: https://github.com/pandas-dev/pandas/issues/15086#issuecomment-357481412

mroeschke on 11 Jul 2018

I have gone through the comments and suggestions in that thread, my friend, but none of them helped. If there is no solution, then I can't say anything. Thank you for closing the issue and marking it as a duplicate.

JafferWilson on 11 Jul 2018

If you feel that your issue is different from #15086, feel free to reopen this thread with a representative example. We hope that issue can be resolved soon through a community contribution.

Otherwise, you can also try using StackOverflow.

mroeschke on 11 Jul 2018

@mroeschke Thank you. I will try to solve it on my own this time, as I got no reply about it on Stack Overflow; I have already asked there. Otherwise, I guess I will have to move to Linux instead of Windows.

JafferWilson on 11 Jul 2018

breckuh on 11 Sep 2018

👍3

import os
import pandas as pd

txtPath = 'E:/Defensive/textlog'
txtLists = os.listdir(txtPath)
# txtLists = os.path.isdir(txtPath)
for txt in txtLists:
    file_name = txtPath + "/" + txt
    reader = pd.read_table(file_name, sep=",", names=[i for i in range(4)], iterator=True)

Error: OSError: Initializing from file failed
Why?
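
One possible cause, not confirmed anywhere in this thread: `os.listdir` also returns subdirectories, and handing a directory to the reader may fail with this error in pandas versions of that era. A minimal sketch of listing only regular files follows; the helper name `regular_files` is hypothetical.

```python
import os


def regular_files(directory):
    """Return the sorted full paths of regular files directly inside `directory`.

    Subdirectories are skipped, so each returned path can be opened as a file.
    """
    with os.scandir(directory) as entries:
        return sorted(entry.path for entry in entries if entry.is_file())
```

Iterating over `regular_files(txtPath)` instead of `os.listdir(txtPath)` would rule out directories as the cause; if the error persists, permissions or the Windows-specific behaviour discussed in #15086 are the next things to check.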