
0.15.2 causing problems with pandas.io.data.Options #22


Closed
aisthesis opened this issue Mar 22, 2015 · 9 comments · Fixed by #24

Comments

@aisthesis

I finally traced a problem I was having with options downloads to changes made between version 0.15.1 and version 0.15.2. Probably easiest is just to link the question I posed on Stack Overflow, because it shows the behavior: http://stackoverflow.com/questions/29182526/trouble-with-http-request-from-google-compute-engine

Weirdly, in 0.15.2, I was consistently able to get the options data for large cap companies ('aapl', 'ge' were my typical test cases) but not for small cap companies such as 'spwr' or 'ddd'. Not sure what was changed, but it looks to me like it might have to do with the list of expiration dates or with the handling of empty tables given an expiration date. Right now, in any case, if you hit the link shown in my stack trace (http://finance.yahoo.com/q/op?s=SPWR&date=1430438400), there's an empty table for puts and only 1 call. That would be something that's more common for smaller companies, too. The other possibility is that the initial Options object isn't getting good links in the newer version.

That's about all I know about it, but reverting to 0.15.1 seems to have solved the problems I was having.

@davidastephens
Member

Thanks. I'll take a look at this issue. Likely just need to add a check for empty tables.
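A minimal sketch of what an empty-table check could look like, assuming the per-expiry download is wrapped in a callable. This is not the code from #24: `NoDataError`, `collect_rows`, and `fetch` are hypothetical names used only for illustration, and plain lists stand in for the parsed tables.

```python
class NoDataError(Exception):
    """Hypothetical stand-in for a 'received no data' error."""


def collect_rows(expiry_dates, fetch):
    """Gather rows for every expiry, skipping expiries that have no data.

    `fetch` is an assumed callable: it takes one expiry and returns a list
    of rows, or raises NoDataError when the remote page has no table.
    """
    rows, skipped = [], []
    for expiry in expiry_dates:
        try:
            found = fetch(expiry)
        except NoDataError:
            # One empty expiry should not abort the whole download.
            skipped.append(expiry)
            continue
        if not found:
            # The table exists but contains no rows.
            skipped.append(expiry)
            continue
        rows.extend(found)
    if not rows:
        # Fail only when nothing at all came back.
        raise NoDataError("no data for any requested expiry")
    return rows, skipped
```

The idea is that a single expiry with an empty calls or puts table gets recorded and skipped, and an error is raised only when every expiry comes back empty.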

davidastephens added a commit to davidastephens/pandas-datareader that referenced this issue Mar 25, 2015
@davidastephens
Member

#24 should fix it. Please reopen if you still have issues.

@aisthesis
Author

Is this set up to get built with each new pandas release? Or is pandas-datareader intended to become a completely separate package? If pandas-datareader does get merged into pandas, how long do you expect it to take for a new pandas release to include your fix?

@jorisvandenbossche
Member

The intention is that pandas-datareader can have its own release schedule as a separate package, independent of pandas itself. The io.data module in pandas will then simply use whichever version of pandas-datareader you have installed.

@aisthesis
Author

I'm not sure I follow you here. In various spots (PyPI, the README at the root of pandas-datareader), it says that to use pandas-datareader you first have to install it, then in code you import using from pandas_datareader import data, wb. So, presumably where I have from pandas.io.data import Options in my code, I should write from pandas_datareader.data import Options. Then the name Options is going to be the class defined in whatever version of pandas-datareader I have.

But are you saying this: supposing I have pandas 0.5.1 and pandas-datareader 0.1 installed, then when I write from pandas.io.data import Options, the class definition in pandas-datareader 0.1 will be the one used, overriding the one in pandas 0.5.1? That doesn't seem like it should be the case, but maybe I'm missing something.

Also, what if you don't have pandas-datareader installed? Will future releases of pandas pick up the changes in pandas-datareader with some kind of lag? Or is pandas.io.data just frozen in its current state? If pandas.io.data isn't frozen, roughly how long will it take for a new release of pandas to have the latest changes in pandas-datareader?

@jorisvandenbossche
Member

We are still in the process of migrating it to a separate package, which is probably why this is not yet clear everywhere.
But indeed, the intention is that, to keep backwards compatibility, when you import it from pandas you will get the version from pandas-datareader (which then becomes a dependency of pandas if you want to use these functions). The version in pandas-datareader will, to start with, be the same as the one now in pandas, and in the future it will receive the bugfixes and improvements. After a transition period, the code in pandas will be removed and you will have to have pandas-datareader installed.
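During such a transition, downstream code can try the new package first and fall back to the old location. A minimal sketch, assuming both modules expose the same names; `load_first_available` is a hypothetical helper, not something pandas or pandas-datareader provides.

```python
import importlib


def load_first_available(candidates=("pandas_datareader.data", "pandas.io.data")):
    """Return the first module in `candidates` that can be imported.

    The default order prefers the separate package and falls back to the
    module bundled with pandas.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or already removed): try the next candidate.
            continue
    raise ImportError("none of %s could be imported" % ", ".join(candidates))


# Hypothetical usage; requires one of the two modules to be installed:
# data = load_first_available()
# Options = data.Options
```

Which module actually gets picked depends entirely on what is installed in the environment, so a library doing this would still need to state the versions it supports.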

@aisthesis
Author

Verified the fix for both Python 2.7 and Python 3.4. Thanks!

@aisthesis
Author

It looks like this issue needs to be re-opened. With pandas 0.15.1 the pull succeeds repeatedly, whereas the same pull fails with pandas_datareader 0.1.0:

>>> import pandas as pd
>>> pd.__version__
'0.15.1'
>>> from pandas.io.data import Options
>>> spwr = Options('spwr', 'yahoo')
>>> sopt = spwr.get_all_data()
>>>

But at the same time, in a virtual environment using pandas 0.16.0 and pandas_datareader 0.1.0, this operation fails with a message similar to the one I was getting before:

>>> import pandas_datareader as pdr
>>> pdr.__version__
'0.1.0'
>>> from pandas_datareader.data import Options
>>> spwr = Options('spwr', 'yahoo')
>>> sopt = spwr.get_all_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/marshallfarrier/venv/pnlocal/lib/python2.7/site-packages/pandas_datareader/data.py", line 1118, in get_all_data
    return self._get_data_in_date_range(dates=expiry_dates, call=call, put=put)
  File "/Users/marshallfarrier/venv/pnlocal/lib/python2.7/site-packages/pandas_datareader/data.py", line 1132, in _get_data_in_date_range
    frame = self._get_option_data(expiry=expiry_date, name=name)
  File "/Users/marshallfarrier/venv/pnlocal/lib/python2.7/site-packages/pandas_datareader/data.py", line 751, in _get_option_data
    frames = self._get_option_frames_from_yahoo(expiry)
  File "/Users/marshallfarrier/venv/pnlocal/lib/python2.7/site-packages/pandas_datareader/data.py", line 668, in _get_option_frames_from_yahoo
    option_frames = self._option_frames_from_url(url)
  File "/Users/marshallfarrier/venv/pnlocal/lib/python2.7/site-packages/pandas_datareader/data.py", line 705, in _option_frames_from_url
    raise RemoteDataError('Received no data from Yahoo at url: %s' % url)
pandas_datareader.data.RemoteDataError: Received no data from Yahoo at url: http://finance.yahoo.com/q/op?s=SPWR&date=1431043200
>>>

As before, at the time of the function call, there is no data in the calls table when you pull up the link given in the error message.

Please fix this for real, so I can unfreeze my pandas version for the dependent library I'm maintaining!

@davidastephens
Member

It's fixed in the master version of both pandas and pandas-datareader.

I'll do a new release of pandas-datareader so that the fix is available through pip.
