List papers citing a paper #5

Open
ckreibich opened this issue Feb 17, 2014 · 12 comments

@ckreibich
Owner

Xavi Anguera has suggested making the list of papers citing a paper queryable via the API. This needs a bit more thinking about the notion of paper identity (cluster ID) vs presentation to the user, but shouldn't be a big problem otherwise.

@ckreibich ckreibich added the wish label Feb 17, 2014
@rpgoldman

I would love to see this, too. I need to analyze the set of citations to a particular article, and it is very painful to do this manually, screenful by screenful.

@jooweg

jooweg commented May 7, 2014

Me too, this would make scholar.py an incredibly powerful tool!


@arcolife

arcolife commented May 7, 2014

👍 The thing is, how do we counter the API query limits? I sometimes wish Google provided free access to its repository, just like arXiv. Is setting up Tor a good idea?

@rpgoldman

I'm willing to live with reasonable throttling. E.g., 250 articles (which I just pulled by hand) is a horrible nuisance for me, but probably in the noise for Google, especially if I do it once in a blue moon.
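
A minimal sketch of what "reasonable throttling" could look like on the client side, assuming the caller already has a list of page URLs to fetch. The helper name `throttled` and the interval value are hypothetical and not part of scholar.py; the fetch itself is left out.

```python
import time

def throttled(items, min_interval, sleep=time.sleep, clock=time.monotonic):
    """Yield items so that consecutive yields are at least min_interval seconds apart.

    Hypothetical helper for illustration; sleep and clock are injectable
    so the pacing can be tested without actually waiting.
    """
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item
```

Wrapping the page list as `for url in throttled(page_urls, 30.0): ...` would space requests out; the right interval is a guess, since the actual limits are not documented in this thread.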


@ckreibich
Owner Author

Duly noted, folks! Support for this is on the way.

@arcolife, Tor will help you little regarding query limits; in fact, given that it's easy to identify Tor exits it might actually make things worse for you. The only real help will be distributed clients, but you'll have to build that botnet yourself. :)

@arcolife

arcolife commented May 7, 2014

@ckreibich I see! 💩

Btw, I've been building a recommendation engine for research papers. It would be nice to have this feature, as it would add to the currently available sources. I'm willing to contribute! :)

@chendaniely

I can take on this issue. I need a little guidance and help with the existing code, though.
If I am understanding the problem:

You can get the link to the citing papers by accessing the url_citations attribute.
Eg:

querier.articles[0].attrs.get('url_citations')[0]

should return something like u'http://scholar.google.com/scholar?cites=5556531000720111691&as_sdt=2005&sciodt=0,5&hl=en'

And since we have a new search result, the goal is to parse this page into individual articles?
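
As a rough sketch of the first step, the citing-papers URL quoted above can be taken apart with the standard library to recover the `cites` value (which appears to be the cluster ID) and to build URLs for later result pages. This is Python 3, not the Python 2 shown in the thread. The function names are hypothetical, and the `start` paging parameter is an assumption about how Scholar paginates, not something scholar.py is confirmed to use.

```python
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

def cluster_id_from_citations_url(url):
    """Return the value of the 'cites' query parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get('cites')
    return values[0] if values else None

def paged_citations_url(url, start):
    """Return a copy of url with a 'start' offset set (assumed paging parameter)."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query['start'] = [str(start)]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

url = ('http://scholar.google.com/scholar'
       '?cites=5556531000720111691&as_sdt=2005&sciodt=0,5&hl=en')
cluster_id = cluster_id_from_citations_url(url)   # '5556531000720111691'
second_page = paged_citations_url(url, 10)
```

Each page URL built this way would still have to be fetched and parsed into individual articles, which is the part the question above is asking about.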

@rpgoldman

If it helps: this is how I stumbled on this issue: I was writing an article, and wanted to claim that the literature on topic X did not contain any article that addressed issue I.

Google Scholar had the right information to do this, but it was very painful to extract that information. I had to scroll through pages and pages of articles, moving from page to page interactively. And there was no way to check this claim for correctness over time. I.e., if I reran the query, I had no obvious way to check to see if the results were the same, or if new papers had appeared.

I was hoping to be able to automate this process at least somewhat.
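
One way the "check this claim over time" part might be automated, once the citing papers' titles have been collected by whatever means, is to store each run as a snapshot and diff it against the next one. This is only a sketch: the function names and the JSON file layout are hypothetical, and matching on normalized titles is a simplification that would miss papers whose titles change.

```python
import json

def normalize(title):
    """Lowercase and collapse whitespace so formatting changes do not look like new papers."""
    return ' '.join(title.lower().split())

def save_snapshot(path, titles):
    """Write the de-duplicated, normalized titles to a JSON file."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(sorted({normalize(t) for t in titles}), f, indent=2)

def load_snapshot(path):
    """Read a snapshot written by save_snapshot back into a set."""
    with open(path, encoding='utf-8') as f:
        return set(json.load(f))

def diff_snapshots(old, new):
    """Return (added, removed) as sorted lists of normalized titles."""
    return sorted(new - old), sorted(old - new)
```

Rerunning the query later and calling `diff_snapshots(load_snapshot(path), new_titles)` with a freshly normalized set would show which citing papers appeared or disappeared since the last run.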

@chendaniely

I can put up an ipython notebook with a working example of my extension module that implements this. @rpgoldman I'm pretty much in the same boat as you. Once I finish up implementing my extension I can see what would be the best way to fold the code in.

#10 seems to address the problem, but it's not merged, and I'm not 100% sure if it's doing what I want at the moment.

@marianormuro

Hi guys, is there any news on this issue? I'm about to implement the same thing. @chendaniely, did you implement something for this?

@chendaniely

Hey @marianormuro, sorry for the really late reply.
My implementation is really hacky, janky, and untested. You probably shouldn't use it for 'serious' work.

@pesho-ivanov

#83 seems to have solved the discussed issue.

7 participants