Skip to content

Add coords argument to pymc.set_data #5588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 24, 2022
Merged

Add coords argument to pymc.set_data #5588

merged 5 commits into from
Apr 24, 2022

Conversation

soma2000-lang
Copy link
Contributor

Fixes #5547

@soma2000-lang
Copy link
Contributor Author

@OriolAbril Please have a look into this

pymc/model.py Outdated
@@ -1728,7 +1728,7 @@ def point_logps(self, point=None, round_vals=2):
Model._context_class = Model


def set_data(new_data, model=None):
def set_data(new_data, coords,model=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change the signature

Suggested change
def set_data(new_data, coords,model=None):
def set_data(new_data, model=None, *, coords=None):

Also please make sure to run pre-commit run --all before uploading to avoid unnecessary CI runs due to formatting problems.

pymc/model.py Outdated
@@ -1766,7 +1766,7 @@ def set_data(new_data, model=None):

for variable_name, new_value in new_data.items():
model.set_data(variable_name, new_value)

return model.set_data(new_data, coords)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, you'll need to edit the previous line.

Also make sure to add a test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok i will do this changes at the earliest

@soma2000-lang soma2000-lang changed the title Aadd coords argument to pymc.set_data Add coords argument to pymc.set_data Mar 12, 2022
@soma2000-lang
Copy link
Contributor Author

@michaelosthege Sorry for the silly question but can you please tell specifically what test are you talking about

@ricardoV94
Copy link
Member

ricardoV94 commented Mar 12, 2022

@michaelosthege Sorry for the silly question but can you please tell specifically what test are you talking about

When you change the behavior of a function or fix a bug you should add a test that shows the new behavior is the expected one. This also ensures future changes to the codebase won't reintroduce the same bugs again.

@soma2000-lang
Copy link
Contributor Author

@ricardoV94 So should I write the test for this , in the tests folder of this repo??

@michaelosthege
Copy link
Member

@ricardoV94 So should I write the test for this , in the tests folder of this repo??

Use the search function (Ctrl+Shift+F in VSCode) to find occurrences of set_data(. Some of those will be in the test suite (probably test_model.py) and you should put the new test case right next to the existing ones.

@soma2000-lang
Copy link
Contributor Author

Ok

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @soma2000-lang. I'll try to be as detailed as possible in the review, I hope it helps. As always, don't hesitate to ask questions here and on discourse. We do not know what your background is nor how much experience you have using pymc, using python, using pytest... so we can only assume. We might be overly explicit or overly cryptic and can only improve that from interacting and if you ask questions and give detailed answers of what you have tried and what you don't understand.

This PR is adding new functionality to the PyMC library. Therefore, as you saw in the checklists of the PR template it needs to also add tests covering this new functionality. You can remove items from the checklist when they don't apply to your PR but you should only do so if you are sure they do not apply to your case. Otherwise you should keep them, especially when starting to contribute, they are thought to help both you in setting the PR and to help reviewers make sure we don't miss anything.

You can see for example in #4759 which I submitted a while ago how I removed some of the checklist points because that PR did changes to the documentation only and introduced no new functionality. Unlike here, the test check did not apply to that PR.

In this particular case, the test should be added somewhere in https://github.com/pymc-devs/pymc/blob/main/pymc/tests/test_data_container.py, where you will see there are already other tests checking the behaviour of both the set_data function and the set_data method. The first step is making sure you understand how both work and what is the change you are introducing. Once you do you should generate an example that doesn't work currently on pymc and will work on your PR thanks to your changes. If you have never used pymc.set_data let us know and we'll search some examples.

I will also leave a comment on the issue about collaborating and avoiding duplicated work, especially now that we are working on the next major release and there are also many prospective gsoc applicants contributing.

pymc/model.py Outdated
@@ -1765,8 +1765,8 @@ def set_data(new_data, model=None):
model = modelcontext(model)

for variable_name, new_value in new_data.items():
model.set_data(variable_name, new_value)

model.set_data(variable_name, new_value,coords)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
model.set_data(variable_name, new_value,coords)
model.set_data(variable_name, new_value, coords=coords)

Here I think it would be best to use the argument as keyword instead of positional, but that should not change the output.

pymc/model.py Outdated
model.set_data(variable_name, new_value)

model.set_data(variable_name, new_value,coords)
"""return model.set_data(new_data, coords)"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should be removed. This function modifies the data stored in the model class and does not have and should not have any return function. If you look at the code of pymc.Model.set_data you'll see that it doesn't have any return value either.

Adding this as a string makes the code run because this line is now ignored instead of being executed, so it should be removed to avoid adding code that is never executed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would definitely do the required changes and keep in mind what ever you have said

@OriolAbril
Copy link
Member

Do you need any help with tests @soma2000-lang ?

@soma2000-lang
Copy link
Contributor Author

soma2000-lang commented Mar 27, 2022

@OriolAbril
Actually yes
Sorry for the very late reply But I was going through unit testing in general
: But I am kinda confused how to do test for large codespace like this
Can you give me some references or examples that would be great

@OriolAbril
Copy link
Member

Have you gone over the tests in test_data_container.py file that I linked in the previous comment? Specifically the ones using set_data (in either two of its forms)? Do you understand what they are doing?

@soma2000-lang
Copy link
Contributor Author

soma2000-lang commented Mar 29, 2022

@OriolAbril I had went over the tests ,its bascically testing with some sample data and checking it with against some cases

@OriolAbril
Copy link
Member

Do you think you can use these tests as a template and write a new one that covers the use of coords in pymc.set_data?

@soma2000-lang
Copy link
Contributor Author

@OriolAbril Yes I will try doing that.

@OriolAbril
Copy link
Member

Any news? Have you been able to try that? If so where are you stuck?

I am happy to help but can't do anything if I don't know where and how I'm supposed to do it. I'd like to use this in pymc-devs/pymc-examples#285 and will probably take over in ~2 weeks if still pending.

@soma2000-lang
Copy link
Contributor Author

soma2000-lang commented Apr 11, 2022 via email

@soma2000-lang
Copy link
Contributor Author

@OriolAbril

@OriolAbril
Copy link
Member

Sorry for the inconvinience due to ongoing University examination I was a bit busy and unable to
work on it but I will resume soon

Great, looking forward to the tests

@codecov
Copy link

codecov bot commented Apr 23, 2022

Codecov Report

Merging #5588 (654418b) into main (162ad6c) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5588      +/-   ##
==========================================
+ Coverage   88.90%   88.92%   +0.01%     
==========================================
  Files          75       75              
  Lines       13738    13738              
==========================================
+ Hits        12214    12216       +2     
+ Misses       1524     1522       -2     
Impacted Files Coverage Δ
pymc/model.py 86.63% <100.00%> (+0.27%) ⬆️

@michaelosthege michaelosthege merged commit bae5087 into pymc-devs:main Apr 24, 2022
@OriolAbril
Copy link
Member

Thanks for getting this started @soma2000-lang, I added the test to get the PR merged as having this will make my life much easier. Happy to go over the test if something were not clear or help you find other issues to work on.

fonnesbeck added a commit that referenced this pull request May 6, 2022
* Refactor get_tau_sigma to handle lists

* Fixed syntax error

* Black formatted

* Change assertions to ValueError raises

* Prior predictions constant data (#5723)

* support return constant_data with prior pred

* pre-commit format

* added test

* code review

* Fix issue probably-meant-fstring found at https://codereview.doctor (#5726)

* Add coords argument to pymc.set_data (#5588)


Co-authored-by: Oriol (ZBook) <[email protected]>

* remove MultinomialRV override

* ⬆️ UPGRADE: Autoupdate pre-commit config

* Group GaussianRandomWalk tests in single class

* Infer steps from shape in GaussianRandomWalk

* Unpin upper limit on numpydoc

Closes #5401

* ⬆️ UPGRADE: Autoupdate pre-commit config

* Standardize docstrings of input dists arguments and add warning about cloning

* Remove unnecessary tag in Simulator logp

* Replace deprecated tag.ignore_logprob

* Group compile_pymc tests in own class

* Remove remaining uses of default_updates in codebase

* Remove redundant/wrong docstrings from GaussianRandomWalk logp

* Add moment to GaussianRandomWalk and fix mu/sigma broadcasting bug

* Do not create temporary SymbolicDistribution just to retrieve number of RNGs needed

Reordered methods for consistency

* Move SymbolicDistribution docstring to body of class

* Refactor AR distribution

* Deprecates AR1 distribution
* Implements random and moment methods
* Batching works on the left as with other distributions

* Rename `pandas_to_array` to `convert_observed_data`

* Obtain step information from dims and observed

* Make AR steps extend shape beyond initial_dist

This is consistent with the meaning of steps in the GaussianRandomWalk and translates directly to the number of scan steps taken

Co-authored-by: TimOliverMaier <[email protected]>
Co-authored-by: code-review-doctor <[email protected]>
Co-authored-by: Somasree Majumder <[email protected]>
Co-authored-by: Oriol (ZBook) <[email protected]>
Co-authored-by: danhphan <[email protected]>
Co-authored-by: twiecki <[email protected]>
Co-authored-by: Ricardo <[email protected]>
Co-authored-by: Ravin Kumar <[email protected]>
Co-authored-by: Michael Osthege <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add coords argument to pymc.set_data
5 participants