Skip to content

Improve syntax violations interface #430

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

dekellum
Copy link
Contributor

@dekellum dekellum commented Jan 22, 2018

This makes programmatic use of syntax violations more practical while further centralizing this concern in the parser, which should be a maintenance win. An Rc<Fn> is introduced internally to satisfy borrow rules while improving ergonomics. The presumed minimal additional cost of this is only incurred if the syntax violation feature is used. Since existing public interface is maintained and changes are internal, I believe this meets requirements for backward compatibility, and testing agrees, but reviewers should definitely cover that aspect.

Summary of changes

  • New SyntaxViolation enum with descriptions the same as the prior violation static strings. This enables programmatic use of these violations by variant, e.g. ignoring some, warning on others, etc. Enum implementation uses the same macro expansion pattern as was used for ParseError.

  • New syntax_violation_callback signature in ParseOptions and Parser.

  • Deprecated log_syntax_violation and re-implemented as adapter to new callback, passing the same violation descriptions for backward compatibility.

  • Unit tests for old log_syntax_violation compatibility, and new syntax_violation_callback including a tested doc example.

  • Includes rustdoc for old and new interfaces including example usage of new callback, as requested in Add examples to ParseOptions #308.

Below items edited in light of discussion:

  • Improved ergonomics for ParseOptions and syntax_violation_callback. ParseOptions::parse now takes &self, so ParseOptions can still be used repeatedly. The new syntax_violation_callback is passed Fn by value and can be passed inline in user code.

  • Potential breaking changes: ParseOptions is no longer Copy, and ParseOptions::parse takes &self (was self, with Copy).


This change is Reviewable

* New SyntaxViolation enum with description the same as prior strings
* Replaced Fn signature in ParseOptions and Parser
* Adapter for log_syntax_violation to be backward compatible
* Improved ergonomics for use of closure, SyntaxViolation
@dekellum
Copy link
Contributor Author

dekellum commented Feb 2, 2018

Any chance I could get a review on this? @SimonSapin?

@SimonSapin
Copy link
Member

Sorry for the delay. I haven’t looked at all the details yet but one thing I notice is that removing ParseOptions: Copy is a breaking change. We’ve accumulated enough desired breaking changes that I’d like to batch them into a 2.0 version of the url crate at some point, but this is effectively blocked on servo/servo#19825.

@dekellum
Copy link
Contributor Author

dekellum commented Feb 2, 2018

Thanks. I think I understand your point, in that removing Copy (keeping Clone) can, hypothetically, lead to breaking consumer code that happens to rely on implicit copy. Please note however that this did not break or require changing any internal, test, or example usage of ParseOptions. Particularly given that ParseOptions is not commonly used, and was not reusable (in current release, prior to this PR) it seems unlikely that consumer code would be copying it?

The pendulum of concern for breaking changes seems to have swung fairly far from center with Rust libs in general of late. I think I may have just missed that memo.

@SimonSapin
Copy link
Member

CC @nox, for opinions about SemVer

@nox
Copy link
Contributor

nox commented Feb 2, 2018

Definitely a breaking change.

@nox
Copy link
Contributor

nox commented Feb 2, 2018

You could do this instead:

pub trait SyntaxViolationSink {
    fn syntax_violation(&self, violation: SyntaxViolation);
}

impl SyntaxViolationSink for Fn(&'static str) {
    fn syntax_violation(&self, violation: SyntaxViolation) {
        self(violation.description())
    }
}

And then make the methods be:

pub fn log_syntax_violation(self, new: Option<&'a Fn(&'static str)>) -> Self {
    self.syntax_violation_callback(new)
}

pub fn syntax_violation_sink<F>(mut self, new: Option<&'a S>) -> Self
where
    S: SyntaxViolationSink
{}

Btw @SimonSapin, why is this an Fn and not FnMut?

@nox
Copy link
Contributor

nox commented Feb 2, 2018

Oh right silly me, if it were an FnMut it wouldn't be Copy anymore anyway.

@nox
Copy link
Contributor

nox commented Feb 2, 2018

And in my example of code, there is no reason to not use a trait object in syntax_violation_sink:

pub fn syntax_violation_sink<F>(mut self, new: Option<&'a SyntaxViolationSink>) -> Self {}

@nox
Copy link
Contributor

nox commented Feb 2, 2018

And finally, Copy is definitely a breaking change because ParseOptions::parse take self. You wouldn't be able to parse 2 URLs without cloning anymore.

@dekellum
Copy link
Contributor Author

dekellum commented Feb 2, 2018

Thanks @nox, very educational. I tried to convert from Fn to FnMut as part of this. I found it difficult in combination with the builder pattern, where ParseOptions should not mutate. Or do you have a suggestion surviving your own scrutiny? I don't fully follow your thoughts.

Regarding your last point. As part of this PR, I also changed parse to take &self so that it can be reused and because that is the more recommended builder pattern, I believe, made possible by Rc<Fn> replacing the &'a Fn. But your comment also made me realize (confirmed by back-porting my tests, cfe75b9) that Copy made ParseOptions reusable previously, and thus this PR just preserves that. I will fix that in my summary.

Finally, I agree that this is breaking, for hypothetical edge cases, but does it require a major release? RFC 1105 seems to offer pragmatism, though this case isn't specifically addressed:

https://github.com/rust-lang/rfcs/blob/master/text/1105-api-evolution.md

all major changes are breaking, but not all breaking changes are major.

@nox
Copy link
Contributor

nox commented Feb 2, 2018

The question about FnMut was just me digressing, as you discovered the structure cannot be Copy if it wants to use an FnMut, so it cannot be done.

There is a path to avoid the breaking change entirely, so it can definitely be a minor bump.

As for making parse take &self, that doesn't solve the issue that someone could be passing around ParseOptions by value in their codebase and that breaking change would break their code if they reuse it anywhere afterwards.

@dekellum
Copy link
Contributor Author

dekellum commented Feb 5, 2018

Thanks @nox and @SimonSapin. I did find a path forward, resubmitted as #433, which avoids these breaking changes. It doesn't allow for a simple closure wrapping old log Fn to a new callback, but it wasn't as bad as I originally imagined (and what led me to the Rc approach and removing Copy). I hope with the breaking change issue resolved, you will be able to review any other necessary aspects.

@dekellum dekellum closed this Feb 5, 2018
bors-servo pushed a commit that referenced this pull request Feb 16, 2018
Improve syntax violations interface and preserve Copy

This is an alternative implementation of #430 which avoids introducing an `Rc<Fn>` and preserves `Copy` for `ParseOptions`.   Thus, based on my current understanding with the prior review, it is completely backward compatible and with no breaking changes.  Its a little less elegant in that, without the `Rc<Fn>`, I can not simple wrap the deprecated log_syntax_violation Fn with an adapting closure.  But by moving the two possible function references into a new enum `ViolationFn` it is reasonably well contained.  The interface for users is effectively the same, so I think the backward compatibility advantage makes this preferable to #430.  Updated summary:

### Summary of changes

This makes programmatic use of syntax violations more practical while further centralizing this concern in the parser, which should be a maintenance win.

* New `SyntaxViolation` enum with descriptions the same as the prior violation static strings.  This enables programmatic use of these violations by variant, e.g. ignoring some, warning on others, providing translated descriptions, etc. Enum implementation uses the same macro expansion pattern as was used for `ParseError`.

* New `syntax_violation_callback` signature in `ParseOptions` and `Parser`.

* Deprecated `log_syntax_violation` and re-implemented, passing the same violation descriptions for backward compatibility.

* Unit tests for old `log_syntax_violation` compatibility, and new `syntax_violation_callback`

* Includes rustdoc for old and new interfaces including tested example usage of new callback, as requested in #308.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/433)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants