Allow overriding module name for uv build backend #11884
Merged: konstin merged 10 commits into astral-sh:main from csachs:feature/build-backend-module-name on Mar 7, 2025
Commits:
f5dd0f9 Allow overriding module name for uv build backend (csachs)
593edbb Add validation, docs and test. (konstin)
24b8c8a Review (konstin)
f533804 Reset Cargo.lock (konstin)
4cebea8 Update lockfile (konstin)
98edd36 Update snapshots (konstin)
4efa388 Update snapshots (konstin)
ee96f0b Rebase onto main (konstin)
7bdf701 serde is not optional in uv-pypi-types (konstin)
2ca4cbf serde is not optional in uv-pypi-types (konstin)
@@ -0,0 +1,160 @@

```rust
use std::fmt::Display;
use std::str::FromStr;
use thiserror::Error;

/// Simplified Python identifier.
///
/// We don't match Python's identifier rules
/// (<https://docs.python.org/3.13/reference/lexical_analysis.html#identifiers>) exactly
/// (we just use Rust's `is_alphabetic`) and we don't convert to NFKC, but it's good enough
/// for our validation purposes.
#[derive(Debug, Clone, Hash, PartialEq, Eq, PartialOrd, Ord)]
pub struct Identifier(Box<str>);

#[derive(Debug, Clone, Error)]
pub enum IdentifierParseError {
    #[error("An identifier must not be empty")]
    Empty,
    #[error(
        "Invalid first character `{first}` for identifier `{identifier}`, expected an underscore or an alphabetic character"
    )]
    InvalidFirstChar { first: char, identifier: Box<str> },
    #[error(
        "Invalid character `{invalid_char}` at position {pos} for identifier `{identifier}`, \
        expected an underscore or an alphanumeric character"
    )]
    InvalidChar {
        pos: usize,
        invalid_char: char,
        identifier: Box<str>,
    },
}

impl Identifier {
    pub fn new(identifier: impl Into<Box<str>>) -> Result<Self, IdentifierParseError> {
        let identifier = identifier.into();
        let mut chars = identifier.chars().enumerate();
        let (_, first_char) = chars.next().ok_or(IdentifierParseError::Empty)?;
        if first_char != '_' && !first_char.is_alphabetic() {
            return Err(IdentifierParseError::InvalidFirstChar {
                first: first_char,
                identifier,
            });
        }

        for (pos, current_char) in chars {
            if current_char != '_' && !current_char.is_alphanumeric() {
                return Err(IdentifierParseError::InvalidChar {
                    // Make the position 1-indexed
                    pos: pos + 1,
                    invalid_char: current_char,
                    identifier,
                });
            }
        }

        Ok(Self(identifier))
    }
}

impl FromStr for Identifier {
    type Err = IdentifierParseError;

    fn from_str(identifier: &str) -> Result<Self, Self::Err> {
        Self::new(identifier.to_string())
    }
}

impl Display for Identifier {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl AsRef<str> for Identifier {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

impl<'de> serde::de::Deserialize<'de> for Identifier {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::de::Deserializer<'de>,
    {
        let s = String::deserialize(deserializer)?;
        Identifier::from_str(&s).map_err(serde::de::Error::custom)
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use insta::assert_snapshot;

    #[test]
    fn valid() {
        let valid_ids = vec![
            "abc",
            "_abc",
            "a_bc",
            "a123",
            "snake_case",
            "camelCase",
            "PascalCase",
            // A single character is valid
            "_",
            "a",
            // Unicode
            "α",
            "férrîs",
            "안녕하세요",
        ];

        for valid_id in valid_ids {
            assert!(Identifier::from_str(valid_id).is_ok(), "{}", valid_id);
        }
    }

    #[test]
    fn empty() {
        assert_snapshot!(Identifier::from_str("").unwrap_err(), @"An identifier must not be empty");
    }

    #[test]
    fn invalid_first_char() {
        assert_snapshot!(
            Identifier::from_str("1foo").unwrap_err(),
            @"Invalid first character `1` for identifier `1foo`, expected an underscore or an alphabetic character"
        );
        assert_snapshot!(
            Identifier::from_str("$foo").unwrap_err(),
            @"Invalid first character `$` for identifier `$foo`, expected an underscore or an alphabetic character"
        );
        assert_snapshot!(
            Identifier::from_str(".foo").unwrap_err(),
            @"Invalid first character `.` for identifier `.foo`, expected an underscore or an alphabetic character"
        );
    }

    #[test]
    fn invalid_char() {
        // A dot in module names equals a path separator, which is a separate problem.
        assert_snapshot!(
            Identifier::from_str("foo.bar").unwrap_err(),
            @"Invalid character `.` at position 4 for identifier `foo.bar`, expected an underscore or an alphanumeric character"
        );
        assert_snapshot!(
            Identifier::from_str("foo-bar").unwrap_err(),
            @"Invalid character `-` at position 4 for identifier `foo-bar`, expected an underscore or an alphanumeric character"
        );
        assert_snapshot!(
            Identifier::from_str("foo_bar$").unwrap_err(),
            @"Invalid character `$` at position 8 for identifier `foo_bar$`, expected an underscore or an alphanumeric character"
        );
        assert_snapshot!(
            Identifier::from_str("foo🦀bar").unwrap_err(),
            @"Invalid character `🦀` at position 4 for identifier `foo🦀bar`, expected an underscore or an alphanumeric character"
        );
    }
}
```
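One detail of the diff worth spelling out: because the loop iterates over `chars().enumerate()`, the position in `InvalidChar` counts characters, not bytes, and is shifted to be 1-indexed. The following is a minimal std-only sketch of that behavior. The helper name `first_invalid_char` is hypothetical and not part of the PR; it only mirrors the loop over the characters after the first one.

```rust
/// Hypothetical stand-in for the loop in `Identifier::new`: returns the 1-indexed
/// character position and the value of the first character (after the leading one)
/// that is neither `_` nor alphanumeric.
fn first_invalid_char(identifier: &str) -> Option<(usize, char)> {
    identifier
        .chars()
        .enumerate()
        // The first character is validated separately (alphabetic or `_`).
        .skip(1)
        .find(|(_, c)| *c != '_' && !c.is_alphanumeric())
        // Make the position 1-indexed, as the diff does.
        .map(|(pos, c)| (pos + 1, c))
}
```

For an identifier with multi-byte characters before the offending one, the reported position therefore differs from the byte offset that `str::find` would return, which matches what a user counts when reading the name.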
What goes wrong if we say an identifier is valid but Python's rules say it isn't?
Python's and Rust's documentation reference different parts of the Unicode standard for what's an allowed letter, but I strongly suspect that no one will actually run into a case where they disagree.
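To make the discussion concrete, here is a small std-only sketch of one place where the two rule sets can diverge. It assumes a hypothetical helper `is_simplified_identifier` that restates the check from the diff as a boolean. Rust's `char::is_alphanumeric` also accepts numeric characters such as `¾` (general category No), whereas Python's identifier grammar is defined via `XID_Start`/`XID_Continue`, which to my understanding does not include that category.

```rust
/// Hypothetical boolean restatement of the check in the diff: the first character
/// must be `_` or alphabetic, every following character `_` or alphanumeric.
fn is_simplified_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) if first == '_' || first.is_alphabetic() => {}
        // Empty string or invalid first character.
        _ => return false,
    }
    chars.all(|c| c == '_' || c.is_alphanumeric())
}

/// Example of an input the simplified check accepts although Python is expected
/// to reject it as an identifier (assumption: `"a¾".isidentifier()` is `False`).
fn accepts_vulgar_fraction() -> bool {
    is_simplified_identifier("a\u{be}")
}
```

Such names are unlikely in practice, which is consistent with the reply above: the check is a validation aid, not a full implementation of Python's lexical rules.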