proposal: Vendor specification and experimental repository fetch code #13517
Comments
Responses welcome. |
@kardianos thanks for including me here. /cc @davecheney |
For anyone reading this I want to provide some background material. While I develop on Glide this is less about my opinions (in this comment) and more I'd like to make sure to add contextually relevant information.
I do ask that anyone who jumps into the discussion on this with opinions take a little time to come up to speed on this space. Outside of Go the specs and tooling are a fairly mature topic. This is also one of those topics with an impact on developer experience so it's worth looking at that as well. While I have my own opinions, which I will detail soon, if anyone has questions or pointers aside from my opinions I'm happy to inform. I'd like anyone who wants to discuss the topic to be well informed on the space. |
Yes, this would be fantastic, currently there are many different file formats, to list a few: All the examples are remarkably similar. To me it seems like an import path and revision hash/tag are all that are necessary, although others probably would like something more complicated. This is why I opened #13483, because for me getting a dependency at a specified rev using standard Go tools is all I want. The capability to easily create the simple Godeps (gpm) file is almost in the go/build and vcs packages already. What we still need are:
|
I am +1 wrt this. I've spent some time basically re-implementing parts of |
It would be great for tools to use the same vendor spec. I thought that was the goal of the vendor-spec work. I am concerned that tools are not already using it. We've said that's what we want tools to use, it's there for using, and yet they are inventing their own. Why? Perhaps vendor-spec is not good enough? |
@kardianos, there's not enough detail here. You wrote "I propose specifying a single file format that will describe packages sourced outside the project repository." That's the vendor-spec, right? Yes, we think there should be just one, but we really want the tool authors to converge semi-organically rather than mandate something. We've done a bad job here at mandating in the past (basically what I wrote in my last comment). But then you wrote "I also propose adding a packge to the golang.org/x/exp repository that discovers, reads, and optionally downloads third party packages." I don't know what this means. More detail needed. |
First, I'm glad we're entertaining this conversation, and thanks to @kardianos for putting in a bunch of work on this. I have a number of concerns over the data structure outlined here. I believe it is insufficient for our needs. Let me explain.
These are just a few of my concerns. I really want to see something that allows for:
To illustrate the needs I've collected a number of use cases that need to be satisfied by any spec. I understand that a number of people come from C/C++ here. Other languages, where many Go developers are coming from, have already solved many of these problems. I wrote up how they handle a number of common cases. Building something with a similar experience or one they can understand with that background would be useful. Note, in full disclosure I worked on a competing spec attempting to solve these use cases. This data structure is what Glide is roughly moving to and is influenced by our work there. |
@rsc, yes, this is effectively the goal of vendor-spec. As you noted, I haven't seen convergence on a single spec for vendor packages. Perhaps another way to phrase this proposal is "give tentative blessing to a format from the go team and ask for feedback from tool authors". I'm completely aware this is putting the cart before the horse. I've asked for feedback in the past on why tool authors couldn't adopt it. I've heard:
To address the second point, I propose adding the (probably poorly named) "tree" parameter that says, everything under this point is also included.
It could be the vendor-spec isn't good enough; I just don't know in which ways it is deficient. At this point I'm not sure if the existing variety is due to a lack of consensus or just lack of caring to change existing and working tools. Thus if it was proposed that a command like "go get" read and used the vendor-spec file (not 100% a good idea), then I think many more people would care about having and using a common format. As it is, it is a nuisance when exploring or auditing many different go packages, but not a complete show stopper; they are all machine readable, they all contain the same information, and many large projects have Makefiles that hide which vendor tool they use to some degree.
RE /x/exp/ package: You're correct, more detail would be needed. Mainly here to say, this proposal would be of two parts, a spec and a package that handles the spec. What that API looks like would need to be defined. I would love to add this if the fate of this proposal gets to that point.
I suppose what I could try to find out next is why vendor tool authors are not using this:
I think @freeformz is open to using something like this. I'll try to ask around. |
Agree. For distributed vcs the
The vendor-spec defines the content as everything that is or should be in a single level "vendor" folder. I think that should be sufficient for a lock file, correct?
I'm only interested in specifying what we know as the lock file. I think the "version >= 1.2.3" would be fine in a different config file.
Go get handles this with probing. I'm also fine adding a well known optional field that specifies the vcs type ("git", "ssh+git", "hg"). I don't see this as a show stopper.
I'm not sure I understand your concern. If you have or want a package in the vendor folder, have the tool write down the package path and revision in the vendor-spec file and it will be captured. Could you help me see what I might be missing? To be concrete, in
Sure, I would choose to use a CLI command in
This is tool specific, not spec specific. I'm working on adding this to
I'm not sure what you mean by renaming. Origin? Multiple VCS can be handled just fine, that's a tool issue. Private repos is worth talking about, but it might be handled with a stored ssh key and saying, "use ssh"? But again, I don't see a conflict with the given spec. To make sure we are talking about the same thing, I will copy and paste in the glide.lock file for glide and the vendor.json file for govendor: glide glide.lock:
govendor vendor.json
We are really talking about lock files, not a package specification. In other words, I don't think your pkg spec and the vendor-spec are competing, they are doing completely different things. Your glide lock file is pretty much exactly what the vendor-spec is trying to do as far as I can tell. There are corner cases to discuss, but every tool that I've seen has something like a lock file that contains an import path and a revision (a hash if using a dvcs). Perhaps we can't agree on all the other meta data, but maybe we can at least write those two bit of info, and maybe a few others into the same machine format. |
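As a rough illustration of the common denominator described above (an import path plus a revision in one machine-readable format), here is a minimal Go sketch. The type names, the `ParseLock` helper, and the JSON keys `package`, `path`, and `revision` are assumptions made for this example; they are not quoted from the vendor-spec or from any tool's lock file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Package is one vendored package entry. The field names are assumptions
// chosen for illustration only.
type Package struct {
	Path     string `json:"path"`     // import path
	Revision string `json:"revision"` // VCS revision (a hash when using a DVCS)
}

// LockFile holds only the two pieces of information every tool records.
type LockFile struct {
	Package []Package `json:"package"`
}

// ParseLock decodes a lock file and rejects entries missing either field.
func ParseLock(data []byte) (LockFile, error) {
	var lf LockFile
	if err := json.Unmarshal(data, &lf); err != nil {
		return LockFile{}, err
	}
	for _, p := range lf.Package {
		if p.Path == "" || p.Revision == "" {
			return LockFile{}, fmt.Errorf("entry %+v needs both path and revision", p)
		}
	}
	return lf, nil
}

func main() {
	data := []byte(`{"package":[{"path":"example.com/a/b","revision":"0123abcd"}]}`)
	lf, err := ParseLock(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range lf.Package {
		fmt.Printf("%s @ %s\n", p.Path, p.Revision)
	}
}
```

Only the standard library is used, which matches the constraint mentioned elsewhere in this thread that the manifest should be readable with the Go standard library.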
@kardianos thank you for clearing some things up. I think it would be useful to clarify that you're attempting to create a lock file rather than create a package specification. The current title says, "Vendor specification". With that in mind...
Note, I'm asking on 2 and 3 because they do not fit into the use cases I've previously worked out. Trying to understand the details. My issues at a high level, and I'm sorry I have to be so brief as I have to go for now, are...
Is the goal to solve what's needed for package management for the majority of developers, or is it to do one small slice of the puzzle that others still need to build on? |
End Goal: A tool provided with the vendor spec file should be able to fetch all packages at a given revision from their original repository (if available). This would enable standard user tooling for fetching remote packages at a given revision. This also enables machine analysis of dependencies across the board, such as looking for vulnerable revisions (dvcs hashes) and mapping dependency usages.
...
Revision Time: I've worked on projects that are 15+ years old. Code bases sometimes lose touch with original source and sometimes I just want to know what year or decade it is from.
Comment: JSON sucks, but is simple and well supported. If you want to write down a comment, with a tool or by hand, put that human note there. By itself JSON doesn't support // comments, so turn them into fields. As per the spec, all unrecognized fields are required to be persisted by other tools modifying a file. In other words, extensions are expected.
...
While I don't want to quibble with words, it is called the "vendor specification" because it is specifying vendor package revisions. It isn't called a package specification because it knows nothing about the package it is used in. You are correct. The vendor-spec file lives in the vendor folder and talks about the vendor packages. If you want a package meta-data file it should live in a package directory and tell you about the package folder it is in. The goal is to write down what revisions are in the vendor folder. What you call a lock file. If the source of the package isn't from the "go get" location, then it provides the |
FWIW: I have not added support for the vendor-spec to godep because I've had to work on other things instead and it hasn't been a priority (i.e. users aren't asking for it). I do want to support the vendor spec but instead have been working on replacing our use of |
@kardianos another problem that would be useful to address would be to move the vendor-spec outside the
@freeformz it sounds like you'd like to take the concept of Godep and move it into the Go toolchain. While Godep has been around for a while and been able to fill in a number of use cases, there are numerous use cases people have been asking for that cannot be easily implemented in its flow. I would prefer to see something that enables those as well.
Mapping dependencies well is a problem in this setup. For example, a lock file should really only be at the top level and not throughout the tree. Dependencies shouldn't be in multiple
If a vendor-spec is present throughout a tree there can be cases with many instances of a common dependency at different versions all mapped by commit id. This doesn't allow automated tooling to work out the best version to use or map a tree. This can be a problem in practice. For example, if you look at kubernetes the same dependency can be referenced many times in packages and sub-packages, all at different commit ids. Resolving versions becomes difficult. In tool chains for other languages a lock file isn't used to figure out or map the tree. Instead, that is done with a config file that knows more (e.g., a semantic version range). |
How would tools used by |
Responding to some of the concerns/issues raised above:
_edit:_ uh, oh; I've glanced over the changes since Jun 12, and my initial impression is that I think I wouldn't be able to build my tool with the spec as it looks today, unfortunately :(
Personally, as of now I'm not really convinced this is actually needed/useful. What if the repo owner changes the VCS used? And even if not, I'm not quite sure why one can't autodetect it the same way the go tool does. But even if I'm wrong in this regard, the vendor-spec specifically allows adding any custom fields to the JSON file, so I don't see why a tool couldn't just go on and do that?
Uh, that's exactly what I'm doing with https://github.com/zpas-lab/vendo using vendor-spec; thus I believe it is totally easy to do with vendor-spec; did you have some specific trouble with that, could you elaborate?
One recent event that I believe is a perfect illustration of how RevisionTime is awesome is the migration of code.google.com projects to github. Given that it often involved migration from hg to git as a side effect, you're effectively losing the information the hash ID gave you (that is, the Revision field becomes useless), but the RevisionTime should stay perfectly relevant. Thus giving a trivial way to find a corresponding commit in the new (github) repo, and also to check what new commits were introduced since last time you checked/pinned. |
@kardianos If I could snap my fingers and vendor-spec would be supported by glock I would do it, but glock has been stable / unchanged for a while, I haven't had a need to do it, and it seems like a lot of work. But also, I think that the manifest format is not all that differs between tools - for example, glock supports commands, it only supports lockfiles for the end user's application (not for intermediate libraries), and it doesn't vendor dependencies. Seems to me that the vendoring zeitgeist ended up at a tool that is nothing like glock, so I didn't see much of a point in trying to keep up. I'm looking forward to a tool that finally gains widespread adoption though! Seems like "gb" is in the best spot for that? |
I have several issues with the original proposal, @kardianos. Two are architectural, and one is polemical.
But not even the experiences are relayed. This seems odd to me for one very clear reason: Versions are not an attribute of packages, they are an attribute of repositories. Therefore, at even the most rudimentary, vendor spec suffers from what is called "level confusion" -- the assigning of attributes to the wrong "level" of abstraction. This is clearly evidenced by the fact that the proposed file format would allow setting different versions to two co-located packages. Doing so would clearly allow unintended side effects and difficult state resolution.
Given that your proposal does nothing to address this case, and that you are conflating "tool" and "file format", this seems to me to be more FUD than useful commentary. Not to mention that at least one of the tools that you point out in your proposal has handled that situation elegantly since its inception.
|
There was one comment I wanted to make in this thread, and luckily @technosophos has just made an identical comment just now. So I will quote it and say I agree with it:
At the very least, I would want to hear good arguments for doing it another way. But assigning versions to repo roots seems like the most natural and effective solution. |
@mattfarina RE Vendor file location: I'm fine either way. I think @rsc wanted to keep all the vendor stuff in the vendor folder, including the vendor file. If you want a gitignore line that ignores the content of the vendor folder but not vendor/vendor.json file, use @mattfarina and @robfig RE commands: I'm open to suggestions here but how @robfig RE existing tools: Yeah, I understand. It might be easier to let projects move off of it if they choose to. I hear @technosophos and @shurcooL RE recording at the package level: I agree I'm the odd man out on this and as such might lose :). But I will try to explain my rationale. Let me break this down into two parts:
I have a package that vendors files from vitess. Now vitess is a large repository and I only want two packages out of the entire thing. For this I would like to specify which two packages I want and leave the rest behind. For this I need (1). Point (2) is mainly due to this: you have a stable package There are times where you want to note an entire repository or sub-tree for either C files, resources, or maybe that's just how your tool works. That is why I am proposing adding the I use property (1) all the time and like selecting out package from a repo. I would like to retain (2), but I do understand objections to it. I would be interested in other's opinions on this too (try out @technosophos RE origin / std library patches: This is not FUD. This is an example from the
This allows representing the package import path is @technosophos RE existing formats: Early this year the core developers stipulated that the manifest file should be able to be reasonably read with the go std library. Either we create an ad-hoc format, or we use something kinda gross, but well supported like JSON or XML. The fields that the glide.lock file has by and large seem fine. I'm not sure if relevant, but the vendor-spec didn't come from govendor, govendor came from the vendor-spec. So the glide.lock file looks fine, but not yaml. That is a huge format to support and isn't in the std library. @akavel RE changes to spec: I'm glad it is useful. Yes, before it noted down the relative path from the vendor file, so you could place it many places and have it resolve. It has been locked down some to just the vendor folder. The current method is slightly simpler but more restrictive. I'd love to hear other's thoughts on the matter. Relevant issue: kardianos/vendor-spec#39 @akavel RE additions: That was added early on as a suggestion. So yes, that is encouraged. |
@kardianos a few things and i'll break them into bullets for easier reference:
@kardianos @rsc Something occurred to me while reviewing this material. As I work on Glide, watch requirements come in there, and discuss package management with those inside and outside the Go community, I realized that this "spec" isn't born out of experience in the community. We're still learning what people need and are adapting. The I would suggest waiting for the |
@mattfarina RE use cases:
@mattfarina RE go:generate tools: I'm all ears. What would you propose instead? Are you wanting a bin folder in the vendor folder or something similar? @mattfarina Re experience: Most of us have many years of experience vendoring of some type in go. We are transitioning to the vendor folder now. If you want a varied experience and use cases, let's talk to people now, not later. @goinggo Bill, thoughts on this? You interact with many more people than I do. |
@kardianos thanks for sharing your use cases. That helps me better understand where you're coming from. I have some comments on there.
I don't have a proposal for
Most programming language ecosystems have settled on one package management tool. Rust has Cargo. PHP has Composer (formerly Pear was used). Node.js has npm. You get the idea. The Go ecosystem has become quite divided. The tooling supplied by Godep (a longtime solution) was insufficient for many. Now the ecosystem is fractured and a wide array of solutions are being worked on in order to meet all the use cases people have. If the vendoring setup you talked about worked for the masses, GB and Glide would not be gaining the following they are and there would have been no call for them in the first place. Those are just two of the many tools being created. Trying to push this solution in without adapting to meet the needed use cases for many could cause further community issues. Package management has become a hot topic. Many of the elements of the
@kardianos Could you possibly expand on how this spec would be needed for my use cases? Then I could see how it would fit into the broader package management situation. |
@mattfarina FWIW: I have no intention of wanting to take the concepts of godep and move them into the Go toolchain. I was stating that I can no longer rely on the go tool chain and am +1 a library so that I don't have to maintain my own internal versions of the tools as libraries. |
I share a bunch of those use cases with you, but I do not want to support Missing from that list are "I'd like to vendor a related tool (some other I also need to know a little about the user's Go environment (mainly ATM) Beyond that and generally speaking though my main use case is I want a tool My main response to this thread was a "+1" for a common library for tools On Tue, Dec 29, 2015 at 3:46 PM, Matt Farina [email protected]
Edward Muller |
BTW: WRT version ranges and updates ... If two packages (a+b) rely on the same separate package (p) and p makes a new release, you need to do integration testing when you upgrade your copy of p. Anything else is just hoping it will work. When doing ruby in the past (and other languages as well), I hated having to update a dependency because it didn't really matter what the released version number was in the end. Yes, the version number gives you a clue / hint wrt compatibility, but that's it. Because of that I'm +1 wrt version numbers (semver specifically), but in the end it just doesn't matter that package a uses version 2.4.1 of p and package b uses version 2.4.2 of p. 2.4.5 of p was released and it needs to be re-validated to work with both the version of a and b that you have. I've had to patch/upgrade either package a and/or b to work with the new p (which for argument's sake fixes a bug that I'm experiencing) more times than I care to reflect back on. Also, just because p released 2.4.5 doesn't mean I need to upgrade any code I have using package p to the new version. I may need to (because of the aforementioned bugfix example), but that's on a case by case basis. After reading this entire thread again I can understand why the use cases call for version ranges. However I still do not believe they are necessary in go, when using tooling like |
Note: using something like govendor + vendor/ you would only have a single copy of "p" in use anyway, so there wouldn't be a state where a was using p @ 2.4.1 and b was using p @ 2.4.2. When you vendored them you would pick a version to record+copy or the tool would error and you would have to resolve it. |
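The "pick one version or error" behaviour described in the note above can be sketched as follows. This illustrates the idea only and is not govendor's implementation; `Dep`, `Flatten`, and the sample paths are hypothetical.

```go
package main

import "fmt"

// Dep says that RequiredBy wants Path at Revision.
type Dep struct {
	Path       string
	Revision   string
	RequiredBy string
}

// Flatten builds the single path-to-revision mapping that one vendor folder
// can hold. It returns an error as soon as two requirers disagree, leaving
// the choice to the developer.
func Flatten(deps []Dep) (map[string]string, error) {
	chosen := map[string]Dep{}
	for _, d := range deps {
		if prev, ok := chosen[d.Path]; ok && prev.Revision != d.Revision {
			return nil, fmt.Errorf("conflict on %s: %s wants %s, %s wants %s",
				d.Path, prev.RequiredBy, prev.Revision, d.RequiredBy, d.Revision)
		}
		chosen[d.Path] = d
	}
	out := make(map[string]string, len(chosen))
	for path, d := range chosen {
		out[path] = d.Revision
	}
	return out, nil
}

func main() {
	deps := []Dep{
		{Path: "example.com/p", Revision: "2.4.1", RequiredBy: "example.com/a"},
		{Path: "example.com/p", Revision: "2.4.2", RequiredBy: "example.com/b"},
	}
	if _, err := Flatten(deps); err != nil {
		fmt.Println(err)
	}
}
```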
Yep, that's all they do. Wanting any more from them is a poor expectation in the first place. That does not make them unuseful. It just means that instead being a tool for the machine to use in making a final decision about what works, they're a tool to help you go through the process of figuring out what works.
Any proposed solution, including one with version ranges, still requires you to make such a choice. The only difference is that, when version ranges are permitted, the machine can help you with that choice; it needn't just make it for you and pretend everything is fine. That choice is hard. It will always be hard. The real benefit of the ranges is the additional information such ranges can express when you're dependent on a library (or two, or three) where one of these situations arises. If all they have is the commit id that they're pinned to, you (typically) have no idea why they're pinned to that version, and so have to go in and understand their code well enough to figure out whether or not you can move them to a different version. If, however, they can specify a version range, then you're taking advantage of the fact that their knowledge > your knowledge when it comes to their library. Again, these are difficult decisions - I can't understand why you'd want less information on hand to resolve them. Sure, it's possible that the lib author did a bad job and put in an unhelpful or incorrect version range, but:
|
@sdboyer I'm not sure how version ranges help the machine make that choice? Version ranges are not required for the machine to help me make that choice. If I can fetch the current code, or any arbitrary revision after the one I have recorded, via vcs then I have everything I need to determine compatibility (aside from a tool to do it). I do think people need to version their packages / repositories though, as it will help a developer make a decision when that tool says that versions (or revisions if version information is missing) 1, 2 and 3 are compatible, but 4 and 5 aren't because the public interfaces / structs / function signatures that the developer's code is using have changed. |
Quoting from the YC post that @sdboyer linked:
^^ That's the same point I'm making. It's the same conclusion that Composer reached for PHP. It's the same conclusion that Ruby reached. It's the same conclusion that the Glide team reached for Go, after fighting that conclusion for a while. So if the languages that have built successful packaging tools have all reached the same conclusion (version range manifest file on libraries, pinned lock file on applications), what about Go is so inherently different that it shouldn't adopt a known-successful model? I don't mean about Go's status quo today (the status quo is obviously insufficient in this regard or we wouldn't be having this conversation), but what is intrinsic to Go that makes it so different? That's what I don't get. When we know there's a model that's proven to work, gives everyone the flexibility they want/need, and solves the problem space successfully, why wouldn't we go with that and benefit from everyone else's experience? (Inquiring baby Gophers want to know!) |
To me, in the end, it's about API compatibility, which computers are much better at figuring out than humans. With Go I believe we could use code analysis to determine API compatible versions and then let the developer choose which one to vendor instead of guessing. I'm fine with package meta data version ranges if it's just used by developers to provide hints on what to choose to do during a conflict: Will I need to edit my code, my deps code, their deps code or some combo of all of them. In the end whatever standard gets adopted I'll have to support it, so here's hoping I've made my case. Here is my original response: https://gist.github.com/freeformz/bd0d167dece99e210747. I aborted it though since I felt we'll just keep talking past one another. |
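To make the code-analysis idea concrete, here is a deliberately small Go sketch using the standard `go/parser` package. It compares two versions of a source file and reports exported top-level functions that disappeared. It checks names only (not signatures, types, methods, or behaviour), so a non-empty result shows an API break while an empty result says nothing about compatibility. The helper names and sample sources are made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// exportedFuncs returns the names of exported top-level functions in src.
// Methods (functions with a receiver) are skipped.
func exportedFuncs(src string) (map[string]bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	names := map[string]bool{}
	for _, decl := range f.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if ok && fn.Recv == nil && fn.Name.IsExported() {
			names[fn.Name.Name] = true
		}
	}
	return names, nil
}

// Removed lists, in sorted order, the exported functions present in oldSrc
// but absent from newSrc.
func Removed(oldSrc, newSrc string) ([]string, error) {
	oldNames, err := exportedFuncs(oldSrc)
	if err != nil {
		return nil, err
	}
	newNames, err := exportedFuncs(newSrc)
	if err != nil {
		return nil, err
	}
	var gone []string
	for name := range oldNames {
		if !newNames[name] {
			gone = append(gone, name)
		}
	}
	sort.Strings(gone)
	return gone, nil
}

func main() {
	oldSrc := "package p\nfunc Get() {}\nfunc Put() {}\nfunc helper() {}\n"
	newSrc := "package p\nfunc Get() {}\n"
	gone, err := Removed(oldSrc, newSrc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("removed exported functions:", gone)
}
```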
Two more related use-cases:
|
@Crell I agree that applications should pin/copy and "libraries" (packages) should use version ranges. I agree that it is good if packages are released. The difference is static analysis and GOPATH. If the application should pin dependencies, then a design file isn't required for application, just the revision and specific version it uses. If the "library" should contain version ranges it should have a version range for each dependency it uses. Now let me constrain the problem of version ranges into two categories: (1) "I want my package to use a compatible API", and (2) "I want my package to use all the required features it needs". (Remember your engineering design class, user stories must not contain a technical implementation or technical requirements). In Go you can denote API compatibility with either a unique import path or a "major" release tag. In order to satisfy compatibility, you cannot remove a feature or API once added. If package authors choose to give a unique path to each "major" release, the feature set is a function of the statically knowable API or just the revision time. If a package author just uses a tag, then all we need to know is what the version tag is currently to know the major version we need. And if we can just use the current version as a range spec, then that is machine discoverable, again removing the need for a human editable design file.
If a package author really had an exceptional amount of knowledge of a needed package version range or wanted to blacklist a particular version, it would be trivial to add a field with a well defined interpretation of that field for human use that could be presented to any down-stream users of the package. The main difference between glide and what I'm proposing here is I'm letting the machine do more of the work. If you want to write the design file yourself for everything, that seems silly to me, but again fine. I continue to see no technical reason why we could not write versions and version ranges to the same file. |
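One way to read "use the current version as a range spec" is: accept any release with the same major version that is not older than the one currently recorded. The Go sketch below encodes that reading; it is an interpretation for illustration, not a rule taken from the vendor-spec, and it handles only plain `MAJOR.MINOR.PATCH` tags with an optional `v` prefix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a plain MAJOR.MINOR.PATCH version.
type Version struct{ Major, Minor, Patch int }

// Parse reads tags such as "v2.4.1" or "2.4.1". Pre-release and build
// metadata suffixes are not supported in this sketch.
func Parse(tag string) (Version, error) {
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("want MAJOR.MINOR.PATCH, got %q", tag)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil || v < 0 {
			return Version{}, fmt.Errorf("bad component %q in %q", p, tag)
		}
		n[i] = v
	}
	return Version{n[0], n[1], n[2]}, nil
}

// Satisfies reports whether candidate shares pinned's major version and is
// not older than pinned.
func Satisfies(pinned, candidate Version) bool {
	if candidate.Major != pinned.Major {
		return false
	}
	if candidate.Minor != pinned.Minor {
		return candidate.Minor > pinned.Minor
	}
	return candidate.Patch >= pinned.Patch
}

func main() {
	pinned, err := Parse("v2.4.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, tag := range []string{"v2.4.5", "v2.3.9", "v3.0.0"} {
		c, err := Parse(tag)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s acceptable: %v\n", tag, Satisfies(pinned, c))
	}
}
```

A blacklist field for individual bad versions, as suggested above, would be an extra check layered on top of `Satisfies`.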
@kostya-sh - re: binary deps, my gut is that that's mostly, though not completely, orthogonal, as we've mostly been focused on getting and arranging source code here. I'd have to research that more, though. If I'm understanding your first use case, then yep, that makes a lot of sense. If I'm understanding the second use case, then I have the same question as I've asked before: why do you care about getting rid of code that the compiler is going to ignore, anyway? I think our positions are actually quite close, though yes, we're talking past each other. That's at least partly my fault - I was assuming the disconnect was over a lack of understanding as to what performing a resolution with a range would actually look like, and so was trying to clarify that. But, looking at your gisted response, I think maybe we've reached the kernel of it:
Sadly, code analysis + revision history cannot do that. (If they could, I'd agree with you - no question, they'd be the way to go) At best, they can determine that code is not incompatible, not that it is compatible. Annoyingly, these are different things. Here's an example. All of which should be taken to mean that static analysis is certainly helpful, but not sufficient, for answering this question. Trying to make it sufficient brings you into a full-contact brawl with type theory (on which I'm still quite a newbie) as you try to compute type equivalencies. That's not what Go's type system was designed to do - but it IS a goal of Hindley-Milner-like type systems (of which some variant is used in langs like Rust, Haskell, OCaml, SML). So yes, Go is different: its type system is simplistic, but sound, and that was very much the goal (as I understand it). Trying to do too much more will be swimming upstream against the design. The reason I advocate for version ranges is because they are a sufficiently flexible system to accommodate both the helpful insights from the static analysis you want, and the insights about logical compatibility that an upstream author is more likely to have. Run your tool, and encode the results, along with whatever else you know, into a version range. We're talking past each other because we're imagining...well, I guess different workflows, though I'm loath to call it that. The article I'm writing tries to break it down into necessary states and necessary phases, largely without regard for workflow. We'll see how that pans out. |
Yep, probably could. But "could" isn't the question. "Should" is the question. |
As @mattfarina mentioned many times it is important that the spec addresses as many real use cases as possible. This is a real use case describing how some developers vendor their dependencies (vendoring sync2 package from vitess repository has been described in this issue discussion). Besides many golang.org/x repositories contain multiple packages that can be used independently (e.g. I guess the main reason for doing this is efficiency. If I decided to check in vendored dependencies to my application repository I would rather check in a 100kb client library than the whole 10Mb of source code. Additionally some VCSes (e.g. Subversion) are quite efficient at checking out a single directory (unlike Git). This might speed up build times in cases when vendored dependencies are checked out at build time. It is also not very difficult to come up with a scenario when checking out the whole repository simply won't work. E.g. if I want to use two different packages from the same repository pinned to different revisions. To be honest I don't care too much what the final spec will look like but it would be unfortunate if some of the use cases I described wouldn't be covered. |
Semver may not catch any of those, but code analysis would at least catch the v2 issue (as you stated code analysis can only tell me what's incompatible). Tests, as you, me and/or others have pointed out above would be required to catch the v3 issue, semver or not. This is the crux of our disagreement AFAICT: You have faith in semver being meaningful beyond stating intent. I don't. In my mind semver is just intent and I would prefer to consider actual API changes and leave the rest to integration testing. We both view the world very differently apparently. Your article will be an interesting read for me I'm sure. :-) I would love to get some sort of higher throughput (video / in person / etc) discussion wrt this issue. It's obvious that we all care deeply about it. Barring that I'll probably start bringing it up with every go developer I cross paths with. |
I love the great conversation over the past couple days. @freeformz I agree that some form of video, in person, or other better method of discussion would be useful. Let's see if we can figure out how to get that going. I'm happy to start figuring out the logistics of that. To add some thoughts to the ongoing commentary:
In this problem space there are, at least, a couple distinct roles. Those who produce a package and those who consume it. If I were going to prioritize them I would prioritize the consumer slightly over the producer. What do y'all think of that? |
|
@kostya-sh - ah right, yes, sorry. I'm always going to struggle with splitting up an upstream repository, because it undermines commit atomicity of the upstream repository - and given how hard a problem space this is to build something both sane and usable, I like taking advantage of every bit of upstream information we can get. I don't think I still struggle with the performance argument, though. It seems to me that exploring caching more would be preferable over carving up what amounts to generated code. Particularly for Go, where it's not necessary to fetch those packages beyond the build server (unlike an interpreted lang). And if the build server is ephemeral (e.g., hosted CI), at least some of them provide support for caching across ephemeral instances. So, I can entirely see being convinced about it. But some (not all) of what I've seen about that so far seems to amount to complaints that "the tool doesn't currently do as well as I can manually." Well, of course not. But...cmon. Disk is very cheap. Network is relatively cheap. There is a point where it becomes preferable to eat it on those in order to reduce complexity of a real implementation.
And even tests aren't sufficient, of course (Dijkstra: "Testing can only prove the presence of bugs, never their absence!"). But yes, you're absolutely right - semver ranges carry no guarantees whatsoever. They could be completely right, or completely wrong. What's important is that they're not mutually exclusive with static analysis. If you're working and pull in a new project (A), which in turn has a dependency on another project (C) specified in a range, but you already had another dep (B) which also had a dependency on C, then when attempting to resolve the diamond, your tooling should ABSOLUTELY run static analysis on the A->C relationship to ensure that all the versions the semver range indicates are acceptable, actually are. Because yes - you shouldn't just take A's maintainer at their word. You'd be no better off than we are now in the unreasonable "just ensure tip always works" world. So, let's say that So you go in, do the work, and figure out that A is actually incompatible with Cv3, but is compatible with Cv4. This work you just did is extremely valuable. It should be recorded, so that no one ever has to do it again. Which you can do by filing a patch against A that further restricts the semver range to exclude v3. And now, when the next user of A comes along, they'll never hit that v3 pothole. They'll never even need to know it exists. (And the FLOSS cherubim sing.) I think we all understand that there's a ton of uncertainty in software development. Superficially, semver may appear to just blithely ride that uncertainty train, or even make things worse. But all it's actually doing is taking a whole lot of awful, complicated shit that can happen, and providing a lens for seeing it all within a single field of view. (If you’re a fan of Cynefin, semver is an excellent example of an organizing system that moves a problem out of the complex space, into the complicated space.) 
While our individual builds must be reproducible and deterministic, the broader ecosystem will always be (in practice, from any one person's perspective) uncertain. All real software must constantly move back and forth between these two spaces. Semver facilitates a process by which we can inject more certainty into the ecosystem incrementally, as we learn more about it.
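To make the "narrow the range once you learn something" step concrete, here is a minimal Go sketch. It is not how any existing tool represents ranges: the `Range` type, its fields, and the restriction to whole major versions are all assumptions made for illustration.

```go
package main

import "fmt"

// Range is a deliberately simplified, hypothetical constraint that only
// looks at major versions. Real semver ranges are richer than this.
type Range struct {
	Min, Max int          // inclusive bounds declared by the maintainer
	Exclude  map[int]bool // versions later found to be incompatible
}

// Allows reports whether version v satisfies the range.
// Reading from a nil Exclude map is safe and yields false.
func (r Range) Allows(v int) bool {
	return v >= r.Min && v <= r.Max && !r.Exclude[v]
}

// Narrow returns a copy of r that additionally excludes v, recording
// what was learned without mutating the original declaration.
func (r Range) Narrow(v int) Range {
	ex := make(map[int]bool, len(r.Exclude)+1)
	for k, b := range r.Exclude {
		ex[k] = b
	}
	ex[v] = true
	return Range{Min: r.Min, Max: r.Max, Exclude: ex}
}

func main() {
	declared := Range{Min: 2, Max: 4} // A claims to work with C v2 through v4
	patched := declared.Narrow(3)     // analysis or tests showed v3 is broken

	for v := 2; v <= 4; v++ {
		fmt.Printf("C v%d: declared=%v patched=%v\n", v, declared.Allows(v), patched.Allows(v))
	}
}
```

The point of returning a new value from `Narrow` is that the narrowed range is what would be sent upstream as a patch against A, so later users resolve against the corrected declaration.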
Most people do :) Though I still tend to think, in this regard, maybe not so far off.
With any luck! Discussing over here has gotten me enmeshed in too much detail over there now, I think...I'm a bit stuck. Trying to pull back from the trees for the forest. Hopefully I'll have it done soon.
+1 from me. |
My understanding of where we stand is as follows: I would like to try to determine a single file that might allow different workflows to work together using a single format. People on the Glide team don't want that because it would be a suboptimal design, it would be different than other languages, and copying the version range from a tools design file to the standard lock file would "hugely complicate the tool". Here is my response to @mattfarina 's use cases:
So of the user stories you wrote down that relate to this issue, I really don't have a problem with them. I continue to not understand why vendor ranges can't live in a vendor-spec (lock typeish) file for those who wish to use them. |
I've been talking to a lot of people about this, both Gophers and not and of course opinions are all over the place. I think I've come to the conclusion that semver+ranges are important socially more so than anything else. ATM a lot of packages don't release versions and/or change things up drastically on master at times. So basically anything that forces people to think more about releases is ++. With that said, my opinion atm, is that ranges should be limited to non main packages/libraries. |
Not sure if this conversation has moved on elsewhere, but I enjoyed reading through it as it is at the heart of the exact problems I have been struggling to deal with. Feel free to point me elsewhere if it has moved on in the last month. @kardianos I would disagree with that This is precisely the problem I keep having. Our product is using a 3rd party dependency (doesn't even matter really, happens with internal ones too), there is a bug or hotfix in that dependency affecting our product, that has to be fixed immediately and release a new version of our product. One of the typical ways you do this is to fork the dependency, fix the bug, build the product using the forked dependency and release. You then push the change upstream and close the loop later of having your application switch back to the mainline after it is merged. I know there are multiple ways to solve this, but the easiest way would be to update a spec file that says use the following URL (fork) for import X;
I want to be able to make the changes on the forked dependency, make the fork available. Then update the application using the dependency. Ideally, all I should need to do is update a spec file that says, use version blah of this dependency. Since golang ties source, import paths, and other things related to projects so tightly together, it hinders these pivot points that almost every other language provides. This is most evident when it comes to 'what is a version' of a dependency. Because golang ties the import path to the repo home (URL) of the dependency it implies that the version of a dependency is only within the scope of that repo URL. I believe that to be not ideal. A version of a dependency should be an 'instance' of that dependency, and an 'instance' of that dependency should be able to originate from multiple places, and thus that origin should be part of the scoping of the version. In golang, we are saying that origin should be URL addressable so it can be retrieved as an import. That would allow using forks. |
btw I do realize that the spec formats proposed in govendor and glide both address this origin aliasing capability. Was bringing the point above up more out of that I believe it to be a primary use case for using a manifest file for specifying dependencies. |
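As a rough illustration of the origin aliasing described above, the following Go sketch keeps the import path stable while letting a manifest entry point the fetch at a fork. The `Dep` type, its field names, and the example.org paths are hypothetical and are not taken from govendor or glide.

```go
package main

import "fmt"

// Dep is a hypothetical manifest entry. Path is what source files import;
// Origin, when set, is where the code is actually fetched from.
type Dep struct {
	Path     string
	Origin   string
	Revision string
}

// FetchSource returns the location a tool would download from:
// the fork if one is configured, otherwise the import path itself.
func (d Dep) FetchSource() string {
	if d.Origin != "" {
		return d.Origin
	}
	return d.Path
}

func main() {
	deps := []Dep{
		// Normal case: fetch from the canonical location.
		{Path: "example.org/upstream/lib", Revision: "aaa111"},
		// Hotfix case: same import path, fetched from a fork.
		{Path: "example.org/upstream/lib", Origin: "example.org/myfork/lib", Revision: "bbb222"},
	}
	for _, d := range deps {
		fmt.Printf("import %q fetched from %q at %s\n", d.Path, d.FetchSource(), d.Revision)
	}
}
```

Switching back to mainline after the fix is merged upstream then amounts to clearing the origin and updating the revision, with no import rewriting in the application.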
Finally finished the article I kept mentioning. |
@sdboyer I finished reading the article you wrote. I'm having a hard time getting past the "LOLZ CATZ" tone in it. There are many assertions of fact. For instance, I believe Dave's proposal was not accepted not because people don't want to encourage semver, but because it wasn't actionable by any mainline go tool. I commend Dave for the proposal, but presenting Dave as the valiant hero who was shot down without good cause doesn't do anyone any good. I think most of the technical points present in the article have already been presented here. Though from the writing style it is difficult for me to unravel when you are presenting a point of view, an assertion of fact, or a proposal for action; I may have not accurately understood everything you intended to convey. A few responses:
Some of your points don't seem to be founded in actual issues: you have a paragraph emotionally targeting people who don't think we need reproducible builds. In the Go ecosystem I don't see that attitude to begin with, so even aside from your tone, there isn't anything to be argued there: we all want reproducible builds at some level depending on our exact needs. You do offer a good summary of different issues present in specifying version ranges and a good point in that the developer can treat them as a suggestion and override them. Thank you for your work on glide. I would encourage you to continue exploring what benefits you can get from doing static analysis on a project's dependencies that can augment or assist a manually created list of declared dependencies. I don't see this issue going forward and will probably close it soon. In govendor this conversation has pushed me to plan to support version ranges despite the pain I've seen them bring. I already plan to support directly fetching remotes and that is closer than it was before. |
There are a variety of strategies out there for getting people to read almost 13000 words. You get to make your stylistic choices, I get to make mine. The substantive points remain.
I think that's an inference you made, not something I said. I simply said that it failed; I didn't say why.
I've amended the wording there to be explicit that it failed because it lacked concrete outcomes, but again, I don't think I actually said that. What I DID say was that it probably wasn't incorrect that it failed. The valiant-ness refers to the willingness to jump into what was sure to be a fractious discussion. I'd ascribe the same to you for this thread, even though I don't agree with your approach.
And I said as much. In fact, I was quite careful about saying it. What I said was that monorepos were harmful for sharing - not that they should be neglected by a tool.
Not much to say here except that I don't think you really understood the constraints presented in the article.
The differences are not so big, as...well, the entire article more or less lays out. But directly to your point: Cargo/Rust. But again, now for the third time, this isn't inconsistent with what I wrote. Right from the outset, I indicated that
Again, now for the fourth time...this is basically the text on one of the captions.
Yep. That's why I didn't touch this in the Go section, but only in the general section. @bradfitz outlined this preference a year ago. It doesn't change my stance on what the right general decision is, of course, but it's a relatively minor issue that would have distracted from the main point. Ironically, using a non stdlib library for tooling is the kind of thing having a proper PDM would make easier.
I do indeed. In part for levity, and in part because, as I was explicit about in paragraph three, the article is targeted at more than just Go. So yes, that is an actual issue - just not for Go.
Nor do I see that attitude. ...and, also, I said as much in the article:
The value of including it all, even the stuff that doesn't immediately narrowly apply to your particular language of concern, is that it can help expand your perspective on what the overall problem looks like. Which was the high-level goal of the article.
Thanks. I'm glad you found that useful.
That's a shame; per the article, my sense is that we could indeed make incremental progress by defining a proper lock file. Perhaps it would be best to start a clean issue for that, though. |
Proposal: Vendor specification and experimental repository fetch code
Author(s): Daniel Theophanes
Last updated: 2015-12-06
Abstract
Establish a specification file format that lists dependency revisions and
a package in the golang.org/x/exp repository that discovers, reads, and downloads
packages at a given revision. Tools may continue to use other formats to generate
this file.
Background
Many developers wish to specify revisions of vendor dependencies without copying
them into the repository. For a case study I will bring up two:
A) https://github.com/cockroachdb/cockroach
B) https://github.com/gluster/glusterd2
(A) uses github.com/robfig/glock, which specifies revisions for each remote
repository in a file in the project root called "GLOCKFILE". A partial list of the file is:
(B) uses github.com/Masterminds/glide, which specifies revisions for each remote
repository in a file in the project root called "glide.yaml". This file contains:
I would like to point out a few features these tools provide:
Right now each vendor tool specifies these same properties in different formats.
A common tool cannot be built that reads a single file and downloads the needed
dependencies. This isn't a huge burden on a dedicated developer, but for a
user passing by who just wants to build the source quickly, it is an impediment.
Proposal
I propose specifying a single file format that will describe packages sourced
outside the project repository. I also propose adding a package to the
golang.org/x/exp repository that discovers, reads, and optionally downloads
third party packages.
Furthermore I propose using the specification found at
https://github.com/kardianos/vendor-spec with one addition as the basis for this
specification. The addition is:
Both the specification and the proposed package will be considered experimental
and subject to change or retraction until at least go1.7. This process will be
done with an eye to possibly adding this feature to go get.
Rationale
The vendor file format must be readable and writable with standard
go packages. This adds to the possibility that go get could fetch packages
automatically.
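As a rough sketch of what "read with standard go packages" can look like, the example below decodes a small JSON vendor file using only encoding/json. The struct layout and field names ("package", "path", "origin", "revision", "tree") are assumptions modelled loosely on the vendor-spec discussed here, and the example.org paths and revisions are made up; consult the specification itself for the real field set.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// VendorFile is an assumed, simplified shape for a vendor file.
type VendorFile struct {
	Comment  string    `json:"comment,omitempty"`
	Packages []Package `json:"package"`
}

// Package is one vendored dependency entry (field names are assumptions).
type Package struct {
	Path     string `json:"path"`             // import path
	Origin   string `json:"origin,omitempty"` // optional alternate source
	Revision string `json:"revision"`         // pinned VCS revision
	Tree     bool   `json:"tree,omitempty"`   // include everything below Path
}

// ParseVendorFile decodes data and rejects entries that lack the two
// properties every tool in this discussion needs: a path and a revision.
func ParseVendorFile(data []byte) (*VendorFile, error) {
	var vf VendorFile
	if err := json.Unmarshal(data, &vf); err != nil {
		return nil, fmt.Errorf("decode vendor file: %w", err)
	}
	for i, p := range vf.Packages {
		if strings.TrimSpace(p.Path) == "" || strings.TrimSpace(p.Revision) == "" {
			return nil, fmt.Errorf("package entry %d: path and revision are required", i)
		}
	}
	return &vf, nil
}

// sample is a made-up vendor file used only for this illustration.
const sample = `{
  "comment": "hypothetical example",
  "package": [
    {"path": "example.org/lib/client", "revision": "0123abc"},
    {"path": "example.org/tool", "revision": "4567def", "tree": true}
  ]
}`

func main() {
	vf, err := ParseVendorFile([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, p := range vf.Packages {
		fmt.Printf("%s @ %s (tree=%v)\n", p.Path, p.Revision, p.Tree)
	}
}
```

Because only the standard library is involved, any tool, including go get itself, could in principle consume such a file without taking on a third-party parser.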
Vendor tools exist today that download packages from a specification. They are just
incompatible with each other despite using the same information to download the
dependencies. If we can agree on a single format for tools to write to, even if
it isn't the primary format for that tool, all tools and possibly go get can
download dependencies.
Existing vendor tools and their formats don't always handle corner cases or
different approaches. For example current tool file formats can't handle the
case of vendoring a patched version of a standard library package (this
would have been useful for crypto/tls forks for detecting the Heartbleed
attack and for accessing MS Azure).
I am proposing a file format that "govendor" uses. I'm not trying to put my own
tool as central. In fact, "govendor" was built to validate the "vendor-spec"
proposal. The "vendor-spec" has received significant external contributions
and as such "govendor" has changed to match the spec (and will continue to do so).
Compatibility
This will be a standardization of existing practices. There are no go1 compatibility
issues. Existing tools can treat the specification as a write only file.
Implementation
A file format to describe vendor packages should be accepted when this
proposal is accepted. Should this proposal be accepted a new package
should be added to the "golang.org/x/exp" repository to support reading
the vendor file and downloading packages. The author of this proposal
offers to create or assist in creating this package. This would be created
within 2 months of the proposal being accepted.
Risks
It would be ideal if other vendor tool package authors could agree to at least
write to a standard file format informally and collaboratively. Indeed the largest
risk is if vendor tools fail to write the common file format. However I think
unless there is a tangible benefit (such as go get support) there will continue
to not be a reason to collaborate on a standard.
Open issues
The proposed standard file format uses JSON, which might be better than XML, but
harder to write by hand than something like TOML. Tools that want the vendor file
to be hand created will be forced to generate this file from a different file.
The file format specifies packages, not repositories. Repositories can be specified
by using the root path to the repository and specifying "tree": true, but it isn't
the default for the format. Some people may take issue with that as they are used
to or desire tools that only work at the repository level. This could be a point
of division. From experience I absolutely love vendoring at the package level
(this is what github.com/kardianos/govendor does by default).
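The package-versus-repository distinction can be sketched in a few lines of Go. This is one possible reading of "tree": true, stated as an assumption: without it an entry covers exactly one package, with it the entry covers that path and everything beneath it. The `Entry` type and the example.org paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical vendor file entry reduced to the two fields
// that matter for this comparison.
type Entry struct {
	Path string
	Tree bool
}

// Covers reports whether importPath is provided by this entry.
// A package-level entry matches only its own path; a tree entry also
// matches sub-packages, respecting path element boundaries.
func (e Entry) Covers(importPath string) bool {
	if importPath == e.Path {
		return true
	}
	return e.Tree && strings.HasPrefix(importPath, e.Path+"/")
}

func main() {
	pkgLevel := Entry{Path: "example.org/repo/client"}
	repoLevel := Entry{Path: "example.org/repo", Tree: true}

	for _, p := range []string{
		"example.org/repo/client",
		"example.org/repo/server",
		"example.org/repository",
	} {
		fmt.Printf("%-26s package-level=%v repo-level=%v\n", p, pkgLevel.Covers(p), repoLevel.Covers(p))
	}
}
```

Appending "/" before the prefix check keeps a tree entry for example.org/repo from accidentally claiming an unrelated path such as example.org/repository.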