Skip to content

replace cargo.rs with cargo #855

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jun 5, 2018
Merged

replace cargo.rs with cargo #855

merged 12 commits into from
Jun 5, 2018

Conversation

kngwyu
Copy link
Collaborator

@kngwyu kngwyu commented May 19, 2018

Current racer's file search has many problems.
Sometimes it fails to find dependency(see comments in #551), and it causes too much recursion(#831).
To resolve these problems, I replaced cargo.rs with cargo crate.
For clients which does sync io, this change can cause 2~3s delay when downloading dependencies, but other than that, it works well.

To work with cargo, current tests implementation have some problems.

  • Currently we use env::set_current_dir for test_project, but it doesn't work with cargo.
  • It uses our own TmpDir implementation, which creates directory under racer/. So if a test panics it leaves tempdir in our normal file system. It's a bit annoying.

So I implemented new with_test_project function which copies whole files into tempdir and then run tests in that directory. And I replace racer's TmpDir with tempfile crate and remove sync! macro in each tests. Each test directories have unique names so there's no need to lock threads. Now tests are much faster.

In addition, I want to leave some notes about dependency caching.
Here we have cached dependencies for each manifest files(i.e. Cargo.toml), because racer calls get_crate_file many times in a session.
Compared to @birkenfeld's way in #838, it has 2 differences.

  1. It doesn't have caches of crate roots. It call's cargo::util::important_paths::find_root_manifest_for_wd everytime. This is because I think it can't be bottleneck.

  2. It has dependency cache in Session, not in thread_local global variable. It can cause decrease in speed for clients using racer as library like rls. This is because I think to use thread_local make difficult to provide stateful APIs in the future, but I have no strong opinion about this.

And here's a result of benchmark, though it's not so general(We need more practical bench, but it will be a future task).
Before

test benches::bench_code_chunks     ... bench:      86,014 ns/iter (+/- 5,202)
test benches::bench_iter_stmts      ... bench:     206,687 ns/iter (+/- 35,788)
test benches::bench_mask_comments   ... bench:      30,087 ns/iter (+/- 4,644)
test benches::bench_mask_sub_scopes ... bench:     127,966 ns/iter (+/- 23,547)

After

test benches::bench_code_chunks     ... bench:      87,280 ns/iter (+/- 5,913)
test benches::bench_iter_stmts      ... bench:     222,251 ns/iter (+/- 22,692)
test benches::bench_mask_comments   ... bench:      30,376 ns/iter (+/- 3,099)
test benches::bench_mask_sub_scopes ... bench:     127,343 ns/iter (+/- 14,004)

@kngwyu
Copy link
Collaborator Author

kngwyu commented May 19, 2018

Oh appveyor's TARGET=i686-pc-windows-gnu build fails with an odd error.
@TedDriggs
Could you please take a quick look at build log?

@kngwyu kngwyu requested review from TedDriggs, nrc and birkenfeld and removed request for birkenfeld, nrc and TedDriggs May 21, 2018 08:29
@birkenfeld
Copy link
Collaborator

@kngwyu I was wondering about doing this while overhauling cargo.rs - but I thought that the disruption of updating metadata might be undesired for racer.

If it is deemed acceptable, I'm all for this, a more detailed review will follow.

@TedDriggs
Copy link
Contributor

@kngwyu I no longer have access to a local Windows box. The error suggests that it was looking for make - I don't recall that being present in the past.

@kngwyu
Copy link
Collaborator Author

kngwyu commented May 23, 2018

@TedDriggs
Thanks. I'll try to build again in my account.
@birkenfeld

but I thought that the disruption of updating metadata might be undesired for racer.

Very good point. My implementation allow cargo to write Cargo.lock file. It can be annoying for some clients like RLS.
So I decided to modify implementation not to write lock file, by calling resolve_with_previous directly.
Please do not merge this PR until I finished change.

@kngwyu kngwyu changed the title replace cargo.rs with cargo [Do not merge] replace cargo.rs with cargo May 23, 2018
@kngwyu kngwyu changed the title [Do not merge] replace cargo.rs with cargo replace cargo.rs with cargo May 23, 2018
@kngwyu
Copy link
Collaborator Author

kngwyu commented May 23, 2018

I think there's 3 ways to use cargo for dependency resolving.

use resolve_ws_precisely

Current implementation

  • pros
    • we don't have to build our package to use racer
  • cons
    • sometime racer downloads dependencies from crates.io, and operation delays
    • racer writes ~/.cargo/registory and Cargo.lock

use resolve_with_previous

  • pros
    • same as above + racer don't write Cargo.lock file
  • cons
    • same as above - racer writes Cargo.lock file
    • implementation is a bit complex

use load_pkg_lockfile

  • pros
    • racer writes nothing in our file system
  • cons
    • We have to build our package before trying to complete

Personally, I prefer to first way, because it can reduce build time. And, I think for stand-alone clients to write Cargo.lock and registry is not problematic.
But, for rls, which calls cargo internally, this can cause operation delay, because cargo waits until file lock is unlocked, though it will be at most 2~3s.
@nrc
Do you think allow racer to write Cargo.lock or registry file can be problematic for rls?

@nrc
Copy link
Contributor

nrc commented May 24, 2018

It would be good to test to make sure. I think that modifying Cargo.lock is problematic because we have to watch that and changing it will cause us to recompile (and a full recompile, not an incremental one). I think that modifying the registry should be fine, because we don't watch that. However, if there are significant changes (thinking that the registry gets out of sync with the lock file), if that causes our builds to error out, it will be a problem (to explain a bit, we try and just call rustc, not Cargo, if Cargo.toml or .lock change we run Cargo and flush our caches, but we don't watch the registry, so if something changes there we might need to change the way we call rustc, but we don't realise that, possibly that could cause errrors).

@kngwyu
Copy link
Collaborator Author

kngwyu commented May 24, 2018

@nrc
Thanks for detailed explanation.
I'll try to modify codes not to write lock file and test it with RLS.

@kngwyu kngwyu changed the title replace cargo.rs with cargo [Do not merge] replace cargo.rs with cargo May 24, 2018
@kngwyu
Copy link
Collaborator Author

kngwyu commented May 26, 2018

OMG our new shiny support for patch.crates-io doesn't work with RLS !!!
But other than that there seems to be no problem with RLS, though I'll debug a little more.
It's my misunderstanding. Support for patch.crates-io works and I saw no critical error, but saw some warnings like

WARN 2018-05-26T13:01:36Z: cargo::util::rustc: failed to calculate rustc fingerprint: probably rustup rustc, but without rustup's env vars

kngwyu added a commit to kngwyu/racer-nightly that referenced this pull request May 26, 2018
kngwyu added a commit that referenced this pull request May 26, 2018
@kngwyu kngwyu changed the title [Do not merge] replace cargo.rs with cargo replace cargo.rs with cargo May 26, 2018
.or_insert_with(|| cargo::get_crate_file(kratename, from_path))
.clone()
/// Get cached dependencies from manifest path(abs path of Cargo.toml) if they exist.
pub fn get_deps<P: AsRef<path::Path>>(&self, manifest: P) -> Option<Rc<Dependencies>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this (and others) be pub(crate) like the function it replaces? when I added get_crate_file this was recommended by @jwilm

Copy link
Collaborator Author

@kngwyu kngwyu May 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's reasonable.
I'll modify it.

/// Cache dependencies into session.
pub fn cache_deps<P: AsRef<path::Path>>(
&self,
manifest: P,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use impl AsRef<Path>? not sure in which camp you are 😉

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Difficult question!
Sometimes I feel T: Bound is good and sometimes not, but I have no clear criteria.

Ok(f) => f,
Err(e) => {
info!(
"[Session::load_lock_file] Failed to load {:?}: {}",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces use of info!() which is not used so far (except for one instance which I believe is oversight and I will make a PR to remove). Is this how we want to communicate errors/warnings?

Does the logging output go to stderr?

Copy link
Collaborator Author

@kngwyu kngwyu May 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference of info! and debug! is just their level. See https://docs.rs/log/0.4.1/log/enum.Level.html for detail. Its destination is defined by logger(in racer or RLS, env_logger).
I used info! here just because I felt it hard to find what I wanted to check because of too many debug! output, when I was debugging with RLS. But this error is not so rare so I'm not sure info! is appropriate level here.

/// Cached dependencie (path to Cargo.toml -> Depedencies)
cached_deps: RefCell<HashMap<path::PathBuf, Rc<Dependencies>>>,
/// Cached lockfiles
cached_lockfile: RefCell<HashMap<path::PathBuf, Rc<Resolve>>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should be plural as well

@@ -9,8 +9,8 @@
#[macro_use] extern crate log;
#[macro_use] extern crate lazy_static;

extern crate cargo;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to re-export cargo for rls?

Copy link
Collaborator Author

@kngwyu kngwyu May 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking that this could avoid getting two different versions of cargo included.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry I misunderstood.
Yeah I think it's better if we can avoid duplicate too, but we need to discuss with @nrc or other RLS developer.


/// get module file from current path & crate name
pub fn get_module_file(name: &str, parentdir: &Path, session: &Session) -> Option<PathBuf> {
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the extra scopes seem unnecessary?

Copy link
Collaborator Author

@kngwyu kngwyu May 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so because now we can get src_paths of stdlib by cargo, but I hadn't tested it yet.

Copy link
Collaborator Author

@kngwyu kngwyu May 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some testing, I think we still need RUST_SRC_PATH for case we have no Cargo.toml file(i.e. when we're editing stand-alone *.rs file out of cargo project)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't mean the whole code, but only the additional {} around.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I misunderstood, I removed {} here.

Cargo.toml Outdated
@@ -21,9 +21,9 @@ doc = false
debug = true

[dependencies]
cargo = { git = "https://github.com/rust-lang/cargo", rev = "22c0f22f5621f1b68558d80a0966d073b9b68932" }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes racer unpublishable, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no but I'm not sure, I have to check it.

libname, from_path
);

// we have to try 2 names(e.g. crate a-b has a target name a_b by rustc)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's even worse, since the crate name may use both? e.g. prefix_some-stupid-name

There should be a mapping somewhere...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I had overlooked these cases!
Hmm... I'll look for these kind of functions in cargo.

let manifest = cargo_try!(find_root_manifest_for_wd(from_path));
debug!("[get_outer_crates] cache doesn't exist");
resolve_dependencies(&manifest, session, |name| {
if libname == name {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the whole closure can be

 libname == name || Some(name) == libname_hyphenated.as_ref().map(|x| &**x)

or

 libname == name || libname_hyphenated.as_ref().map_or(false, |x| x == name)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

bors added a commit to rust-lang/cargo that referenced this pull request May 27, 2018
Make add_overrides and get_resolved_packages pub

For racer-rust/racer#855.
Now  we're planning to use cargo to resolve dependencies, but for RLS we can't write `Cargo.lock`.
So we have to call `resolve_with_previous` directly, and want to use these functions.
@kngwyu
Copy link
Collaborator Author

kngwyu commented May 28, 2018

Now ready to merge

@matklad
Copy link
Contributor

matklad commented May 28, 2018

@kngwyu would it be possible to use cargo metadata sub command instead of linking to Cargo directly?

@matklad
Copy link
Contributor

matklad commented May 28, 2018

Hm, if cargo metadata updates Cargo.lock, that means that the current lockfile is outdated anyway, so it has to be updated before the compilation anyway.

But, for rls, which calls cargo internally, this can cause operation delay, because cargo waits until file lock is unlocked, though it will be at most 2~3s.

Note that writing lockfile can really take unbounded time: if the project has a lot of git dependencies, Cargo needs to clone all those repositories, and that can take a lot of time. That said, you'll need to clone those dependencies to provide correct completions.

I don't fully understand the constraints of racer and rls, but I imagine one way this could work is to have, inside racer, some kind of API "give me information about the project". Than, the stand-alone racer would use cargo metadata, and for racer inside rls, rls would provide this data using its own knowledge of the project (which it gets from Cargo). This way, standalone racer could avoid linking to Cargo, and for racer inside rls, we get the gurantee that rls is single source of truth.

@kngwyu
Copy link
Collaborator Author

kngwyu commented May 28, 2018

@matklad

I imagine one way this could work is to have, inside racer, some kind of API "give me information about the project".

Hmm...
Yeah it's better if we can avoid linking to cargo, but I'm concerned about overhead.
But, for RLS, indeed it's better to use its information to get src_paths.

@matklad
Copy link
Contributor

matklad commented May 28, 2018

but I'm concerned about overhead.

Yeah, that is a valid concern. Parsing lockfile and making sure it is up-to-date is not instantaneous, so it's better to cache cargo metadata output, if possible. On the other hand, I think the overhead of spawning a process and parsing JSON shouldn't be that big, compared to the work you actually need to do.

@kngwyu
Copy link
Collaborator Author

kngwyu commented May 29, 2018

Now I think to have different sources of src_paths (RLS and cargo metadata) is correct way, but I would like to set it as a future task, because I don't know much about RLS so I can't implement this kind of functionality immediately, and now we have many bugs to fix.
@matklad
Thanks for your whole advice, it's really helpful.

kngwyu added 12 commits June 1, 2018 16:53
Currently we use `env::set_current_dir` for test_project, but it
doesn't work with cargo. So I implemented new `with_test_project`
function which copies whole files into tempdir and then run `cargo` in
that directory. In addition, I replace racer`s original TmpDir with
`tempfile` crate and remove `sync!` macro in each tests.
1. add some comments
2. rename cargo_res cargo_try
3. modify to allocate hyphnated string only when it's necessary
And Implement load_lock_file to Session, which load lockfile
via FileLoader.
for rexporting package and patches.crates-io
To reduce build time of RLS
And some minor fixes
Fixed build breakage by rebase
And some minor fixes around logging
Currently when we're searching test_crate crate, we try to check
test_crate and test-crate against package names. But because it can't
handle corner cases like test_crate-one, changed strategy to check
test_crate2 against target-name.replace("-", "_").
In addition, this commit includes some minor fixes mainly around
naming and logging.
Somehow I forgot to remove it when I rebased
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants