RUST: Questions about the C API #738
Hi @andreivasiliu – First, THANK YOU VERY MUCH for working on the initial Rust bindings, and sorry for the long delay! As far as I can see, you created the Rust bindings manually. Did you try to auto-generate the Rust bindings from the header files? I have played around with this and would like to get your feedback on it. Furthermore, there is a Scala implementation of the same Unit API (likewise in Go and NodeJS). Maybe we can find some answers to your questions while looking into this code. I would like to talk with @hongzhidao and @ac000 about your questions. Gentlemen, please feel free to pick a question and share your thoughts. I will do the same. The Rust bindings and the possibilities we will have with those are a great step in the right direction toward wider adoption! Looking forward to seeing this issue grow and be filled with a ton of useful information.
They are created automatically, based on the header files. However, they are only used internally, since the generated bindings are very unsafe to use directly from Rust. The generated bindings use raw pointers that behave like C pointers; Rust code can only use these inside unsafe blocks.
Thank you very much!
Got your point about the bindings, and sorry for missing it in my initial review of your repo. So the goal is clear:
I think the goal here is to have the C API (libunit) documented better. Correct me if I'm wrong, but currently the only sources of information about the C API are the header file (nxt_unit.h) and a blog post about using Unit from assembly language (https://www.nginx.com/blog/nginx-unit-adds-assembly-language-support/). And of course the current language bindings, which are really hard to read.
There is also nxt_unit_app_test.c, which shows how to use the C API and may answer some of the above questions.
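For orientation, the basic shape of a libunit app looks roughly like this. This is only a trimmed sketch modeled loosely on nxt_unit_app_test.c; error handling is mostly omitted and the handler name is made up:

```c
#include <string.h>

#include <nxt_unit.h>

/* Sketch of a request handler; a real app should check every return code. */
static void
greeting_handler(nxt_unit_request_info_t *req)
{
    static const char  body[] = "Hello from libunit\n";

    nxt_unit_response_init(req, 200, 1,
                           nxt_unit_length("Content-Type")
                           + nxt_unit_length("text/plain"));
    nxt_unit_response_add_field(req, "Content-Type",
                                nxt_unit_length("Content-Type"),
                                "text/plain", nxt_unit_length("text/plain"));
    nxt_unit_response_send(req);

    nxt_unit_response_write(req, body, nxt_unit_length(body));
    nxt_unit_request_done(req, NXT_UNIT_OK);
}

int
main(void)
{
    nxt_unit_ctx_t   *ctx;
    nxt_unit_init_t  init;

    memset(&init, 0, sizeof(init));
    init.callbacks.request_handler = greeting_handler;

    ctx = nxt_unit_init(&init);
    if (ctx == NULL) {
        return 1;
    }

    nxt_unit_run(ctx);    /* blocks, dispatching requests to the callback */
    nxt_unit_done(ctx);

    return 0;
}
```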
And this blog explains how another project created a Scala language module using libunit https://blog.indoorvivants.com/2022-03-05-twotm8-part-3-nginx-unit-and-fly.io-service |
Thanks for pointing in that direction; test cases could be useful. However, nxt_unit_app_test.c shows pretty basic usage. I'm more interested in a proper implementation of backpressure while reading the request body and writing the response body; all the test cases read or write data only once.
Zero is never returned, and neither is NXT_UNIT_AGAIN (which I would expect). Interactions like this one should be documented...
I suggest you try and see if it works. I wonder what benefit you'd get from doing that.
My guess is that you can probably do that. But I'm not sure, so you should try it.
Similarly to the above, probably yes.
Probably yes.
Probably yes.
For all of these questions, my guess is that as long as you have the context object, the thread in which you call things doesn't really matter. But again, I'm not sure, and you could try it.
I have only seen the data_handler() being used in Python code. I haven't investigated it. Maybe @ac000 knows something about it.
I need to investigate this function more. But from reading the source code, it seems
I think it is always non-blocking, yes.
Not really. This function returns the size of the read, which works similarly to read(2). It doesn't return NXT_UNIT_AGAIN.
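In other words, a body-reading loop can treat it like read(2). A sketch of what I mean; the helper name and buffer handling here are mine, not from the test app:

```c
#include <sys/types.h>

#include <nxt_unit.h>

/* Hypothetical helper: drain whatever body data Unit has buffered for this
 * request into dst, treating nxt_unit_request_read() like read(2).
 * Returns the number of bytes copied. */
static size_t
read_body(nxt_unit_request_info_t *req, char *dst, size_t cap)
{
    size_t   total;
    ssize_t  n;

    total = 0;

    while (total < cap) {
        n = nxt_unit_request_read(req, dst + total, cap - total);
        if (n <= 0) {
            break;    /* 0 means no more buffered data; never NXT_UNIT_AGAIN */
        }

        total += (size_t) n;
    }

    return total;
}
```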
In which sense is it incorrect?
In fact, you should rarely exit(3), I think. You should pthread_exit(3) or similar from each thread. Only if the entire app is in an inconsistent state should you kill the entire process.
I hope the cleanup will be correct. If not, it's a bug in Unit. You should be able to rely on it. Otherwise, if a thread accidentally dies, the complete unitd could be compromised.
It probably needs to exist. I'd guess that if you done() the main one, and there's any other context object still alive, Unit should kill it, as mentioned right above.
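Combining the pthread_exit(3) point with the secondary-context questions, the per-thread pattern would look roughly like this. A sketch only, assuming each worker thread owns its own secondary context:

```c
#include <stddef.h>
#include <pthread.h>

#include <nxt_unit.h>

/* Worker thread: gets its own secondary context, runs it, and exits on its
 * own without calling exit(3), so the rest of the app keeps serving. */
static void *
worker(void *main_ctx)
{
    nxt_unit_ctx_t  *ctx;

    ctx = nxt_unit_ctx_alloc(main_ctx, NULL);
    if (ctx == NULL) {
        pthread_exit(NULL);
    }

    nxt_unit_run(ctx);    /* returns when Unit tells this context to quit */
    nxt_unit_done(ctx);

    pthread_exit(NULL);   /* ends only this thread, not the process */
}
```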
Yes, that's supported. I tested it here:
Yes, that's supported. Also tested there.
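That usage looks roughly like this. This is a sketch, not the exact code I tested; the start/free/end buffer fields are from nxt_unit.h, and the payload strings are arbitrary:

```c
#include <string.h>

#include <nxt_unit.h>

/* Sketch: two response buffers alive at the same time, sent in the reverse
 * order of their allocation.  Error handling is reduced to early returns. */
static int
send_two_buffers(nxt_unit_request_info_t *req)
{
    nxt_unit_buf_t  *first, *second;

    first = nxt_unit_response_buf_alloc(req, 6);
    second = nxt_unit_response_buf_alloc(req, 3);

    if (first == NULL || second == NULL) {
        return NXT_UNIT_ERROR;
    }

    memcpy(first->free, "first\n", 6);
    first->free += 6;

    memcpy(second->free, "ok\n", 3);
    second->free += 3;

    nxt_unit_buf_send(second);    /* allocated last, sent first */
    nxt_unit_buf_send(first);

    return NXT_UNIT_OK;
}
```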
Don't know. It's only explicitly used in the Java module, and internally in nxt_unit.c. I didn't investigate that. If I learn it, I'll document it.
Yes. See for example the nxt_unit_app_test.c, which calls it in the
It doesn't look like a blocking variant to me. Look at the source code yourself and judge: $ grepc -tfd nxt_unit_response_write
./src/nxt_unit.c:2872:
int
nxt_unit_response_write(nxt_unit_request_info_t *req, const void *start,
size_t size)
{
ssize_t res;
res = nxt_unit_response_write_nb(req, start, size, size);
return res < 0 ? -res : NXT_UNIT_OK;
}
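So a caller that wants non-waiting behavior can call the _nb variant directly. A sketch, assuming min_size is the amount the call must flush before returning (so 0 means "don't wait"), and reusing the negative-error convention visible in the wrapper above:

```c
#include <sys/types.h>

#include <nxt_unit.h>

/* Hypothetical helper: queue as much of the response as Unit will take right
 * now, without waiting.  On success, *written may be less than size and the
 * caller retries later; a negative result from the _nb call is a negated
 * NXT_UNIT_* error code, exactly as the blocking wrapper interprets it. */
static int
write_some(nxt_unit_request_info_t *req, const void *start, size_t size,
    size_t *written)
{
    ssize_t  res;

    res = nxt_unit_response_write_nb(req, start, size, 0);
    if (res < 0) {
        return (int) -res;
    }

    *written = (size_t) res;

    return NXT_UNIT_OK;
}
```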
N/A
It doesn't block. It doesn't even send the buffer, actually. Unit will just put the buffers in the queue for sending. If there's a lot of traffic, it may even end up merging several chunks for a single send. See this test: $ echo -e 'GET / HTTP/1.1\r\nHost: _\n' | ncat localhost 80
HTTP/1.1 200 OK
Content-Type: text/plain
Server: Unit/1.30.0
Date: Sat, 20 May 2023 16:11:46 GMT
Transfer-Encoding: chunked
9b
Hello world!
Request data:
Method: GET
Protocol: HTTP/1.1
Remote addr: 127.0.0.1
Local addr: 127.0.0.1
Target: /
Path: /
Fields:
Host: _
6
barbaz
3
foo
0
I don't know; sorry. Maybe @ac000 ?
No idea. I can see that they are slightly different in the implementation, but they're so complex that I can't tell the actual difference without deep investigation. The commit logs that introduced them are silent about it, so no idea.
Technically yes, since these are just malloc(3) and free(3) wrappers, and the context is only used for logging. The log might then be confusing, but if you expect that one ctx mallocs and another one frees, you should be fine. However, I wonder why you'd do that.
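In code, the point is just this (a toy sketch; main_ctx and worker_ctx are hypothetical context pointers):

```c
#include <nxt_unit.h>

/* Allocate on one context, free on another; both calls end up in
 * malloc(3)/free(3), and the ctx argument only matters for log messages. */
static void
cross_context_alloc(nxt_unit_ctx_t *main_ctx, nxt_unit_ctx_t *worker_ctx)
{
    void  *p;

    p = nxt_unit_malloc(main_ctx, 4096);

    if (p != NULL) {
        nxt_unit_free(worker_ctx, p);
    }
}
```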
You can think of it as Unit's
Sure. It's just a number.
Many thanks!
Rust requires very strict memory and thread safety guarantees in its safe subset of the language. So in Rust, whenever wrapping foreign C code, the unsafe wrapping code (aka my bindings, in this case) must guarantee memory and thread safety under all circumstances; otherwise, miscompilations can occur in the safe subset, as the Rust compiler makes more assumptions there thanks to those guarantees. See the Sync, Send, and UnwindSafe markers for more details.
Ah, sorry, I meant whether I can use it as a meaningful return value from my callbacks, e.g. to tell Unit that it should call my callback again later, because I'm not ready yet, or Unit hasn't given me enough data yet. This is how Nginx handlers work if I remember correctly.
Using just pthread_exit() would be the cleanest, but then I would be permanently reducing the number of worker threads in the process. But I was more interested in the case where there is inconsistent state; the most common reason to use multi-threading is to share cache between threads, and if that gets corrupted, the entire process should exit. At that point, I have two options:
I see. So it is blocking, but only with regard to sending it between processes to Unit, via whatever mechanism that uses (pipe file descriptors and/or shared memory, I can't figure out which). This is important in Rust when creating asynchronous functions (returning Future), which are required to do no blocking I/O.

This behavior also seems to be the same for reading request bodies; from my testing, by the time the app's request handler is called, the Unit server either has the entire request body data from the client, or none of it (i.e. it has just the header). I thought that reading the body data might hang until the client sends more data (which is not the case), or send NGX_AGAIN (which is also not the case).

This also seems to mean that streaming body data from clients is impossible to do with a Unit app. The Unit server will either wait and buffer the entire body data from the client, or give up when the body exceeds 8MB (in which case the client gets a "Request too big" error, and the app never gets anything).
I'll close this since I don't think I'm likely to get any more information here. Thank you for the assistance!
@andreivasiliu I see that you also left comments in #891. We would like to have more generic support for Rust applications, and your work at https://github.com/andreivasiliu/unit-rs is very interesting. Perhaps we can feature this at the next Unit Community Call and help make the Unit and Rust communities aware of this effort? What are your plans for it?
Hi @lcrilly, thank you for your interest. I'm currently not thinking of continuing my work there; I'll wait to see if the C API becomes fully documented, and may reconsider at that point. With that said, if you find someone else willing to work on it, or if you want ownership of the code I wrote, I'm definitely willing to help with tutoring, questions, and whatever else is needed.
Thanks for the update. Will leave this open to collect more input about Rust and the C API.
Hi! I'm trying to make some unofficial Rust bindings for libunit.a (see unit-rs), and I have some questions about the C API.

Right now my bindings are pretty bad. It seems there's several things I got wrong and I might have several misconceptions that are making my bindings a lot more restrictive than they should be. Still, I am really impressed with the technical aspects of Unit and how it works, and it's been really fun to work with it, and I'd like to rework my bindings to better match the API's capabilities.

If anyone answers any of these questions, I'd like to improve the descriptions in the nxt_unit.h header. Would a PR for that be accepted?

The questions:

Multi-threading and thread local storage

Assuming that contexts and requests are only accessed with a locked mutex, can Unit's C API functions be called from a different thread than the thread which created the context/request? In other words, do the context/request objects rely on thread-specific things like variables in thread-local-storage?

More specifically...

- nxt_unit_init() returns a context object that must then be destroyed with nxt_unit_done(). Can nxt_unit_done() be called from a different thread?
- nxt_unit_ctx_alloc() creates a secondary context based on the main context. Can nxt_unit_done() be called from a different thread than the one which created the context?
- Can nxt_unit_run() be called on a different thread than the one which created the context?
- The request_handler() callback will be called on the thread that runs nxt_unit_run(), and it will be given a request object. Can methods that use this request object (like nxt_unit_response_send, nxt_unit_response_buf_alloc, etc) be called on a different thread than the one which received the request object?
- If I get a request from nxt_unit_dequeue_request(), can I send that request to a different thread and call API functions on it there?

Request body streaming

- From my experiments, Unit supports a max of 8MB bodies, buffers the whole body, and then calls this data_handler() callback at most once. Is that correct, or should I expect it to be called multiple times for slow-writing clients?
- Also from my experiments, if data_handler() is to be called, then before that, in request_handler(), the nxt_unit_request_read() API always returns 0 bytes. Is that always the case? Does nxt_unit_request_read() always return all or nothing? Or can I expect partial results?
- I don't see blocking/non-blocking variants for nxt_unit_request_read(). Can I safely assume that nxt_unit_request_read() is always non-blocking?
- Is the NXT_UNIT_AGAIN error code related in any way to the above?
- Is the nxt_unit_app_test.c example incorrect for requests with large request bodies?

Clean shutdown

- Let's say a thread wants to quit (e.g. it experienced a fatal error). Is my only option to exit() the process? Is there any way to trigger a graceful shutdown of this process, so that all other threads can finish whatever request they are handling, and then be given a QUIT message?
- Also, what happens if nxt_unit_done() is called on the main context when there are still secondary contexts created from the main one? Will they cleanly shut down, or is this undefined behavior?
- Does the main context have to live for at least as long as the contexts spawned from it, or can it be done'd earlier?

Request response buffers

- Can nxt_unit_response_buf_alloc() be called multiple times before sending one of the buffers? In other words, can multiple buffers exist at the same time?
- Can I send response buffers in reverse order?
- What is nxt_unit_buf_next() for? Does its result affect nxt_unit_buf_send() in any way?
- Is it safe to call nxt_unit_request_done() on a request before sending or deallocating all of the buffers? If yes, will the buffers be automatically deallocated?
- Since there is a non-blocking version of nxt_unit_response_write(), then I assume nxt_unit_response_write() is the blocking variant. When this blocks, the entire thread will be unavailable to process other requests. Is this vulnerable to clients with slow reading, or will the Unit server accept and buffer the whole response even if the client doesn't read it?
- Does nxt_unit_buf_send() block? If yes, is it susceptible to slow-reading clients? Does it ever return NXT_UNIT_AGAIN?

Misc questions

- When is the close_handler() callback ever called? Is that only for websockets?
- How do nxt_unit_run(), nxt_unit_run_ctx(), and nxt_unit_run_shared() differ?
- If I call nxt_unit_malloc() on one context, can I call nxt_unit_free() on a different context?
- What is NXT_UNIT_AGAIN for, and what returns this? Can I return or send this myself from anywhere?