mDNS fails every time on asynchronous thread - timing / synchro bug? #3263
Comments
The problem you are having is mostly because you are trying to call synchronous, blocking functions from asynchronous callbacks. I understand your need to start MDNS from a callback, but the current code is not compatible with that: if you look at the source of MDNSResponder::queryService, you will see a
Thanks for the quick reply. It's disappointing, but I hear you. I had come close to that conclusion, but was looking for a) a solution or b) confirmation. I now have b) (and a headache). It might be helpful if the docs were updated to keep relative newcomers like me from taking up your time! I seriously appreciate all your work and that of the rest of the team. I'm back to the drawing board with nasty globals and polling loops on the main thread...sigh. Any plans for the service TXT records?
I've asked a few folks who are familiar with mDNS to review the changes in #3107. So far there haven't been any reviews, and I'm not following the mDNS library closely enough to understand the impact of these changes. If the PR works for you, that's good; hopefully you can use it in your work :)

Regarding 'nasty globals and polling loops': you may consider making the modification to the MDNS library, splitting

Regarding documentation: thanks, noted. There are a few notes here and there about the functions which cannot be called from asynchronous callbacks, but this specific case of 'mdns from WiFi events' isn't covered. Will add a note next time when working on the docs.
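For readers landing here with the same problem, the flag-and-poll workaround mentioned in this exchange can be sketched roughly as follows. This is only an illustration, not code from the ESP8266 core or from #3107: `DeferredTask`, `request()` and `poll()` are hypothetical names, and the sketch assumes the blocking call (for example the MDNS query) is moved into the work function.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper (not part of the ESP8266 core): lets an asynchronous
// callback ask for work that the main loop performs later, where blocking
// calls are allowed.
class DeferredTask {
public:
    explicit DeferredTask(std::function<void()> work) : work_(std::move(work)) {
        assert(static_cast<bool>(work_));  // an empty callable would throw in poll()
    }

    // Intended for the event callback: only records that work is wanted.
    void request() { pending_ = true; }

    // Intended for the main loop: runs the work once per request and
    // returns true if it ran.
    bool poll() {
        if (!pending_) return false;
        pending_ = false;  // clear first so the work may request itself again
        work_();
        return true;
    }

private:
    std::function<void()> work_;
    // volatile is a precaution in case the flag is set from another context.
    volatile bool pending_ = false;
};
```

In a sketch, `request()` would presumably be called from the got-IP handler (for instance one registered with `WiFi.onStationModeGotIP`) and `poll()` from `loop()`, so the handler returns immediately and the blocking query runs in the normal sketch context.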
I have another bug to fix in my PR #3107, so I might throw in a version of queryService() that doesn't have the delay(). I will still, however, block until the UDP multicast packets are sent, but I don't think that should be a problem. This leads to the question "Well, how do I get responses?". I've already implemented "setAnswerCallback()" that will invoke the callback function for every matching response. Please note that this callback also happens asynchronously (from the TCP/IP stack, I think) so the same limitations about delay() apply. FYI, when using my own code (with setAnswerCallback()) I'm seeing fairly frequent Exception 28 failures...I'm still trying to debug those, so there may be further changes to my pull request as I work out any bugs.
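Since the answer callback is also asynchronous, one way to live with the delay() restriction is to have the callback only copy each answer into a small queue and let the main loop process it later. The sketch below is an assumption-laden illustration: `ServiceAnswer` and `AnswerQueue` are hypothetical, and the actual arguments passed by setAnswerCallback() in #3107 are not shown in this thread and may differ.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical record for one mDNS answer; the real callback in #3107
// may deliver different fields.
struct ServiceAnswer {
    std::string host;
    unsigned port;
};

// Fixed-capacity FIFO: the asynchronous callback pushes, the main loop pops.
template <std::size_t N>
class AnswerQueue {
public:
    // Intended for the callback: copies the answer and returns at once.
    // Returns false (the answer is dropped) when the queue is full.
    bool push(const ServiceAnswer& answer) {
        if (count_ == N) return false;
        items_[(head_ + count_) % N] = answer;
        ++count_;
        return true;
    }

    // Intended for the main loop: removes the oldest answer into `out`.
    // Returns false when there is nothing to process.
    bool pop(ServiceAnswer& out) {
        if (count_ == 0) return false;
        out = items_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

private:
    std::array<ServiceAnswer, N> items_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

The callback would then do nothing but `push()`, while `loop()` drains the queue with `pop()` and is free to call delay() or other blocking functions while handling each answer.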
Basic Infos
Hardware
Hardware: Wemos D1 Mini
Core Version: ?2.1.0-rc2?
Description
mDNS fails every time on asynchronous thread - timing / synchro bug?
Settings in IDE
Module: NodeMCU 0.9 (ESP12 Module)
(same behaviour with WEMOS D1 Mini)
Flash Size: 4MB/1MB
CPU Frequency: 80MHz
Upload Using: SERIAL
Sketch
Debug Messages
At first glance it looks like a really BAD way to do it - waiting for some arbitrary period and simply hoping that there is a connection - and I agree. I would never write such bad code except to demonstrate the problem:
The debug output shows that below 824ms, the "obvious" happens and we don't yet have an IP before we call MDNS and it is no surprise that it fails! It's not till T0+2283 that we get one - which seems like too long to me, but maybe MDNS "Waiting for answers" is somehow blocking completion of the autoconnect?
Let's try to increase the delay by 1ms and see what happens:
Success! At the "magic" T+823, we get an IP, and MDNS goes on to do exactly what was expected.
So we now have a way to make our code work: wait until we have an IP of course. I don't like "blocking" code, so let's do it the "proper" way and use the event handling code.
I have commented out "Section 2" and uncommented "Section 1" so we only call MDNS once we are absolutely certain we have an IP address. Then the machine state will be identical to the working ("T+824") situation and MDNS must work!
WRONG!!!
We get the IP at T+823, well before we call MDNS at T0+834, but for some reason, it fails.
If we compare the two Diag prints, the "synchronous+delay" success at T0+824 and the failing asynchronous call at T0+834, it is clear they are identical, so why does the asynchronous one fail?
What is different "behind the scenes"? What is different in the "setup thread" and the "ticker thread"?
I modified the code to put the delay call in "getMDNS" itself. The synchronous call works as before (why wouldn't it?) but the async version fries my brain!
Gives:
This is truly bizarre! The total code I call in "setup" now is:
So how can it be me that is delaying the "got IP" event? I don't even call asyncMDNS (which outputs "what does WL_CONNECTED mean?" at T+2833) until AFTER the wifiConnectHandler call at T+2823!
It's either a negative time coefficient in the flux capacitor or Serial.print is being "stacked up" or unserialised. Is this the same process that makes the delay(FAILURE_LIMIT) last less than 1ms? I really need help on this.
I can't use a "workaround" - the whole of my project relies on the auto connect / asynchronous "got ip" event combination. I have tried about 10000 different permutations but the MDNS call just will not work if it's not in the "setup" thread, no matter how much I try to delay or synchronise.
Help, please!
On a separate note, the project also MUST have service TXT records for auto-discovery. I found a fork by mblythe86 which includes them, and it works perfectly - when MDNS itself works! When will this be in the production version?
Thanks,
Phil