-
Notifications
You must be signed in to change notification settings - Fork 13.3k
WiFiClientSecure::available() blocks on dropped connection when TX buffer full #6434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks for the detailed description and logs and the proposed fix. Does it also fail with John Donne, or is this a Shakespeare-only issue? :) Why not submit a PR and get your name in the git logs? You've got a failing test and a proposed solution ready to go, which is really all you need! |
Thanks, I will experiment some more with it over the weekend and submit a
PR if happy.
…On Fri., 23 Aug. 2019, 2:13 am Earle F. Philhower, III, < ***@***.***> wrote:
Thanks for the detailed description and logs and the proposed fix. Does it
also fail with John Donne, or is this a Shakespeare-only issue? :)
Why not submit a PR and get your name in the git logs? You've got a
failing test and a proposed solution ready to go, which is really all you
need!
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#6434?email_source=notifications&email_token=AMNOZ32OB7BZTLE7SF6ZQK3QF23MFA5CNFSM4IOV5O42YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD45TEMQ#issuecomment-523973170>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AMNOZ34JBCOD356SQSGH32TQF23MFANCNFSM4IOV5O4Q>
.
|
I have tested this some more and whipped up a simpler MCVE that does not require the MQTT client, just ugly HTTP String hacking. To reproduce it, you have to disconnect your internet in the first couple of minutes, because the postman echo server will close the connection after a couple of posts. Once the WiFiClient tx buffer fills, available() calls will block for 5 seconds. The 15 second block I was seeing in the previous example occoured because the MQTT client was calling connected() first, which calls available(), then calling available() twice. The below example is a better test, but in practice the if connected() while available() pattern is common and results in long bocks. #include <ESP8266WiFi.h>
#include <WiFiClientSecure.h>
const char* host = "postman-echo.com";
int port = 443; //https
WiFiClientSecure wifiClient;
unsigned long lastSend;
unsigned long sendInterval = 60000L;
void setup() {
wifiClient.setInsecure(); // disable validation of server certificate
Serial.begin(115200);
delay(1000);
WiFi.begin(/*ssid, password if needed */); //Connect to your WiFi router
while (WiFi.status() != WL_CONNECTED) {
delay(500);
Serial.print(".");
}
Serial.println("wifi connected");
if (!wifiClient.connect(host, port)) {
Serial.println("initial connection failed, restart to try again");
while (1) delay(1);
}
Serial.println("Secure socket connected to server. Disconnect internet without disconnecting wifi to demonstrate bad behaviour");
lastSend = millis();
}
void loop() {
if (millis() - lastSend > sendInterval) {
lastSend = millis();
String msg = "POST /post HTTP/1.1\r\n";
msg += "Host: " + String(host) + "\r\n";
msg += "Content-Type: application/x-www-form-urlencoded\r\n";
msg += "Content-Length: 121\r\n\r\n";
msg += "thequickbrownfoxjumpsoverthelazydogfjaoranckaeioavrjkanvjknvaeruivnaivnrakjnvkanvkanuarwefjrieavnkjanrlsnbiakjncoanranven\r\n";
msg += "Connection: keep-alive\r\n\r\n";
int afw = wifiClient.availableForWrite();
Serial.print("Sending a message, unencrypted message length: "); Serial.println(msg.length());
Serial.print("Bytes available for write in WiFiClient tx buffer: "); Serial.println(afw);
Serial.println(msg);
int msgLength = msg.length();
int written = wifiClient.print(msg);
Serial.print("request sent, bytes accepted by WiFiClientSecure: "); Serial.println(written);
Serial.println("\n\n");
}
unsigned long beforeAvailableCall = millis();
int avail = wifiClient.available();
unsigned long elapsed = millis() - beforeAvailableCall;
if (elapsed > 10) {
Serial.print("available() blocked for "); Serial.print(elapsed); Serial.println(" millis!");
}
for (int i = 0; i < avail; i++) {
byte b = wifiClient.read();
Serial.print( (char)b );
}
if (avail) Serial.println();
delay(1000);
}
Getting my head around how the _write and available methods work took quite a bit of head scratching, so I have brain-dumped my understanding below before I forgot. The WiFiClientSecure::_write loop first does a _run_until(BR_SSL_SENDAPP). Usually this will return success instantly. But if there is unsent record (encrypted) data hanging around from last time (state & BR_SSL_SENDREC == true), WiFiClient::write is called. If this data can not fit in the WiFiClient tx buffer, then it will block until space is available or timeout occours (default 5sec). If some write activity occurs and there is still space in the bearssl sendapp buffer (state & BR_SSL_SENDAPP == true) then _run_until returns success; if WiFiClient::write returns 0 _run_unit returns -1 and the _write loop is broken, signaling a write failure. WiFiClientSecure::_write then feeds the user's data into the bearssl sendapp buffer (as much as will fit), then calls br_ssl_engine_flush to force the engine to generate an encrypted record. Now state & BR_SSL_SENDREC == true but the data has not yet been written to the WiFiClient. So _write then calls flush(), which calls WiFiClientSecure::flush(0), which does a _run_until(BR_SSL_SENDAPP), causing the record data to be written to WiFiClient. WiFiClientSecure::flush then flushes the WiFiClient with the WiFiClient default timeout (300ms) WiFiClientSecure::_write does not really care what happens in the flush() method; it returns the number of bytes transfered to bearssl sendapp buffer. It only returns a fail (i.e. 0 or less than it was asked to send) when the leftover data from the previous write attempt or loop iteration can not be sent to the WiFiClient. This is fine but means that writes on dropped connections will not fail until two _write attempts are made on the full-up WiFiClient, or the sendapp buff fills, whichever occurs first. It might be more elegant to avoid the blocking-but-apparently-successful write, but that's not simple. WiFiClientSecure::available() calls _run_until(BR_SSL_RECVAPP, false), and this also triggers a blocking write attempt if there is still record data hanging around from previous write attempts (i.e. state & BR_SSL_SENDREC == true). The fix is to avoid writing to the WiFiClient more data than will fit in its tx buffer when the blocking flag == false; I will submit a PR with this change. *The 15sec block occours because WiFiClientSecure::_write has a call to connected() at the top, which in turn calls available(). |
Closing via #6449. |
Basic Infos
Platform
Settings in IDE
Problem Description
I've come across some unwanted behaviour in the WiFiClientSecure::available() and _run_until methods. This becomes apparent when sending MQTT messages of a few hundred bytes at relatively slow intervals (minutes) without regular keep-alive pings, and the connection drops somewhere between the access point and the server (i.e. wifi stays up but internet goes down).
The issue is as follows: when messages are sent but not ACK-ed, the TCP buffers begin to fill. When there is no longer enough space for a complete message, the write attempt will block for 10 seconds, but still report success. No problem yet, because it remains buffered.
At this point, subsequent calls to WiFiClientSecure::available() will block for 5-10 seconds, because the available() method calls _run_until, which then attempts to re-write the bytes remaining in the bearssl engine's buffer to the underlying WiFiClient. This will occour endlessly until a further write attempt is made from the application via WiFiClientSecure::write; at this point the connection will be closed after write timeout.
Example and debug log below. I am using the new ArduinoMqttClient, which I much prefer over the old PubSubClient, but this issue will occour with any WiFiClientSecure connection
MCVE Sketch
Debug Messages
I have patched the code as below, which seems to work, but I don't know this code well enough to know if it will cause other problems:
The text was updated successfully, but these errors were encountered: