Skip to content

WiFiClientSecure::available() blocks on dropped connection when TX buffer full #6434

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
6 tasks
johnm545 opened this issue Aug 22, 2019 · 4 comments
Closed
6 tasks

Comments

@johnm545
Copy link
Contributor

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: ESP-12E
  • Core Version: 2.5.2
  • Development Env: Arduino IDE
  • Operating System: Windows

Settings in IDE

  • Module: NodeMCU
  • Flash Mode: [qio|dio|other]
  • Flash Size: [4MB/1MB]
  • lwip Variant: v2 Lower Memory
  • Reset Method: nodemcu
  • Flash Frequency: [40Mhz]
  • CPU Frequency: 80Mhz
  • Upload Using: SERIAL
  • Upload Speed: 115200

Problem Description

I've come across some unwanted behaviour in the WiFiClientSecure::available() and _run_until methods. This becomes apparent when sending MQTT messages of a few hundred bytes at relatively slow intervals (minutes) without regular keep-alive pings, and the connection drops somewhere between the access point and the server (i.e. wifi stays up but internet goes down).

The issue is as follows: when messages are sent but not ACK-ed, the TCP buffers begin to fill. When there is no longer enough space for a complete message, the write attempt will block for 10 seconds, but still report success. No problem yet, because it remains buffered.

At this point, subsequent calls to WiFiClientSecure::available() will block for 5-10 seconds, because the available() method calls _run_until, which then attempts to re-write the bytes remaining in the bearssl engine's buffer to the underlying WiFiClient. This will occour endlessly until a further write attempt is made from the application via WiFiClientSecure::write; at this point the connection will be closed after write timeout.

Example and debug log below. I am using the new ArduinoMqttClient, which I much prefer over the old PubSubClient, but this issue will occour with any WiFiClientSecure connection

MCVE Sketch

#include <ESP8266WiFi.h>
#include <ArduinoMqttClient.h>

const char* msg = R"(To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
And by opposing end them. To die—to sleep,
No more; and by a sleep to say we end
The heart-ache and the thousand natural shocks
That flesh is heir to: 'tis a consummation
Devoutly to be wish'd. To die, to sleep;
To sleep, perchance to dream—ay, there's the rub:
For in that sleep of death what dreams may come,
When we have shuffled off this mortal coil,
Must give us pause—there's the respect
That makes calamity of so long life)";

WiFiClientSecure wifiClient;
MqttClient mqttClient(wifiClient);

const char broker[] = "test.mosquitto.org";
int        port     = 8883;
const char topic[]  = "arduino/simple";

const long interval = 60000L; // send every minute
unsigned long previousMillis = 0;

unsigned long loopStartMillis = 0;

int count = 0;

void setup() {
  Serial.begin(115200);
  delay(1000);

  wifiClient.setInsecure(); // disable validation of server certificate
  mqttClient.setKeepAliveInterval(10*60*1000L); // long keepalive interval to simplify this example

  while (WiFi.begin(/*ssid, pass if needed*/) != WL_CONNECTED) {
    Serial.print(".");
    delay(5000);
  }

  Serial.println("Wifi connected");
  Serial.println();

  Serial.print("Attempting to connect to the MQTT broker: ");
  Serial.println(broker);

  if (!mqttClient.connect(broker, port)) {
    Serial.print("Reboot me, MQTT connection failed! Error code = ");
    Serial.println(mqttClient.connectError());
    while(1) delay(1);
  }

  Serial.println("Connected to the MQTT broker. To trigger bad behaviour, disconnect internet without disconnecting wifi");
  Serial.println();
}

void loop() {
  loopStartMillis = millis();

  // poll checks for incoming messages via client.available(), which in this case resolves to WiFiClientSecure::available()
  mqttClient.poll();
  unsigned long pollTimeTaken = millis() - loopStartMillis;
  if (pollTimeTaken > 1) {
    Serial.print("loop() was blocked for ");
    Serial.print(pollTimeTaken);
    Serial.println(" millis due to WiFiClientSecure::available() calls!");
  }

  unsigned long currentMillis = millis();

  if (currentMillis - previousMillis >= interval) {
    // save the last time a message was sent
    previousMillis = currentMillis;

    int afw = wifiClient.availableForWrite();
    Serial.print("wificlient tx buffer bytes available: ");
    Serial.println(afw);

    Serial.print("Sending message ");
    Serial.println(count);

    Serial.println(msg);

    // send message at QOS 0
    mqttClient.beginMessage(topic);
    mqttClient.print(msg);
    mqttClient.print(count);
    bool result = mqttClient.endMessage();

    if(result) {
      Serial.println("TCP stack accepted the message");
    } else {
      Serial.println("Message not accepted");
    }
    Serial.println();

    count++;
  }
}

Debug Messages

23:51:48.377 -> SDK:2.2.2-dev(38a443e)/Core:2.5.2-108-ge77f96c3=20502108/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-8-g2314329/BearSSL:89454af
23:51:48.377 -> wifi evt: 2
23:51:48.470 -> scandone
23:51:48.470 -> state: 0 -> 2 (b0)
23:51:48.470 -> state: 2 -> 3 (0)
23:51:48.470 -> state: 3 -> 5 (10)
23:51:48.517 -> add 0
23:51:48.517 -> aid 4
23:51:48.517 -> cnt 
23:51:48.564 -> 
23:51:48.564 -> connected with <redacted>
23:51:48.564 -> dhcp client start...
23:51:48.564 -> wifi evt: 0
23:51:49.361 -> .scandone
23:51:50.673 -> ip:192.168.20.11,mask:255.255.255.0,gw:192.168.20.1
23:51:50.673 -> wifi evt: 3
23:51:54.377 -> .scandone
23:51:58.502 -> pm open,type:2 0
23:51:59.392 -> Wifi connected
23:51:59.392 -> 
23:51:59.392 -> Attempting to connect to the MQTT broker: test.mosquitto.org
23:51:59.392 -> [hostByName] request IP for: test.mosquitto.org
23:51:59.392 -> sl
23:51:59.486 -> scandone
23:51:59.486 -> usl
23:51:59.674 -> [hostByName] Host: test.mosquitto.org IP: 5.196.95.208
23:51:59.674 -> :ref 1
23:52:00.424 -> BSSL:_connectSSL: start connection
23:52:00.471 -> :wr 223 0
23:52:00.471 -> :wrc 223 223 0
23:52:00.939 -> :ack 223
23:52:00.939 -> :rn 536
23:52:00.939 -> :rd 5, 536, 0
23:52:00.939 -> :rdi 536, 5
23:52:00.939 -> :rch 536, 536
23:52:00.939 -> :rd 94, 1072, 5
23:52:00.939 -> :rdi 531, 94
23:52:00.939 -> :rd 5, 1072, 99
23:52:00.939 -> :rdi 437, 5
23:52:00.939 -> :rd 512, 1072, 104
23:52:00.939 -> :rdi 432, 432
23:52:00.939 -> :c 432, 536, 1072
23:52:00.939 -> :rdi 536, 80
23:52:00.939 -> BSSL:CERT: 30 82 04 84 30 82 03 ed a0 03 02 01 02 02 09 00 d9 97 24 19 9d 9c 4d f2 30 0d 06 09 2a 86 48 86 f7 0d 01 01 0b 05 00 30 81 90 31 0b 30 09 06 03 55 04 06 13 02 47 42 31 17 30 15 06 03 55 04 08 0c 0e 55 6e 69 74 65 64 20 4b 69 6e 67 64 6f 6d 31 0e 30 0c 06 03 55 04 07 0c 05 44 65 72 62 79 31 12 30 10 06 03 55 04 0a 0c 09 4d 6f 73 71 75 69 74 74 6f 31 0b 30 09 06 03 55 04 0b 0c 02 43 41 31 16 30 14 06 03 55 04 03 0c 0d 6d 6f 73 71 75 69 74 74 6f 2e 6f 72 67 31 1f 30 1d 06 09 2a 86 48 86 f7 0d 01 09 01 16 10 72 6f 67 65 72 40 61 74 63 68 6f 6f 2e 6f 72 67 30 1e 17 0d 31 39 30 33 32 30 31 34 32 31 33 39 5a 17 0d 33 32 30 33 31 36 31 34 32 31 33 39 5a 30 7f 31 0b 30 09 06 03 55 04 06 13 02 47 42 31 17 30 15 06 03 55 04 08 0c 0e 55 6e 69 74 65 64 20 4b 69 6e 67 64 
23:52:01.033 -> :rd 5, 536, 80
23:52:01.033 -> :rdi 456, 5
23:52:01.033 -> :rd 451, 536, 85
23:52:01.033 -> :rdi 451, 451
23:52:01.033 -> :c0 451, 536
23:52:01.033 -> BSSL:CERT: 6f 6d 31 0e 30 0c 06 03 55 04 07 0c 05 44 65 72 62 79 31 12 30 10 06 03 55 04 0a 0c 09 4d 6f 73 71 75 69 74 74 6f 31 16 30 14 06 03 55 04 0b 0c 0d 50 75 62 6c 69 63 20 73 65 72 76 65 72 31 1b 30 19 06 03 55 04 03 0c 12 74 65 73 74 2e 6d 6f 73 71 75 69 74 74 6f 2e 6f 72 67 30 82 01 22 30 0d 06 09 2a 86 48 86 f7 0d 01 01 01 05 00 03 82 01 0f 00 30 82 01 0a 02 82 01 01 00 bd b0 f3 65 e9 50 31 a1 3c 48 6d c8 ea c0 13 80 45 56 48 36 28 53 93 27 dd 12 45 76 16 7a 10 92 0c 1b 04 28 32 2c ad 1f 47 44 97 21 f9 99 ea a0 f6 e2 e0 04 1c 9c 5d d9 e6 bb 47 0b 37 8e 28 44 66 9d b1 a8 ef 61 8d 9d d8 cc d9 af 73 70 66 b6 8e 0d a4 4a 18 ba b7 3b 8b 09 a7 48 50 34 96 e2 d7 66 d8 62 f4 02 a7 76 33 da d0 5f 79 b3 74 a8 e1 71 be 55 53 05 1e c5 75 91 10 7c a3 b2 69 76 76 d2 e4 be 
23:52:01.127 -> BSSL:CERT: a7 87 e1 d5 c2 5a 5d 4f 32 1a 31 8e a5 74 03 60 cd 08 24 ba b0 39 ca 48 80 e2 aa 99 c5 7c a7 03 4b cd bb 71 62 6c f1 43 90 2e 3f 3a bb 88 f9 ca ad 5c 28 5a f4 8f dc be 44 a3 ce 80 76 9c 95 96 1f bb c9 c8 32 08 13 76 c8 86 0f 77 c9 5e ad 1a 89 33 72 63 84 8e cb 3e 71 7b af d7 04 ee 46 02 93 1d 77 be 2c c1 75 0c 95 9f 76 e1 70 0f e2 ff 8f 36 c6 4c 7f 55 04 7c d2 8c ba c9 02 03 01 00 01 a3 82 01 70 30 82 01 6c 30 0c 06 03 55 1d 13 01 01 ff 04 02 30 00 30 11 06 09 60 86 48 01 86 f8 42 01 01 04 04 03 02 06 40 30 0b 06 03 55 1d 0f 04 04 03 02 05 e0 30 21 06 09 60 86 48 01 86 f8 42 01 0d 04 14 16 12 42 72 6f 6b 65 72 20 43 65 72 74 69 66 69 63 61 74 65 30 1d 06 03 55 1d 0e 04 16 04 14 82 fb 32 4b 74 fd e4 fc 23 b1 19 87 70 52 db e2 ca 81 ae ff 30 81 c5 06 03 55 1d 
23:52:01.361 -> :rn 1072
23:52:01.361 -> :rd 61, 1072, 0
23:52:01.361 -> :rdi 536, 61
23:52:01.361 -> :rd 5, 1072, 61
23:52:01.361 -> :rdi 475, 5
23:52:01.361 -> :rd 512, 1072, 66
23:52:01.361 -> :rdi 470, 470
23:52:01.361 -> :c 470, 536, 1072
23:52:01.361 -> :rdi 536, 42
23:52:01.361 -> BSSL:CERT: 23 04 81 bd 30 81 ba 80 14 da 77 64 27 79 5c ed 20 f4 33 45 11 a3 e9 91 ca a8 94 ef e6 a1 81 96 a4 81 93 30 81 90 31 0b 30 09 06 03 55 04 06 13 02 47 42 31 17 30 15 06 03 55 04 08 0c 0e 55 6e 69 74 65 64 20 4b 69 6e 67 64 6f 6d 31 0e 30 0c 06 03 55 04 07 0c 05 44 65 72 62 79 31 12 30 10 06 03 55 04 0a 0c 09 4d 6f 73 71 75 69 74 74 6f 31 0b 30 09 06 03 55 04 0b 0c 02 43 41 31 16 30 14 06 03 55 04 03 0c 0d 6d 6f 73 71 75 69 74 74 6f 2e 6f 72 67 31 1f 30 1d 06 09 2a 86 48 86 f7 0d 01 09 01 16 10 72 6f 67 65 72 40 61 74 63 68 6f 6f 2e 6f 72 67 82 09 00 e0 fa dc f9 57 8c 98 bc 30 32 06 03 55 1d 11 04 2b 30 29 82 12 74 65 73 74 2e 6d 6f 73 71 75 69 74 74 6f 2e 6f 72 67 82 13 74 65 73 74 36 2e 6d 6f 73 71 75 69 74 74 6f 2e 6f 72 67 30 0d 06 09 2a 86 48 86 f7 0d 01 
23:52:01.455 -> BSSL:CERT: 01 0b 05 00 03 81 81 00 3b bd 66 e8 9e d8 ec 34 3f c1 ab 46 e9 5e 21 be 38 7a 8f fa 7e b1 93 b9 0b f8 12 ee f2 05 b3 72 8e 09 1f b9 40 a0 b7 49 2b 41 ae 42 9d 7f 3f b6 37 0e 04 ef d9 f6 40 08 7f f0 6b d0 2f a9 3c 71 2b 7e 7b b6 2e 5f 7d 1c dc 24 9e 37 7a 7c f0 8a e4 8d 58 4e 61 b6 ab b6 2b 44 25 fa 75 ee ba e9 68 d7 d8 8d 43 90 c2 c5 b2 d5 ac 27 47 72 13 e7 96 a2 63 c7 89 e6 ea e4 83 a5 13 69 00 1f 2d 97 
23:52:01.455 -> :rd 5, 536, 42
23:52:01.455 -> :rdi 494, 5
23:52:01.455 -> :rd 393, 536, 47
23:52:01.455 -> :rdi 489, 393
23:52:01.502 -> :rd 5, 536, 440
23:52:01.502 -> :rdi 96, 5
23:52:01.502 -> :rd 91, 536, 445
23:52:01.502 -> :rdi 91, 91
23:52:01.502 -> :c0 91, 536
23:52:04.502 -> :rn 218
23:52:04.502 -> :rd 209, 218, 0
23:52:04.502 -> :rdi 218, 209
23:52:04.689 -> :rd 5, 218, 209
23:52:04.689 -> :rdi 9, 5
23:52:04.689 -> :rd 4, 218, 214
23:52:04.689 -> :rdi 4, 4
23:52:04.689 -> :c0 4, 218
23:52:04.971 -> :wr 42 0
23:52:04.971 -> :wrc 42 42 0
23:52:04.971 -> :wr 6 0
23:52:04.971 -> :wrc 6 6 0
23:52:04.971 -> :wr 45 0
23:52:04.971 -> :wrc 45 45 0
23:52:05.533 -> :ack 42
23:52:06.049 -> :ack 51
23:52:06.049 -> :rn 51
23:52:06.049 -> :rd 5, 51, 0
23:52:06.049 -> :rdi 51, 5
23:52:06.049 -> :rd 1, 51, 5
23:52:06.049 -> :rdi 46, 1
23:52:06.049 -> :rd 5, 51, 6
23:52:06.049 -> :rdi 45, 5
23:52:06.049 -> :rd 40, 51, 11
23:52:06.049 -> :rdi 40, 40
23:52:06.049 -> :c0 40, 51
23:52:06.049 -> BSSL:Connected!
23:52:06.049 -> :wr 59 0
23:52:06.049 -> :wrc 59 59 0
23:52:06.377 -> :wustmo
23:52:06.564 -> :ack 59
23:52:06.564 -> :rn 33
23:52:06.564 -> :rd 5, 33, 0
23:52:06.564 -> :rdi 33, 5
23:52:06.564 -> :rd 28, 33, 5
23:52:06.564 -> :rdi 28, 28
23:52:06.564 -> :c0 28, 33
23:52:06.564 -> Connected to the MQTT broker. To trigger bad behaviour, disconnect internet without disconnecting wifi
23:52:06.564 -> 
23:52:48.298 -> wificlient tx buffer bytes available: 1072
23:52:48.298 -> Sending message 0
23:52:48.298 -> To be, or not to be, that is the question:
23:52:48.298 -> Whether 'tis nobler in the mind to suffer
23:52:48.298 -> The slings and arrows of outrageous fortune,
23:52:48.345 -> Or to take arms against a sea of troubles
23:52:48.345 -> And by opposing end them. To die—to sleep,
23:52:48.345 -> No more; and by a sleep to say we end
23:52:48.345 -> The heart-ache and the thousand natural shocks
23:52:48.345 -> That flesh is heir to: 'tis a consummation
23:52:48.345 -> Devoutly to be wish'd. To die, to sleep;
23:52:48.345 -> To sleep, perchance to dream—ay, there's the rub:
23:52:48.345 -> For in that sleep of death what dreams may come,
23:52:48.345 -> When we have shuffled off this mortal coil,
23:52:48.345 -> Must give us pause—there's the respect
23:52:48.345 -> That makes calamity of so long life
23:52:48.345 -> :wr 48 0
23:52:48.345 -> :wrc 48 48 0
23:52:48.673 -> :wustmo
23:52:48.673 -> :wr 285 0
23:52:48.673 -> :wrc 285 285 0
23:52:48.954 -> :wustmo
23:52:48.954 -> TCP stack accepted the message
23:52:49.001 -> 
23:52:49.048 -> :ack 48
23:52:49.563 -> :ack 285
23:53:48.319 -> wificlient tx buffer bytes available: 1072
23:53:48.319 -> Sending message 1
23:53:48.319 -> To be, or not to be, that is the question:
23:53:48.319 -> Whether 'tis nobler in the mind to suffer
23:53:48.319 -> The slings and arrows of outrageous fortune,
23:53:48.319 -> Or to take arms against a sea of troubles
23:53:48.319 -> And by opposing end them. To die—to sleep,
23:53:48.319 -> No more; and by a sleep to say we end
23:53:48.319 -> The heart-ache and the thousand natural shocks
23:53:48.319 -> That flesh is heir to: 'tis a consummation
23:53:48.366 -> Devoutly to be wish'd. To die, to sleep;
23:53:48.366 -> To sleep, perchance to dream—ay, there's the rub:
23:53:48.366 -> For in that sleep of death what dreams may come,
23:53:48.366 -> When we have shuffled off this mortal coil,
23:53:48.366 -> Must give us pause—there's the respect
23:53:48.366 -> That makes calamity of so long life
23:53:48.366 -> :wr 48 0
23:53:48.366 -> :wrc 48 48 0
23:53:48.647 -> :wustmo
23:53:48.694 -> :wr 285 0
23:53:48.694 -> :wrc 285 285 0
23:53:48.975 -> :wustmo
23:53:48.975 -> TCP stack accepted the message
23:53:48.975 -> 
23:54:48.296 -> wificlient tx buffer bytes available: 739
23:54:48.296 -> Sending message 2
23:54:48.296 -> To be, or not to be, that is the question:
23:54:48.296 -> Whether 'tis nobler in the mind to suffer
23:54:48.342 -> The slings and arrows of outrageous fortune,
23:54:48.342 -> Or to take arms against a sea of troubles
23:54:48.342 -> And by opposing end them. To die—to sleep,
23:54:48.342 -> No more; and by a sleep to say we end
23:54:48.342 -> The heart-ache and the thousand natural shocks
23:54:48.342 -> That flesh is heir to: 'tis a consummation
23:54:48.342 -> Devoutly to be wish'd. To die, to sleep;
23:54:48.342 -> To sleep, perchance to dream—ay, there's the rub:
23:54:48.342 -> For in that sleep of death what dreams may come,
23:54:48.342 -> When we have shuffled off this mortal coil,
23:54:48.342 -> Must give us pause—there's the respect
23:54:48.342 -> That makes calamity of so long life
23:54:48.389 -> :wr 48 0
23:54:48.389 -> :wrc 48 48 0
23:54:48.671 -> :wustmo
23:54:48.671 -> :wr 285 0
23:54:48.671 -> :wrc 285 285 0
23:54:48.999 -> :wustmo
23:54:48.999 -> TCP stack accepted the message
23:54:48.999 -> 
23:55:48.325 -> wificlient tx buffer bytes available: 406
23:55:48.325 -> Sending message 3
23:55:48.325 -> To be, or not to be, that is the question:
23:55:48.325 -> Whether 'tis nobler in the mind to suffer
23:55:48.325 -> The slings and arrows of outrageous fortune,
23:55:48.325 -> Or to take arms against a sea of troubles
23:55:48.325 -> And by opposing end them. To die—to sleep,
23:55:48.325 -> No more; and by a sleep to say we end
23:55:48.325 -> The heart-ache and the thousand natural shocks
23:55:48.325 -> That flesh is heir to: 'tis a consummation
23:55:48.325 -> Devoutly to be wish'd. To die, to sleep;
23:55:48.325 -> To sleep, perchance to dream—ay, there's the rub:
23:55:48.372 -> For in that sleep of death what dreams may come,
23:55:48.372 -> When we have shuffled off this mortal coil,
23:55:48.372 -> Must give us pause—there's the respect
23:55:48.372 -> That makes calamity of so long life
23:55:48.372 -> :wr 48 0
23:55:48.372 -> :wrc 48 48 0
23:55:48.653 -> :wustmo
23:55:48.653 -> :wr 285 0
23:55:48.653 -> :wrc 285 285 0
23:55:48.981 -> :wustmo
23:55:48.981 -> TCP stack accepted the message
23:55:48.981 -> 
23:56:48.313 -> wificlient tx buffer bytes available: 73
23:56:48.313 -> Sending message 4
23:56:48.313 -> To be, or not to be, that is the question:
23:56:48.313 -> Whether 'tis nobler in the mind to suffer
23:56:48.313 -> The slings and arrows of outrageous fortune,
23:56:48.313 -> Or to take arms against a sea of troubles
23:56:48.313 -> And by opposing end them. To die—to sleep,
23:56:48.313 -> No more; and by a sleep to say we end
23:56:48.360 -> The heart-ache and the thousand natural shocks
23:56:48.360 -> That flesh is heir to: 'tis a consummation
23:56:48.360 -> Devoutly to be wish'd. To die, to sleep;
23:56:48.360 -> To sleep, perchance to dream—ay, there's the rub:
23:56:48.360 -> For in that sleep of death what dreams may come,
23:56:48.360 -> When we have shuffled off this mortal coil,
23:56:48.360 -> Must give us pause—there's the respect
23:56:48.360 -> That makes calamity of so long life
23:56:48.360 -> :wr 48 0
23:56:48.360 -> :wrc 48 48 0
23:56:48.688 -> :wustmo
23:56:48.688 -> :wr 285 0
23:56:48.688 -> :wrc 25 285 0
23:56:48.688 -> :wr 260 25
23:56:49.204 -> :wr 260 25
23:56:49.691 -> :wr 260 25
23:56:50.205 -> :wr 260 25
23:56:50.721 -> :wr 260 25
23:56:51.236 -> :wr 260 25
23:56:51.752 -> :wr 260 25
23:56:52.268 -> :wr 260 25
23:56:52.783 -> :wr 260 25
23:56:53.299 -> :wr 260 25
23:56:53.814 -> :wr 260 25
23:56:53.814 -> :wtmo
23:56:53.814 -> :wr 260 0
23:56:54.330 -> :wr 260 0
23:56:54.846 -> :wr 260 0
23:56:55.361 -> :wr 260 0
23:56:55.877 -> :wr 260 0
23:56:56.393 -> :wr 260 0
23:56:56.861 -> :wr 260 0
23:56:57.377 -> :wr 260 0
23:56:57.893 -> :wr 260 0
23:56:58.408 -> :wr 260 0
23:56:58.924 -> :wr 260 0
23:56:58.924 -> :wtmo
23:56:59.205 -> :wustmo
23:56:59.205 -> TCP stack accepted the message
23:56:59.205 -> 
23:56:59.205 -> :wr 260 0
23:56:59.439 -> :wr 260 0
23:57:00.002 -> :wr 260 0
23:57:00.518 -> :wr 260 0
23:57:01.033 -> :wr 260 0
23:57:01.549 -> :wr 260 0
23:57:02.064 -> :wr 260 0
23:57:02.580 -> :wr 260 0
23:57:03.143 -> :wr 260 0
23:57:03.658 -> :wr 260 0
23:57:04.174 -> :wr 260 0
23:57:04.689 -> :wr 260 0
23:57:04.689 -> :wtmo
23:57:04.689 -> :wr 260 0
23:57:05.205 -> :wr 260 0
23:57:05.768 -> :wr 260 0
23:57:06.283 -> :wr 260 0
23:57:06.799 -> :wr 260 0
23:57:07.314 -> :wr 260 0
23:57:07.830 -> :wr 260 0
23:57:08.346 -> :wr 260 0
23:57:08.814 -> :wr 260 0
23:57:09.330 -> :wr 260 0
23:57:09.892 -> :wr 260 0
23:57:09.892 -> :wtmo
23:57:09.892 -> :wr 260 0
23:57:10.361 -> :wr 260 0
23:57:10.877 -> :wr 260 0
23:57:11.393 -> :wr 260 0
23:57:11.861 -> :wr 260 0
23:57:12.377 -> :wr 260 0
23:57:12.893 -> :wr 260 0
23:57:13.361 -> :wr 260 0
23:57:13.877 -> :wr 260 0
23:57:14.393 -> :wr 260 0
23:57:14.908 -> :wr 260 0
23:57:14.908 -> :wtmo
23:57:14.908 -> loop() was blocked for 15696 millis due to WiFiClientSecure::available() calls!
23:57:14.908 -> :wr 260 0
23:57:15.424 -> :wr 260 0
23:57:15.986 -> :wr 260 0
23:57:16.502 -> :wr 260 0
23:57:17.064 -> :wr 260 0
23:57:17.580 -> :wr 260 0
23:57:18.143 -> :wr 260 0
23:57:18.658 -> :wr 260 0
23:57:19.174 -> :wr 260 0
23:57:19.689 -> :wr 260 0
23:57:20.205 -> :wr 260 0
23:57:20.205 -> :wtmo
23:57:20.205 -> :wr 260 0
23:57:20.721 -> :wr 260 0
23:57:21.189 -> :wr 260 0
23:57:21.705 -> :wr 260 0
23:57:22.221 -> :wr 260 0
23:57:22.736 -> :wr 260 0
23:57:23.252 -> :wr 260 0
23:57:23.768 -> :wr 260 0
23:57:24.330 -> :wr 260 0
23:57:24.846 -> :wr 260 0
23:57:25.361 -> :wr 260 0
23:57:25.361 -> :wtmo
23:57:25.361 -> :wr 260 0
23:57:25.877 -> :wr 260 0
23:57:26.392 -> :wr 260 0
23:57:26.861 -> :wr 260 0
23:57:27.377 -> :wr 260 0
23:57:27.892 -> :wr 260 0
23:57:28.361 -> :wr 260 0
23:57:28.877 -> :wr 260 0
23:57:29.392 -> :wr 260 0
23:57:29.861 -> :wr 260 0
23:57:30.377 -> :wr 260 0
23:57:30.377 -> :wtmo
23:57:30.377 -> loop() was blocked for 15474 millis due to WiFiClientSecure::available() calls!
23:57:30.377 -> :wr 260 0
23:57:30.939 -> :wr 260 0
23:57:31.455 -> :wr 260 0
23:57:32.018 -> :wr 260 0
23:57:32.533 -> :wr 260 0
23:57:33.096 -> :wr 260 0
23:57:33.611 -> :wr 260 0
23:57:34.127 -> :wr 260 0
23:57:34.643 -> :wr 260 0
23:57:35.158 -> :wr 260 0
23:57:35.721 -> :wr 260 0
23:57:35.721 -> :wtmo
23:57:35.721 -> :wr 260 0
23:57:36.189 -> :wr 260 0
23:57:36.705 -> :wr 260 0
23:57:37.221 -> :wr 260 0
23:57:37.736 -> :wr 260 0
23:57:38.252 -> :wr 260 0
23:57:38.767 -> :wr 260 0
23:57:39.236 -> :wr 260 0
23:57:39.799 -> :wr 260 0
23:57:40.314 -> :wr 260 0
23:57:40.783 -> :wr 260 0
23:57:40.783 -> :wtmo
23:57:40.783 -> :wr 260 0
23:57:41.299 -> :wr 260 0
23:57:41.814 -> :wr 260 0
23:57:42.283 -> :wr 260 0
23:57:42.799 -> :wr 260 0
23:57:43.314 -> :wr 260 0
23:57:43.783 -> :wr 260 0
23:57:44.299 -> :wr 260 0
23:57:44.814 -> :wr 260 0
23:57:45.283 -> :wr 260 0
23:57:45.799 -> :wr 260 0
23:57:46.314 -> :wr 260 0
23:57:46.314 -> :wtmo
23:57:46.314 -> loop() was blocked for 15949 millis due to WiFiClientSecure::available() calls!
23:57:46.314 -> :wr 260 0
23:57:46.830 -> :wr 260 0
23:57:47.392 -> :wr 260 0
23:57:47.908 -> :wr 260 0
23:57:48.471 -> :wr 260 0
23:57:48.986 -> :wr 260 0
23:57:49.502 -> :wr 260 0
23:57:50.017 -> :wr 260 0
23:57:50.533 -> :wr 260 0
23:57:51.096 -> :wr 260 0
23:57:51.611 -> :wr 260 0
23:57:51.611 -> :wtmo
23:57:51.611 -> :wr 260 0
23:57:52.127 -> :wr 260 0
23:57:52.643 -> :wr 260 0
23:57:53.158 -> :wr 260 0
23:57:53.721 -> :wr 260 0
23:57:54.189 -> :wr 260 0
23:57:54.705 -> :wr 260 0
23:57:55.221 -> :wr 260 0
23:57:55.736 -> :wr 260 0
23:57:56.205 -> :wr 260 0
23:57:56.767 -> :wr 260 0
23:57:56.767 -> :wtmo
23:57:56.767 -> :wr 260 0
23:57:57.236 -> :wr 260 0
23:57:57.752 -> :wr 260 0
23:57:58.267 -> :wr 260 0
23:57:58.783 -> :wr 260 0
23:57:59.252 -> :wr 260 0
23:57:59.767 -> :wr 260 0
23:58:00.283 -> :wr 260 0
23:58:00.752 -> :wr 260 0
23:58:01.268 -> :wr 260 0
23:58:01.783 -> :wr 260 0
23:58:01.783 -> :wtmo
23:58:01.783 -> loop() was blocked for 15473 millis due to WiFiClientSecure::available() calls!
23:58:01.830 -> wificlient tx buffer bytes available: 0
23:58:01.830 -> Sending message 5
23:58:01.830 -> To be, or not to be, that is the question:
23:58:01.830 -> Whether 'tis nobler in the mind to suffer
23:58:01.830 -> The slings and arrows of outrageous fortune,
23:58:01.830 -> Or to take arms against a sea of troubles
23:58:01.830 -> And by opposing end them. To die—to sleep,
23:58:01.830 -> No more; and by a sleep to say we end
23:58:01.830 -> The heart-ache and the thousand natural shocks
23:58:01.830 -> That flesh is heir to: 'tis a consummation
23:58:01.830 -> Devoutly to be wish'd. To die, to sleep;
23:58:01.830 -> To sleep, perchance to dream—ay, there's the rub:
23:58:01.830 -> For in that sleep of death what dreams may come,
23:58:01.877 -> When we have shuffled off this mortal coil,
23:58:01.877 -> Must give us pause—there's the respect
23:58:01.877 -> That makes calamity of so long life
23:58:01.877 -> :wr 260 0
23:58:02.346 -> :wr 260 0
23:58:02.861 -> :wr 260 0
23:58:03.377 -> :wr 260 0
23:58:03.939 -> :wr 260 0
23:58:04.455 -> :wr 260 0
23:58:04.971 -> :wr 260 0
23:58:05.486 -> :wr 260 0
23:58:06.002 -> :wr 260 0
23:58:06.564 -> :wr 260 0
23:58:07.080 -> :wr 260 0
23:58:07.080 -> :wtmo
23:58:07.080 -> :wr 260 0
23:58:07.642 -> :wr 260 0
23:58:08.111 -> :wr 260 0
23:58:08.627 -> :wr 260 0
23:58:09.189 -> :wr 260 0
23:58:09.705 -> :wr 260 0
23:58:10.221 -> :wr 260 0
23:58:10.689 -> :wr 260 0
23:58:11.205 -> :wr 260 0
23:58:11.721 -> :wr 260 0
23:58:12.236 -> :wr 260 0
23:58:12.236 -> :wtmo
23:58:12.236 -> :wr 260 0
23:58:12.752 -> :wr 260 0
23:58:13.268 -> :wr 260 0
23:58:13.783 -> :wr 260 0
23:58:14.252 -> :wr 260 0
23:58:14.768 -> :wr 260 0
23:58:15.283 -> :wr 260 0
23:58:15.799 -> :wr 260 0
23:58:16.314 -> :wr 260 0
23:58:16.783 -> :wr 260 0
23:58:17.299 -> :wr 260 0
23:58:17.299 -> :wtmo
23:58:17.299 -> :wr 260 0
23:58:17.814 -> :wr 260 0
23:58:18.330 -> :wr 260 0
23:58:18.893 -> :wr 260 0
23:58:19.408 -> :wr 260 0
23:58:19.971 -> :wr 260 0
23:58:20.486 -> :wr 260 0
23:58:21.002 -> :wr 260 0
23:58:21.540 -> :wr 260 0
23:58:22.056 -> :wr 260 0
23:58:22.561 -> :wr 260 0
23:58:22.561 -> :wtmo
23:58:22.561 -> :wr 260 0
23:58:23.124 -> :wr 260 0
23:58:23.639 -> :wr 260 0
23:58:24.155 -> :wr 260 0
23:58:24.718 -> :wr 260 0
23:58:25.233 -> :wr 260 0
23:58:25.749 -> :wr 260 0
23:58:26.311 -> :wr 260 0
23:58:26.827 -> :wr 260 0
23:58:27.343 -> :wr 260 0
23:58:27.858 -> :wr 260 0
23:58:27.858 -> :wtmo
23:58:28.139 -> :wustmo
23:58:28.139 -> :close
23:58:28.139 -> Message not accepted
23:58:28.139 -> 

I have patched the code as below, which seems to work, but I don't know this code well enough to know if it will cause other problems:

int WiFiClientSecure::_run_until(unsigned target, bool blocking) {
	//DEBUG_BSSL("_run_until: begin\n");
  if (!ctx_present()) {
    DEBUG_BSSL("_run_until: Not connected\n");
    return -1;
  }
  for (int no_work = 0; blocking || no_work < 2;) {
    if (blocking) {
      // Only for blocking operations can we afford to yield()
      optimistic_yield(100);
    }

    int state;
    state = br_ssl_engine_current_state(_eng);
    if (state & BR_SSL_CLOSED) {
      return -1;
    }

    if (!(_client->state() == ESTABLISHED) && !WiFiClient::available()) {
      return (state & target) ? 0 : -1;
    }

    /*
       If there is some record data to send, do it. This takes
       precedence over everything else.
    */
    if (state & BR_SSL_SENDREC) {
      unsigned char *buf;
      size_t len;
      int wlen;
	  int afw;

      buf = br_ssl_engine_sendrec_buf(_eng, &len);

// ---FIX FOR AVAILABLE() BLOCKING ISSUE --------------------------------------------------
	  if (!blocking && WiFiClient::availableForWrite() == 0) {
		DEBUG_BSSL("_run_until: client tx buffer is full, aborting to avoid blocking\n");
		yield(); // needed to avoid soft wdt reset when available() is called in a busy loop
		return -1; // tx buffer is full, write attempt will block
	  }
 // ---END FIX-----------------------------------------------------------------------------------

      wlen = WiFiClient::write(buf, len);
      if (wlen <= 0) {
        /*
           If we received a close_notify and we
           still send something, then we have our
           own response close_notify to send, and
           the peer is allowed by RFC 5246 not to
           wait for it.
        */
        return -1;
      }
      if (wlen > 0) {
        br_ssl_engine_sendrec_ack(_eng, wlen);
      }
      no_work = 0;
      continue;
    }

    /*
       If we reached our target, then we are finished.
    */
    if (state & target) {
      return 0;
    }

    /*
       If some application data must be read, and we did not
       exit, then this means that we are trying to write data,
       and that's not possible until the application data is
       read. This may happen if using a shared in/out buffer,
       and the underlying protocol is not strictly half-duplex.
       This is unrecoverable here, so we report an error.
    */
    if (state & BR_SSL_RECVAPP) {
      DEBUG_BSSL("_run_until: Fatal protocol state\n");
      return -1;
    }

    /*
       If we reached that point, then either we are trying
       to read data and there is some, or the engine is stuck
       until a new record is obtained.
    */
    if (state & BR_SSL_RECVREC) {
      if (WiFiClient::available()) {
        unsigned char *buf;
        size_t len;
        int rlen;

        buf = br_ssl_engine_recvrec_buf(_eng, &len);
        rlen = WiFiClient::read(buf, len);
        if (rlen < 0) {
          return -1;
        }
        if (rlen > 0) {
          br_ssl_engine_recvrec_ack(_eng, rlen);
        }
        no_work = 0;
        continue;
      }
    }

    /*
       We can reach that point if the target RECVAPP, and
       the state contains SENDAPP only. This may happen with
       a shared in/out buffer. In that case, we must flush
       the buffered data to "make room" for a new incoming
       record.
    */
    br_ssl_engine_flush(_eng, 0);

    no_work++; // We didn't actually advance here
  }
  // We only get here if we ran through the loop without getting anything done
  return -1;
}
@earlephilhower earlephilhower self-assigned this Aug 22, 2019
@earlephilhower
Copy link
Collaborator

Thanks for the detailed description and logs and the proposed fix. Does it also fail with John Donne, or is this a Shakespeare-only issue? :)

Why not submit a PR and get your name in the git logs? You've got a failing test and a proposed solution ready to go, which is really all you need!

@johnm545
Copy link
Contributor Author

johnm545 commented Aug 23, 2019 via email

@johnm545
Copy link
Contributor Author

johnm545 commented Aug 26, 2019

I have tested this some more and whipped up a simpler MCVE that does not require the MQTT client, just ugly HTTP String hacking. To reproduce it, you have to disconnect your internet in the first couple of minutes, because the postman echo server will close the connection after a couple of posts. Once the WiFiClient tx buffer fills, available() calls will block for 5 seconds.

The 15 second block I was seeing in the previous example occoured because the MQTT client was calling connected() first, which calls available(), then calling available() twice. The below example is a better test, but in practice the if connected() while available() pattern is common and results in long bocks.

#include <ESP8266WiFi.h>
#include <WiFiClientSecure.h>

const char* host = "postman-echo.com";
int port = 443;  //https
WiFiClientSecure wifiClient;

unsigned long lastSend;
unsigned long sendInterval = 60000L;

void setup() {
  wifiClient.setInsecure(); // disable validation of server certificate
  Serial.begin(115200);
  delay(1000);

  WiFi.begin(/*ssid, password if needed */);     //Connect to your WiFi router
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  Serial.println("wifi connected");

  if (!wifiClient.connect(host, port)) {
    Serial.println("initial connection failed, restart to try again");
    while (1) delay(1);
  }
  Serial.println("Secure socket connected to server. Disconnect internet without disconnecting wifi to demonstrate bad behaviour");
  lastSend = millis();
}

void loop() {

  if (millis() - lastSend > sendInterval) {
    lastSend = millis();

    String msg = "POST /post HTTP/1.1\r\n";
    msg += "Host: " + String(host) + "\r\n";
    msg += "Content-Type: application/x-www-form-urlencoded\r\n";
    msg += "Content-Length: 121\r\n\r\n";
    msg += "thequickbrownfoxjumpsoverthelazydogfjaoranckaeioavrjkanvjknvaeruivnaivnrakjnvkanvkanuarwefjrieavnkjanrlsnbiakjncoanranven\r\n";
    msg += "Connection: keep-alive\r\n\r\n";

    int afw = wifiClient.availableForWrite();

    Serial.print("Sending a message, unencrypted message length: "); Serial.println(msg.length());
    Serial.print("Bytes available for write in WiFiClient tx buffer: "); Serial.println(afw);
    Serial.println(msg);

    int msgLength = msg.length();
    int written = wifiClient.print(msg);

    Serial.print("request sent, bytes accepted by WiFiClientSecure: "); Serial.println(written);
    Serial.println("\n\n");
  }

  unsigned long beforeAvailableCall = millis();
  int avail = wifiClient.available();
  unsigned long elapsed = millis() - beforeAvailableCall;

  if (elapsed > 10) {
    Serial.print("available() blocked for "); Serial.print(elapsed); Serial.println(" millis!");
  }
  for (int i = 0; i < avail; i++) {
    byte b = wifiClient.read();
    Serial.print( (char)b );
  }
  if (avail) Serial.println();  

delay(1000);
}

Getting my head around how the _write and available methods work took quite a bit of head scratching, so I have brain-dumped my understanding below before I forgot.

The WiFiClientSecure::_write loop first does a _run_until(BR_SSL_SENDAPP). Usually this will return success instantly. But if there is unsent record (encrypted) data hanging around from last time (state & BR_SSL_SENDREC == true), WiFiClient::write is called. If this data can not fit in the WiFiClient tx buffer, then it will block until space is available or timeout occours (default 5sec). If some write activity occurs and there is still space in the bearssl sendapp buffer (state & BR_SSL_SENDAPP == true) then _run_until returns success; if WiFiClient::write returns 0 _run_unit returns -1 and the _write loop is broken, signaling a write failure.

WiFiClientSecure::_write then feeds the user's data into the bearssl sendapp buffer (as much as will fit), then calls br_ssl_engine_flush to force the engine to generate an encrypted record. Now state & BR_SSL_SENDREC == true but the data has not yet been written to the WiFiClient. So _write then calls flush(), which calls WiFiClientSecure::flush(0), which does a _run_until(BR_SSL_SENDAPP), causing the record data to be written to WiFiClient. WiFiClientSecure::flush then flushes the WiFiClient with the WiFiClient default timeout (300ms)

WiFiClientSecure::_write does not really care what happens in the flush() method; it returns the number of bytes transfered to bearssl sendapp buffer. It only returns a fail (i.e. 0 or less than it was asked to send) when the leftover data from the previous write attempt or loop iteration can not be sent to the WiFiClient. This is fine but means that writes on dropped connections will not fail until two _write attempts are made on the full-up WiFiClient, or the sendapp buff fills, whichever occurs first. It might be more elegant to avoid the blocking-but-apparently-successful write, but that's not simple.

WiFiClientSecure::available() calls _run_until(BR_SSL_RECVAPP, false), and this also triggers a blocking write attempt if there is still record data hanging around from previous write attempts (i.e. state & BR_SSL_SENDREC == true). The fix is to avoid writing to the WiFiClient more data than will fit in its tx buffer when the blocking flag == false; I will submit a PR with this change.

*The 15sec block occours because WiFiClientSecure::_write has a call to connected() at the top, which in turn calls available().

@devyte
Copy link
Collaborator

devyte commented Sep 17, 2019

Closing via #6449.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants