UTF-8 strings are broken, when not save sketch before compiling #4231

Closed
BlackBrix opened this issue Nov 30, 2015 · 14 comments
Labels: Component: Compilation (related to compilation of Arduino sketches), Component: IDE (the Arduino IDE), Type: Bug

Comments

@BlackBrix

See this thread in the forum:
--> http://forum.arduino.cc/index.php?topic=360932.msg2498388#msg2498388


I use 1.6.6 as a portable installation on a USB stick (Win 7 / Win 10).

I disabled the option "save when verifying or uploading".

If you are working on a file (ino, h, cpp) that contains UTF-8 characters in string declarations
(see the "extended example" here
--> http://forum.arduino.cc/index.php?topic=360932.msg2488750#msg2488750 ),
and you change something (adding a space character to a string is enough),
and you DO NOT save the file before compiling and uploading,
the serial output of these UTF-8 characters (viewed in a UTF-8 capable terminal) will contain errors.

If you save the file before compiling and uploading
(either manually with Ctrl+S or by enabling the option "save when verifying or uploading"),
the serial output of these UTF-8 characters (viewed in a UTF-8 capable terminal) is correct.

...

So it seems that when the IDE compiles "from cache" (or from the temporary build folder),
UTF-8 characters are not converted or handled properly on Windows.
Saving the sketch first (to a file on HDD / USB stick) seems to lead to proper conversion or handling of the UTF-8 characters.
....

Other users can reproduce this error as well:
--> http://forum.arduino.cc/index.php?topic=360932.msg2499702#msg2499702

@Koepel

Koepel commented Nov 30, 2015

It seems to happen only on Windows (so far I have not been able to reproduce it on Linux), only with UTF-8 characters, and only when a file is not saved. The temporary file in the build.... folder is not UTF-8.
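A minimal Java sketch of what "not UTF-8" means at the byte level. It is an illustration only, not code from the IDE: the class name `EncodingMismatch` and the sample character are made up, and windows-1252 is assumed to stand in for a typical Windows default charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Render bytes as space-separated hex so the two encodings can be compared by eye.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Sample non-ASCII character (a with diaeresis), written as an escape
        // so the result does not depend on how this source file is saved.
        String literal = "\u00E4";
        // Assumption: windows-1252 stands in for the Windows platform default.
        Charset assumedDefault = Charset.forName("windows-1252");

        System.out.println("UTF-8:        " + hex(literal.getBytes(StandardCharsets.UTF_8)));
        System.out.println("windows-1252: " + hex(literal.getBytes(assumedDefault)));
    }
}
```

The same character becomes two bytes in UTF-8 and a single byte in windows-1252, so a string literal compiled from a non-UTF-8 temporary file carries different bytes than one compiled from a UTF-8 saved file.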

ffissore self-assigned this Nov 30, 2015
ffissore added the "Component: Compilation" label Nov 30, 2015
@lmihalkovic

see #4259 for more encoding issues

@BlackBrix
Author

"Federico Fissore" has left Arduino, and now all "his" issues here are in the state "no one assigned"?

@lmihalkovic

Yes, it seems that's how GitHub works. I'm sure the company is reorganizing itself for what comes after. Arduino Create also seems to be nearing general availability.

@BlackBrix
Author

Is anyone working on this issue?

@lmihalkovic

I think the vast majority of entries are not looked into by anyone.

@BlackBrix
Author

Any news on this?

@per1234
Collaborator

per1234 commented Jul 4, 2017

Steps to reproduce:

  • File > New
  • Paste the following code into the new sketch (leaving it in an unsaved state):
void setup() {
  Serial.begin(9600);
  while (!Serial) {}
  Serial.println( "¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿");
}

void loop() {}
  • Sketch > Upload

  • Tools > Serial Monitor

  • Serial Monitor displays (incorrectly):
    [screenshot: clipboard01]

  • File > Save (now the sketch is in the saved state)

  • Sketch > Upload

  • Tools > Serial Monitor

  • Serial Monitor displays (correctly):
    [screenshot: clipboard02]

The issue still occurs with Arduino IDE 1.8.3 on Windows 7 64-bit.
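The screenshots are not reproduced here, so the following is only a sketch of one plausible mechanism, not a reconstruction of the actual Serial Monitor output. It assumes the unsaved sketch reaches the compiler in a single-byte Windows encoding (windows-1252 is used as a stand-in) and that the terminal decodes incoming bytes as UTF-8. The class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TerminalView {
    // Encode the text with the given charset, then decode the same bytes as UTF-8,
    // roughly what a UTF-8 terminal does with whatever bytes the board sends.
    public static String seenByUtf8Terminal(String text, Charset fileEncoding) {
        byte[] sent = text.getBytes(fileEncoding);
        return new String(sent, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First three symbols of the test string, written as escapes.
        String sample = "\u00A1 \u00A2 \u00A3";

        String saved = seenByUtf8Terminal(sample, StandardCharsets.UTF_8);
        // Assumption: windows-1252 stands in for the encoding of the unsaved build copy.
        String unsaved = seenByUtf8Terminal(sample, Charset.forName("windows-1252"));

        System.out.println("saved copy intact:    " + saved.equals(sample));
        // Each lone high byte is invalid UTF-8 and decodes to U+FFFD (replacement character).
        System.out.println("unsaved copy garbled: " + unsaved.equals("\uFFFD \uFFFD \uFFFD"));
    }
}
```

Under these assumptions the spaces survive and every symbol turns into a replacement character, while the UTF-8 copy round-trips unchanged.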

@PaulStoffregen
Contributor

PaulStoffregen commented Jul 4, 2017

I tested just now. I was not able to reproduce this problem on Macintosh or Linux 64 bit.

It does happen with Windows 10.

[screenshot: capture]

@PaulStoffregen
Contributor

PaulStoffregen commented Jul 4, 2017

This issue might be related to line 584 in arduino-core/src/processing/app/legacy/PApplet.java

      OutputStreamWriter osw = new OutputStreamWriter(output, "UTF-8");

Apparently OutputStreamWriter expects "UTF8", not "UTF-8" with the extra minus sign.

https://docs.oracle.com/javase/tutorial/i18n/text/stream.html

There's also another "UTF-8" usage on line 265 which might be worth fixing.

@PaulStoffregen
Contributor

Nope, just tried editing line 584, but it doesn't fix this issue. :(
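That result is what one would expect: in Java, "UTF8" is a historical alias for the charset whose canonical name is "UTF-8", so both spellings select the same encoder. A small check, independent of the IDE code (the class name `CharsetAliasCheck` is made up for the example):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Arrays;

public class CharsetAliasCheck {
    // Encode text through an OutputStreamWriter created with the given charset name.
    public static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (OutputStreamWriter writer = new OutputStreamWriter(out, charsetName)) {
            writer.write(text);
        } catch (IOException e) {
            // Covers UnsupportedEncodingException for unknown charset names.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Both names resolve to the same Charset object.
        System.out.println(Charset.forName("UTF8").equals(Charset.forName("UTF-8")));
        // Both writers produce identical bytes for a non-ASCII character.
        System.out.println(Arrays.equals(encode("\u00A1", "UTF8"), encode("\u00A1", "UTF-8")));
    }
}
```

Both lines print `true` on a standard JDK, which fits the observation that changing the name on line 584 has no effect.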

@facchinm
Member

facchinm commented Jul 5, 2017

The right place to look seems to be https://github.com/arduino/Arduino/blob/master/app/src/processing/app/SketchController.java#L660 (mainly the getBytes() call). The intermediate file contains broken UTF-8 sequences, so the resulting binary is broken too. Maybe using the same routine we use to actually save the file could solve the issue?
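For context, `String.getBytes()` without an argument encodes with the JVM's default charset, which on Windows has traditionally been a legacy code page rather than UTF-8 (the default only became UTF-8 in JDK 18). The sketch below shows the general idea of passing the charset explicitly when writing the intermediate file. It is not the code in SketchController.java: `writeUtf8` and `writeAndReadBack` are hypothetical helpers made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitCharsetWrite {
    // Hypothetical helper, not the IDE's actual code: write editor text to a
    // build file with an explicit encoding instead of the platform default.
    public static void writeUtf8(Path target, String text) {
        try {
            Files.write(target, text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write text to a temporary file and return the raw bytes that ended up on disk.
    public static byte[] writeAndReadBack(String text) {
        try {
            Path tmp = Files.createTempFile("sketch", ".cpp");
            try {
                writeUtf8(tmp, text);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The inverted exclamation mark should land on disk as the UTF-8 pair C2 A1.
        for (byte b : writeAndReadBack("\u00A1")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

With the charset fixed at the call site, the bytes on disk no longer depend on the operating system's default encoding.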

@ZinggJM

ZinggJM commented Oct 2, 2018

I have this issue with the Arduino IDE (1.8.5 and 1.8.7) on Windows 10.
I have created a test example (with a readme.h):
https://github.com/ZinggJM/leftovers/tree/master/examples/Umlauts
The issue shows up with U8g2_for_Adafruit_GFX in my GxEPD libraries for e-paper displays.

@BlackBrix
Author

BlackBrix commented Oct 7, 2018

I can confirm that this issue still occurs with 1.8.7 on Windows 10.

facchinm added this to the Release 1.8.8 milestone Oct 9, 2018

9 participants