Progmem F() Strings Manager #4270
Comments
I don't understand why you wouldn't do the top part, then just use those directly. It seems more readable than F_(52). |
@Makuna But comes the question of how much code space is used for joining strings.
|
Actually what would be good is a resource file area in an unused section of flash. Then all the strings could live in upper flash just below the FS and not take up any of the limited Sketch space. If what I hear about the Maximum Sketch size being 1MB then to off-load static resources from Plus, updates would be faster if no resources had changed. (Extra bonus) After having thought for a while, that's it, the coffee ran out, the effects of the coffee expired, I'm done for today. Hv. |
speaking of “resources”, you shouldn’t be using the “String” class. Character arrays are always the best. |
@tonton81 If I ever do a production thing I would take more care.! I played around with the EEPROM libs, Hv. |
I beg to disagree on that point. |
@vdeconinck As far as maintenance goes, having all the resources in the one place seems like an easier approach. But hey that's just my lazy mind stretching its legs, |
@Hiddenvision many experienced programmers agree that you should aim for 1. correctness, 2. readability, 3. performance, in that order, and 3. only if necessary ("premature optimization is the root of all evil"). |
I'd be switching 2 & 3 around myself. My background comes more from hacking and reverse engineering, so I prefer assembler if honest, but C and such things do make portability easier. and being able to edit all the "Strings" in one single file, it worked for me. But let's be real, bugs are our own creation mostly. At the end of the day the compiler does not care about what stuff looks like as long as it fits thru the sieve. But don't think I do not understand the need for well structured stuff in its place. However, I would be curious to know if doing such a thing would result in a "space" saving. |
@Hiddenvision you are correct: when strings are duplicated under normal use, the build process (supposedly) unifies duplicates. However, when those strings are moved to flash, that optimization no longer works. Therefore, if there are duplicate strings (and we do have a few of those), they are wastefully increasing the bin size. Removing duplicates therefore saves bin space.
As commented above, your original code structure has one Bad Idea: the switch statement with a numerical index. This makes usage like F_(55) very unreadable for posterity. True, code is to be executed, but after it executes the first time, that code has to be maintained. Bugs can appear, restructuring can be needed, rewrites, or even just a look to check on some behavior detail. It is for these reasons that readability is important: code lives on after it has been written for the first time.
If strings are going to be moved to global space, one option is to have each string in its own variable, like you have above, and access them directly via that variable. The other option is to have all strings in a pointer array and access them via an index, in which case an enum with meaningful names is better than a numerical index that says nothing about the string. Here, you don't even need your function with the switch.
About your idea of a resource manager: I am currently doing a minor overhaul of our esp8266webserver, moving several strings to flash, others to global space, removing duplicates, and so on, and I have to say I find the process a bit painful, so I was thinking along similar lines. However, having a tool restructure your code for you makes me cringe a bit, especially when thinking about maintaining that tool cross-platform. Consider a subtle bug in that tool, which ends up messing up some user's code... The problem for me is: there is no way I could justify spending the time to develop something like that on my end. Are you willing to tackle this yourself? |
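As a rough illustration of the second option mentioned in the comment above (a pointer array indexed by an enum with meaningful names), here is a minimal sketch. The enum names, the string texts and the `getString` helper are all made up for this example and do not come from the thread or from the ESP8266 core. `PROGMEM` is defined away when it is missing, so the sketch also compiles as ordinary desktop C++; that fallback is an assumption for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// PROGMEM normally comes from the Arduino core headers. Defining it away here
// is an assumption so the sketch also builds off-target.
#ifndef PROGMEM
#define PROGMEM
#endif

// Hypothetical string IDs; names and texts are invented for this example.
enum StringId : std::size_t {
    STR_CONTENT_TYPE,
    STR_TEXT_PLAIN,
    STR_NOT_FOUND,
    STR_COUNT  // number of entries, not a string itself
};

// Each text is stored exactly once, in its own named variable.
static const char kContentType[] PROGMEM = "Content-Type";
static const char kTextPlain[]   PROGMEM = "text/plain";
static const char kNotFound[]    PROGMEM = "Not found";

// Pointer table indexed by StringId; the order must match the enum.
static const char* const kStrings[STR_COUNT] PROGMEM = {
    kContentType, kTextPlain, kNotFound
};

// Plain array lookup, no switch statement needed. On a real flash-resident
// build the table and the characters would have to be read through the
// core's flash accessors rather than dereferenced directly.
inline const char* getString(StringId id) {
    assert(id < STR_COUNT);
    return kStrings[id];
}
```

A call site then reads `getString(STR_NOT_FOUND)` instead of a bare number, which keeps the meaning visible where the string is used.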
IMHO and experience: you run out of RAM long before you run out of flash. at least in general. there might be applications where this could be different, but then you can think about special deduplication strategies that are tailored to that use case. |
@devyte Wise words indeed. Just to be clear I was not thinking of anything that made changes to anyone's core code. My thought, The indexes could be given "names" but I fear that some of that may creep its way into the binary depending on debug level, as I am sure I see raw function names in the bin. I would be happy to have a look at what is required to do this on a multi-platform level, but I would need to finish my own tool so it is publicly safe before I go creating larger monsters. As mentioned my background is more taking stuff apart, understanding and improving. I do "currently" have a HUGE amount of free time but not as much of the training and code discipline that may be required (I score 0% in both!). @everslick Most of my initial coding was done in .5k or 1k of PIC space. Hv. |
@Hiddenvision please contact me via gitter, there's something I'd like to discuss in addition to this :) |
fire away. hv. |
Oh the side note. Example: Someone mentioned trying debug NoAssert-NDEBUG. If this sort of info being in the final binary is unavoidable, then it should be considered to move things to a lower level folder to avoid "LONG" paths being included. Does anyone else experience this.? Hv. |
The long file names are gcc's internal macro |
Hi David, Re the filenames, Still don't understand why they have to end up in the final bin.
I shall turn off my use of the Hv. |
That's fine when DEBUG is defined. We need them. edit if this DEBUG macro is yours, do you not define it before checking if the strings are present ? edit2 on unix, |
Yep, NoAssert ticked in menus. Should I extract all the Strings and share.? |
Yes please, filenames should be there no more with the option |
There may be strings, but if they're not in .RODATA then they do not affect your free memory, only the code size. So please double-check these are not just in .TEXT and not .RODATA. "strings file.elf" is not a reliable indicator of where the string is located, you really do need to use objdump to check it. I still see only 2 full paths in RODATA, abi.cpp and core_esp8266_main.cpp with -NDEBUG enabled in a random sample. #4187 should remove a few add'l core strings, if it merges. |
@Hiddenvision btw, I can't find you on gitter. I think you have to log in there at least once first. |
@devyte , Sorry dude, What's Gitter.? This was a quick strip of the Text found. As you can C, I am the worst offender with duplicates, and wasted space all over the place. Ooo, actually the split is around line 770. |
@earlephilhower . Re #4187. Oh yer, I never gave that any thought. Der. Having now created that text file, I must now tidy up my own work. I think it was only the flash allocation and the failing http.updates with 2.4 that got me thinking this deep. |
@Hiddenvision you can run "objdump -s -j .rodata file.elf" (you may need to use xtensa-lx106-elf-objdump.exe if you're under Windows; it's located in the tools/xtensa-*/bin directory). It's not critical if you only care about total upload size (i.e. trying to fit an OTA upgrade in <500KB), but the rodata stuff is very important if you're pressed for RAM and getting OOM exceptions. |
@earlephilhower if I need to dig deep I normally drop the elf into IDA. |
Closing as this has gone stale. If you've got a PR, please do feel free to submit it for review. Thanks. |
To answer @Hiddenvision's initial questions I would like to add the following two points: 1.) Optimizing space usage by using some lookup table and trading it for some processing time is the foremost application of the LZV algorithm. So if you are really short on PROGMEM you may consider implementing LZV for ESP8266. 2.) FPSTR() will acquire PROGMEM in 4-byte chunks for 32-bit access. Hence the letter count you were asking for is in chunks of 4 bytes and the remaining bytes are wasted. Maybe this should be considered in your algorithm by working on 4-byte chunks. Counting and sorting the 4-byte patterns according to the frequency of their occurrence would be along the lines of the general LZV problem and solution. For starters I would already be happy to have the warning messages @devyte mentioned in his String checker suggestion. |
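To make the 4-byte chunk point above concrete, here is a small sketch of the arithmetic. It takes the 4-byte granularity from the comment as given (it is not measured here), and the helper names `paddedSize`, `wastedBytes` and `totalFlashBytes` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Bytes occupied by a C string of `length` characters when every entry is
// padded up to a multiple of `align` bytes (4 per the comment; assumed).
constexpr std::size_t paddedSize(std::size_t length, std::size_t align = 4) {
    // +1 for the terminating NUL, then round up to the next multiple of align.
    return (length + 1 + align - 1) / align * align;
}

// Padding bytes lost for one string under the same assumption.
constexpr std::size_t wastedBytes(std::size_t length, std::size_t align = 4) {
    return paddedSize(length, align) - (length + 1);
}

// Total padded footprint of a set of strings.
inline std::size_t totalFlashBytes(const char* const* strings, std::size_t count) {
    assert(strings != nullptr || count == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += paddedSize(std::strlen(strings[i]));
    }
    return total;
}

// "abc" plus NUL fills one chunk exactly; "abcd" plus NUL spills into a second.
static_assert(paddedSize(3) == 4, "3 chars + NUL = one 4-byte chunk");
static_assert(paddedSize(4) == 8, "4 chars + NUL = two 4-byte chunks");
static_assert(wastedBytes(4) == 3, "three padding bytes after \"abcd\\0\"");
```

Under this model a 4-character string costs as much as a 7-character one, so any estimate of savings from deduplication would count padded sizes rather than raw letter counts.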
Unfortunately, the code checker idea hasn't been picked up. |
Again, Not an issue but more of an Improvement thought.
I read that using Progmem for read only strings is all cool and dandy
but the compiling/linking process does not optimise repeated text used.
I was toying with a Strings manager concept that would handle all the static strings.
So you could perhaps litter your code with
F_(55)
where strings are required. Save all the indexed strings in a txt file, with a little tool to add/search/replace/delete entries.
F_(int i); being a simple function to return a constructed string.
An example optimizedStrings.h file
As you can see from this random example, trying to create and maintain this sort of thing by hand would be a nightmare. But if this could technically save some space, then automating the creation of the builder function and the const char PROGMEM bits is relatively easy to do from my side.
As a precompile process all the user strings can be studied and duplicates optimised.
Simple little exe to buz thru an indexed txt file doing the pre optimising stuff and outputting a fresh include file ready for compile.
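A minimal sketch of what such a precompile step could look like, under several assumptions: the indexed strings have already been read from the txt file into a vector, only exact duplicates are merged (not the substring joining a "constructed string" would imply), and the generated names `s0`, `s1`, ... and `F_TABLE` are hypothetical. The sketch also does not escape quotes or backslashes in the emitted literals.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct DedupResult {
    std::vector<std::string> unique;  // each distinct text, in first-seen order
    std::vector<std::size_t> slotOf;  // slotOf[i] = position in `unique` for index i
};

// Collapse exact duplicates while remembering where each original index went.
DedupResult deduplicate(const std::vector<std::string>& indexed) {
    DedupResult r;
    std::map<std::string, std::size_t> seen;
    for (const std::string& s : indexed) {
        auto it = seen.find(s);
        if (it == seen.end()) {
            it = seen.emplace(s, r.unique.size()).first;
            r.unique.push_back(s);
        }
        r.slotOf.push_back(it->second);
    }
    return r;
}

// Emit the text of an include file: one PROGMEM array per distinct string,
// plus a pointer table that keeps the original indices valid.
// Note: string contents are written as-is, with no escaping (sketch only).
std::string emitHeader(const DedupResult& r) {
    std::ostringstream out;
    out << "#pragma once\n";
    for (std::size_t i = 0; i < r.unique.size(); ++i) {
        out << "static const char s" << i << "[] PROGMEM = \""
            << r.unique[i] << "\";\n";
    }
    out << "static const char* const F_TABLE[] PROGMEM = {";
    for (std::size_t i = 0; i < r.slotOf.size(); ++i) {
        out << (i ? ", " : " ") << "s" << r.slotOf[i];
    }
    out << " };\n";
    return out.str();
}
```

With the input `{"OK", "Error", "OK"}`, the generated table has three entries, but the first and third point at the same stored text, so index 2 keeps working without a second copy of "OK" in flash.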
Just wondering if this would be worth looking at or would it be a wasted folly.
I am no pro coder, I'll slap on more code than sunblock, until the resources run low,
and I'll be the first to say I can over complicate a problem that does not exist just for the fun of it.
It may even be the worst thing to even consider and produce horrendous output,
happy to take any comment.!
Now that I can do OTA > 500k, perhaps I won't even think of it again.
But wasted space is ditto.
I did this on the little Atmel devices when I was running low on space, 32k was easy to fill.!
That was not automated, but like I said, 32k is easy to fill, so I had limited text to manage.
It could be calculated that saving space takes time, so there is always a trade-off.
Hv