RFC: Info messages to stdout from training utils #3998

Open
@amitdo

https://groups.google.com/g/tesseract-dev/c/OoBUOPZtkrQ

@zdenop, I think your message will be more visible here as an RFC.

Zdenko Podobny
Jan 20

I realized that several users did not recognize errors during the training process.
IMO part of the problem is that all messages from the training tools (errors and ordinary output alike) are written to stderr because they use tprintf.

While this makes sense in the tesseract executable (the OCR output is sent to stdout, all other messages to stderr), in training we should use a different approach: only errors (e.g. those that should stop further processing) should go to stderr, and all other information should go to stdout.

A good example is unicharset_extractor:

      tprintf("Extracting unicharset from box file %s\n", argv[arg]);
      bool res = ReadMemBoxes(-1, /*skip_blanks*/ true, &file_data[0],
                              /*continue_on_failure*/ false, /*boxes*/ nullptr, &texts,
                              /*box_texts*/ nullptr, /*pages*/ nullptr);
      if (!res) {
        tprintf("Can not read box data from '%s'\n", argv[arg]);
        return EXIT_FAILURE;
      }
    } else {
      tprintf("Extracting unicharset from plain text file %s\n", argv[arg]);

Are you OK with this proposal? It would mean that tprintf is used only for errors, and std::cout or fprintf(stdout, ...) for everything else.

Zdenko
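
To make the proposal concrete, here is a minimal sketch of what the split could look like. The names Severity, StreamFor, LogMessage and ReportBoxFile are hypothetical and exist only for this illustration; they are not part of Tesseract, and the read_ok parameter merely stands in for the result of ReadMemBoxes.

```cpp
#include <cstdarg>
#include <cstdio>
#include <cstdlib>

// Hypothetical helpers for illustration only; not part of Tesseract.
enum class Severity { kInfo, kError };

// Informational messages go to stdout, errors go to stderr.
inline FILE *StreamFor(Severity severity) {
  return severity == Severity::kError ? stderr : stdout;
}

// printf-style logging that picks the stream from the severity.
inline void LogMessage(Severity severity, const char *format, ...) {
  va_list args;
  va_start(args, format);
  std::vfprintf(StreamFor(severity), format, args);
  va_end(args);
}

// Mirrors the control flow of the unicharset_extractor snippet: progress is
// reported on stdout, and only a failure that stops the run reaches stderr.
// `read_ok` is an assumed stand-in for the result of ReadMemBoxes.
inline int ReportBoxFile(const char *filename, bool read_ok) {
  LogMessage(Severity::kInfo, "Extracting unicharset from box file %s\n",
             filename);
  if (!read_ok) {
    LogMessage(Severity::kError, "Can not read box data from '%s'\n",
               filename);
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}
```

With a split like this, redirecting stderr of a training tool to a file would capture only the failures, which should make them harder to miss among the progress messages.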
