Skip to content

src, doc: improve documentation and error message for ICU data fallback #49666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 26 additions & 6 deletions doc/api/intl.md
Original file line number Diff line number Diff line change
Expand Up @@ -113,27 +113,47 @@ This mode provides a balance between features and binary size.

If the `small-icu` option is used, one can still provide additional locale data
at runtime so that the JS methods would work for all ICU locales. Assuming the
data file is stored at `/some/directory`, it can be made available to ICU
through either:
data file is stored at `/runtime/directory/with/dat/file`, it can be made
available to ICU through either:

* The `--with-icu-default-data-dir` configure option:

```bash
./configure --with-icu-default-data-dir=/runtime/directory/with/dat/file --with-intl=small-icu
```

This only embeds the default data directory path into the binary.
The actual data file is going to be loaded at runtime from this directory
path.

* The [`NODE_ICU_DATA`][] environment variable:

```bash
env NODE_ICU_DATA=/some/directory node
env NODE_ICU_DATA=/runtime/directory/with/dat/file node
```

* The [`--icu-data-dir`][] CLI parameter:

```bash
node --icu-data-dir=/some/directory
node --icu-data-dir=/runtime/directory/with/dat/file
```

(If both are specified, the `--icu-data-dir` CLI parameter takes precedence.)
When more than one of them is specified, the `--icu-data-dir` CLI parameter has
the highest precedence, then the `NODE_ICU_DATA` environment variable, then
the `--with-icu-default-data-dir` configure option.

ICU is able to automatically find and load a variety of data formats, but the
data must be appropriate for the ICU version, and the file correctly named.
The most common name for the data file is `icudt6X[bl].dat`, where `6X` denotes
The most common name for the data file is `icudtX[bl].dat`, where `X` denotes
the intended ICU version, and `b` or `l` indicates the system's endianness.
Node.js would fail to load if the expected data file cannot be read from the
specified directory. The name of the data file corresponding to the current
Node.js version can be computed with:

```js
`icudt${process.versions.icu.split('.')[0]}${os.endianness()[0].toLowerCase()}.dat`;
```

Check ["ICU Data"][] article in the ICU User Guide for other supported formats
and more details on ICU data in general.

Expand Down
11 changes: 8 additions & 3 deletions src/node.cc
Original file line number Diff line number Diff line change
Expand Up @@ -916,9 +916,14 @@ static ExitCode InitializeNodeWithArgsInternal(

// Initialize ICU.
// If icu_data_dir is empty here, it will load the 'minimal' data.
if (!i18n::InitializeICUDirectory(per_process::cli_options->icu_data_dir)) {
errors->push_back("could not initialize ICU "
"(check NODE_ICU_DATA or --icu-data-dir parameters)\n");
std::string icu_error;
if (!i18n::InitializeICUDirectory(per_process::cli_options->icu_data_dir,
&icu_error)) {
errors->push_back(icu_error +
": Could not initialize ICU. "
"Check the directory specified by NODE_ICU_DATA or "
"--icu-data-dir contains " U_ICUDATA_NAME ".dat and "
"it's readable\n");
return ExitCode::kInvalidCommandLineArgument;
}
per_process::metadata.versions.InitializeIntlVersions();
Expand Down
23 changes: 14 additions & 9 deletions src/node_i18n.cc
Original file line number Diff line number Diff line change
Expand Up @@ -54,20 +54,21 @@
#include "util-inl.h"
#include "v8.h"

#include <unicode/utypes.h>
#include <unicode/putil.h>
#include <unicode/timezone.h>
#include <unicode/uchar.h>
#include <unicode/uclean.h>
#include <unicode/ucnv.h>
#include <unicode/udata.h>
#include <unicode/uidna.h>
#include <unicode/ucnv.h>
#include <unicode/utf8.h>
#include <unicode/utf16.h>
#include <unicode/timezone.h>
#include <unicode/ulocdata.h>
#include <unicode/urename.h>
#include <unicode/ustring.h>
#include <unicode/utf16.h>
#include <unicode/utf8.h>
#include <unicode/utypes.h>
#include <unicode/uvernum.h>
#include <unicode/uversion.h>
#include <unicode/ustring.h>

#ifdef NODE_HAVE_SMALL_ICU
/* if this is defined, we have a 'secondary' entry point.
Expand Down Expand Up @@ -569,8 +570,7 @@ ConverterObject::ConverterObject(
}
}


bool InitializeICUDirectory(const std::string& path) {
bool InitializeICUDirectory(const std::string& path, std::string* error) {
UErrorCode status = U_ZERO_ERROR;
if (path.empty()) {
#ifdef NODE_HAVE_SMALL_ICU
Expand All @@ -583,7 +583,12 @@ bool InitializeICUDirectory(const std::string& path) {
u_setDataDirectory(path.c_str());
u_init(&status);
}
return status == U_ZERO_ERROR;
if (status == U_ZERO_ERROR) {
return true;
}

*error = u_errorName(status);
return false;
}

void SetDefaultTimeZone(const char* tzid) {
Expand Down
2 changes: 1 addition & 1 deletion src/node_i18n.h
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@
namespace node {
namespace i18n {

bool InitializeICUDirectory(const std::string& path);
bool InitializeICUDirectory(const std::string& path, std::string* error);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why std::string* and not const std::string&?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an out parameter, the caller gets the error string if it fails


void SetDefaultTimeZone(const char* tzid);

Expand Down
30 changes: 18 additions & 12 deletions test/parallel/test-icu-data-dir.js
Original file line number Diff line number Diff line change
@@ -1,27 +1,33 @@
// Flags: --expose-internals
'use strict';
const common = require('../common');
const { spawnSyncAndExit } = require('../common/child_process');
const { internalBinding } = require('internal/test/binding');
const os = require('os');

const { hasSmallICU } = internalBinding('config');
if (!(common.hasIntl && hasSmallICU))
common.skip('missing Intl');

const assert = require('assert');
const { spawnSync } = require('child_process');

const expected =
'could not initialize ICU (check NODE_ICU_DATA or ' +
`--icu-data-dir parameters)${os.EOL}`;

{
const child = spawnSync(process.execPath, ['--icu-data-dir=/', '-e', '0']);
assert(child.stderr.toString().includes(expected));
spawnSyncAndExit(
process.execPath,
['--icu-data-dir=/', '-e', '0'],
{
status: 9,
signal: null,
stderr: /Could not initialize ICU/
});
}

{
const env = { ...process.env, NODE_ICU_DATA: '/' };
const child = spawnSync(process.execPath, ['-e', '0'], { env });
assert(child.stderr.toString().includes(expected));
spawnSyncAndExit(
process.execPath,
['-e', '0'],
{ env },
{
status: 9,
signal: null,
stderr: /Could not initialize ICU/
});
}