This repository was archived by the owner on Jan 17, 2024. It is now read-only.

Helpers for null-terminated Utf8 #3

Merged
merged 9 commits into from
Sep 9, 2019
Changes from 5 commits
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
.packages
pubspec.lock
.dart_tool
7 changes: 7 additions & 0 deletions lib/ffi.dart
@@ -0,0 +1,7 @@
// Copyright (c) 2019, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

library ffi;

export 'src/utf8.dart';
40 changes: 40 additions & 0 deletions lib/src/utf8.dart
@@ -0,0 +1,40 @@
// Copyright (c) 2019, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

library utf8;

import 'dart:convert';
import 'dart:ffi';
import 'dart:typed_data';

/// [Utf8] implements conversion between Dart strings and null-terminated
/// Utf8-encoded "char*" strings in C.
class Utf8 extends Struct<Utf8> {
static String fromUtf8(Pointer<Utf8> str) {
final Pointer<Uint8> array = str.cast();
int count = 0x1000;
Uint8List string = array.asExternalTypedData(count: count);
int i = 0;
for (; string[i] != 0; ++i) {
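// Grow the view one step early: the loop condition reads string[i] before
// this check runs, so i must never reach the end of the current view.
if (i == count - 1) {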
count *= 2;
Contributor:
Could we use a precise bound here?

Contributor Author:
No, because the length of the string can only be detected by scanning until the first null byte.

Contributor:
But why multiply by 2 every time? Why not scan until the null byte and use that position as count?

Contributor Author:
This is scanning until null.

Contributor:
Ah, I was confused by array.asExternalTypedData(count: count) where that count is potentially 2x the string length. We need an oversized TypedData wrapper in order to not create a new wrapper every character for scanning in the first place. (I hope we can switch to the normal Pointer<Int8>.load() soon, that would make it easier to read this code.)
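
The doubling scan discussed in this thread can be sketched without `dart:ffi`. This is only an illustration under stated assumptions: `decodeNullTerminated` and `initialWindow` are hypothetical names, and an ordinary `Uint8List` stands in for native memory, so the window is clamped to the buffer length, which real native memory would not allow.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical helper, not part of this PR: decodes the bytes of [memory]
/// up to (not including) the first zero byte. [memory] stands in for native
/// memory; the scan looks at it through a view that doubles when exhausted.
String decodeNullTerminated(Uint8List memory, {int initialWindow = 4}) {
  if (initialWindow <= 0) {
    throw ArgumentError.value(initialWindow, 'initialWindow', 'must be > 0');
  }
  int clamp(int n) => n < memory.length ? n : memory.length;

  var window = Uint8List.sublistView(memory, 0, clamp(initialWindow));
  var i = 0;
  while (true) {
    if (i == window.length) {
      // The view is exhausted before a terminator was seen.
      if (window.length == memory.length) {
        throw ArgumentError('no zero terminator found');
      }
      // Re-wrap the same bytes with a view twice as long (one new wrapper
      // per doubling, rather than one per character).
      window = Uint8List.sublistView(memory, 0, clamp(window.length * 2));
    }
    if (window[i] == 0) break; // bounds were checked above
    i++;
  }
  // Only the first i bytes belong to the string, even if the view is longer.
  return utf8.decode(Uint8List.sublistView(window, 0, i));
}

void main() {
  // "Hello World!\n", a terminator, then unrelated trailing bytes.
  final memory =
      Uint8List.fromList([...utf8.encode('Hello World!\n'), 0, 0x58, 0x59]);
  print(decodeNullTerminated(memory) == 'Hello World!\n');
}
```

The view may end up nearly twice as long as the string, but only the prefix before the terminator is decoded, which is the point made in the comment above.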

string = array.asExternalTypedData(count: count);
}
}
return Utf8Decoder().convert(Uint8List.view(string.buffer, 0, i));
}

static Pointer<Utf8> toUtf8(String s) {
final List<int> units = Utf8Encoder().convert(s);
Review comment:
I believe Utf8Encoder.convert is now typed to return Uint8List. You may want to use that as the type.

You can use const Utf8Encoder() to avoid allocating a new object each time (if you care).
That's the same object returned by utf8.encoder as well, and you could also just write utf8.encode(s) for exactly the same result.
Easier to read.

Contributor Author:
I don't see the new type:

abstract class Encoding extends Codec<String, List<int>> {
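
The `utf8.encode(s)` suggestion from this thread can be sketched as follows. This is a hedged illustration that avoids `dart:ffi`: `toNullTerminatedUtf8` is a hypothetical name, and it returns a Dart-owned `Uint8List` rather than native memory. The result is typed as `List<int>`, which is assumed to compile whether the SDK declares the return type as `List<int>` or `Uint8List`.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical helper, not part of this PR: encodes [s] as UTF-8 and
/// appends a single zero byte, as a C "char*" string would expect.
Uint8List toNullTerminatedUtf8(String s) {
  // utf8.encode(s) is shorthand for utf8.encoder.convert(s).
  final List<int> units = utf8.encode(s);
  // A new Uint8List is zero-filled, so the extra trailing byte is already
  // the terminator.
  final result = Uint8List(units.length + 1);
  result.setAll(0, units);
  return result;
}

void main() {
  final bytes = toNullTerminatedUtf8('hi');
  print(bytes);
  // Checks whether utf8.encoder is the canonical const instance, as the
  // review comment states.
  print(identical(const Utf8Encoder(), utf8.encoder));
}
```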

final Pointer<Uint8> result =
Pointer<Uint8>.allocate(count: units.length + 1);
final Uint8List string =
result.asExternalTypedData(count: units.length + 1);
string.setAll(0, units);
string[units.length] = 0;
return result.cast();
}

String toString() => fromUtf8(addressOf);
}
1 change: 1 addition & 0 deletions pubspec.yaml
@@ -12,3 +12,4 @@ dependencies:

dev_dependencies:
pedantic: ^1.0.0
test: ^1.6.8
14 changes: 14 additions & 0 deletions test/utf8_test.dart
@@ -0,0 +1,14 @@
// Copyright (c) 2019, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

import 'package:test/test.dart';
import 'package:ffi/ffi.dart';

void main() {
test("fromUtf8 . toUtf8 is identity", () {
final String start = "Hello World!\n";
final String end = Utf8.fromUtf8(Utf8.toUtf8(start));
expect(end, equals(start));
});
}