Best practices for multilingual values #5
Riffing off of @pchampin's example in w3c/json-ld-syntax#91 (comment), we might use data indexing to aid access:
{
  "@context": {
    "occupation": { "@id": "ex:occupation", "@type": "rdf:HTML", "@container": "@index" },
    "description": "ex:description"
  },
  "name": "Yagyū Muneyoshi",
  "occupation": {
    "ja": "<span lang=\"en\">Ninja in japanese: <span lang=\"ja\">忍者</span></span>",
    "en": "<span lang=\"en\">Ninja in english: <span lang=\"en\">Ninja</span></span>",
    "cs": "<span lang=\"en\">Ninja in czech: <span lang=\"cs\">Nindža</span></span>"
  }
}
This allows data indexing and consistent use of HTML values. |
But... what would the generated RDF look like? One cannot add a language tag to a typed literal:-( |
It’s not a language tag, it’s a data index, which has no RDF representation. It’s useful for creating structural indexes. |
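To make the difference concrete, here is a minimal Python sketch of which literals each container type would yield. It is a deliberately simplified illustration, not the JSON-LD expansion or RDF conversion algorithm, and `literals_from_map` is a hypothetical helper invented for this example.

```python
# Simplified illustration only: NOT the real JSON-LD processing algorithms.
# `literals_from_map` is a hypothetical helper; it returns
# (lexical form, datatype, language) tuples for a map-valued property.
def literals_from_map(values, container, datatype=None):
    literals = []
    for key, value in values.items():
        if container == "@language":
            # Language map: the key becomes the literal's language tag.
            lang = None if key == "@none" else key
            dt = "rdf:langString" if lang else "xsd:string"
            literals.append((value, dt, lang))
        elif container == "@index":
            # Data index: the key is purely structural and is dropped.
            literals.append((value, datatype or "xsd:string", None))
        else:
            raise ValueError(f"unsupported container: {container}")
    return literals


occupation = {"ja": "忍者", "en": "Ninja"}
as_language_map = literals_from_map(occupation, "@language")
as_index_map = literals_from_map(occupation, "@index", datatype="rdf:HTML")
```

Under these assumptions, the same JSON keys produce language-tagged literals in the first case and plain `rdf:HTML` literals with no trace of `ja` or `en` in the second.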
Oops... well, this is one of those surprise effects that @ajs6f was talking about yesterday: I missed the Yes, it is legal; I do not think it is good practice. |
This issue was discussed in a meeting.
Multilingual Values
Benjamin Young: https://github.com/w3c/json-ld-syntax/issues/105
Benjamin Young: Another easy one ;) … this one is about how JSON-LD currently works, and our past decisions to use HTML for multilingual values (strings with multiple languages) … so use straight up HTML, which is not ideal … Looking at text level semantics HTML, but that’s for the future. … so what do we need to propose in the primer to close the issue? … related - there’s no way to do multi-language language maps
Rob Sanderson: it seems we should split this into a primer issue … eg how do you use language tags … and what do you do with multiple languages … and then have a syntax issue around gkellogg’s issue for the normative specs
Benjamin Young: …about, is it an error to have English and Japanese in a string that is stated to be only one of those
Ivan Herman: What was put there by gregg sounds like a solution, but a bit misleading. The use of language tags gives the wrong impression; should be just indexes
Ivan Herman: Language tags are defined by ISO
Rob Sanderson: "<span lang="en">Ninja in japanese: 忍者"@ja
Rob Sanderson: I agree ivan. to your question, the RDF would look like that:
Rob Sanderson: "Ninja in japanese: 忍者"@ja^^rdf:langString
Rob Sanderson: this has been my issue for 5+ years … language tags must be langString
Ivan Herman: an RDF issue that is not ours to solve … Lots of nice discussions in dbooth’s repo, but it should happen in RDF not here … same as missing base direction … we can only set a single language. And this is the same as base direction, shouldn’t touch it
Rob Sanderson: +1 to ivan
Benjamin Young: RDF is woefully broken in this way, but Gregg’s proposal of HTML + language map would be desirable by JSON developers
Rob Sanderson: https://iiif.io/api/presentation/3.0/#44-html-markup-in-property-values
Benjamin Young: If built to contain HTML, they’re not going to take it into RDF, so a little misuse has advantages
Benjamin Young: our audience is interested in JSON, with a side plate of a graph
Rob Sanderson: I put this link in earlier https://iiif.io/api/presentation/3.0/#44-html-markup-in-property-values … it uses exactly what gkellogg describes … it is common and exactly what people want to be able to do
Ivan Herman: The funny thing is what you wrote is legal but ugly RDF – a microsyntax for a string, which is outside of RDF or JSON-LD … it happens to be a subsyntax of HTML … don’t need anything in the syntax document to do this, it’s a private agreement between parties
Rob Sanderson: +1 to Ivan
Ivan Herman: this is probably the only thing we can do … so no issue in the syntax document … it’s an ugly but best practice given the current technologies
Pierre-Antoine Champin: Going to propose a crazy idea, in the line of what Ivan said. We don’t need to change RDF, we could define a custom datatype. langString is syntactic sugar for a standard datatype for a more ugly microsyntax of the language inside the value … we could define a more complex but similar datatype. That’s the crazy idea :) We could instrument it in RDF, with another container type, so that what gregg proposed would generate the appropriate structure … but it’s quite some work
Ivan Herman: technically … yes … and now I put on the W3C hat, it’s outside of our charter. This would be a RDF datatype.
Pierre-Antoine Champin: What about JSON data type?
Ivan Herman: JSON is closer to our charter. But language isn’t. … it would be a lot of work … the flood gates would be open. Ruby, direction, etc.
Benjamin Young: https://w3c.github.io/string-meta/
Benjamin Young: worth pausing on the JSON data type. I hear the concerns … is there a way around them? This string-meta document from i18n suggests JSON-LD as a solution for multi-language use … feel that there’s an opportunity here … And if we miss it, there’ll be a lot of terrible looking JSON-LD … I see that it evokes process specters, but it comes up a lot … The genie won’t go back into the bottle. So any hope of this?
Ivan Herman: Don’t remember the issue, but got into a long discussion with the editors. The examples are mostly wrong.
Benjamin Young: https://github.com/w3c/string-meta/issues/27
Benjamin Young: also w3c/string-meta#13
Ivan Herman: I understand the problem. Would love for the problem to be solved, but outside our influence
Benjamin Young: oh…and w3c/string-meta#23
Ivan Herman: I don’t see any other proper way, other than having it done at the RDF level.
Benjamin Young: …and another w3c/string-meta#11
Rob Sanderson: The bigger risk is to build on shifting sands and have RDF come up with a different syntax that’s incompatible with whatever we come up with … should instead use it as a way to highlight the need, and potentially a micro-chartered group to solve it for RDF
Benjamin Young: Not ready to recharter, or make a new datatype. Rob proposes to kick it to another group and then an update to JSON-LD. Not a solution, but don’t want to lose the actions … to close the issue we should state what can be done … but need to be clear as to what /should/ be done that’s not confusing
Jeff Mixter: +1 to that
Proposed resolution: highlight the need for work is ongoing, but it should present what can be done today via language/data maps and/or using HTML (or other) micro-syntax for expressing multiple language (Benjamin Young)
Rob Sanderson: +1
Benjamin Young: +1
Jeff Mixter: +1
Ivan Herman: +1
Tim Cole: +1
Pierre-Antoine Champin: +1
Simon Steyskal: +1
Adam Soroka: +1
Resolution #5: highlight the need for work is ongoing, but it should present what can be done today via language/data maps and/or using HTML (or other) micro-syntax for expressing multiple language
Ivan Herman: procedural question - if we close the issue, then I think we will lose it for the bp doc. For the time being we don’t have an editor for the document. So don’t want it lost. … should be raised in the BP repo
Rob Sanderson: +1
Benjamin Young: +1
Ivan Herman: should go through the issues to make sure we don’t lose them
Benjamin Young: Agreed – open editorial issues on BP? … keep these initial discussion in the syntax doc, to not have the comments scattered
Ivan Herman: Wouldn’t close this one
Simon Steyskal: https://github.com/w3c/json-ld-bp/issues
Benjamin Young: not until there’s another issue to write it up
Ivan Herman: editor will write it up as they see best
Benjamin Young: And it’s the top of the hour … thanks for all the input |
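As a rough illustration of the "HTML micro-syntax" option in the resolution, a consumer could recover the per-language runs of a value with the Python standard library's `html.parser`. This is only a sketch under stated assumptions: `LangRunParser` and `language_runs` are hypothetical names, and the parser assumes every start tag has a matching end tag (so void elements such as `<br>` would unbalance it).

```python
from html.parser import HTMLParser


class LangRunParser(HTMLParser):
    """Collect (language, text) runs from HTML that uses `lang` attributes.

    Hypothetical helper; assumes well-nested markup with no void elements.
    """

    def __init__(self, default_lang=None):
        super().__init__()
        self.lang_stack = [default_lang]  # innermost language is last
        self.runs = []

    def handle_starttag(self, tag, attrs):
        # An element without `lang` inherits the language of its parent.
        self.lang_stack.append(dict(attrs).get("lang", self.lang_stack[-1]))

    def handle_endtag(self, tag):
        if len(self.lang_stack) > 1:
            self.lang_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.lang_stack[-1], text))


def language_runs(html, default_lang=None):
    parser = LangRunParser(default_lang)
    parser.feed(html)
    parser.close()
    return parser.runs


runs = language_runs(
    '<span lang="en">Ninja in japanese: <span lang="ja">忍者</span></span>'
)
```

The point of the sketch is that the language information lives entirely inside the string: RDF and JSON-LD carry the value opaquely, and interpreting the spans is the private agreement between producer and consumer that the minutes describe.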
Do we need a short new section on multilingual value issues? |
Possible routes:
Other topics to include:
|
This issue was discussed in a meeting.
Multilingual Patterns
Rob Sanderson: #5
Rob Sanderson: adam had noted that there is some confusion about how to use multilingual data values alongside language maps
Ivan Herman: I think two things are intertwined here … the first is the use of language map, possibly with direction, … the second is the use of HTML literals. … I would prefer to separate them in BP. … gkellogg’s proposal was a hack to use almost the same syntax for two cases, … which is pretty convoluted. It works, but should this be BP?
Gregg Kellogg: in one case, this is a language map; in the other case, this is data indexing.
Ivan Herman: yes, but using language tags for data-indexing is misleading. … It misled me.
Gregg Kellogg: language maps reflect in the RDF abstract syntax; data indexing is lost in the process.
Ivan Herman: the example is convoluted because it uses rdf:HTML, … which I don’t think is very frequent.
Rob Sanderson: should we also discuss @none in this context?
Ivan Herman: yes |
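On the `@none` question, one way a JSON consumer might read a language map is to try the user's preferred languages in order and fall back to the `@none` entry (the value with no language) when nothing matches. The sketch below assumes exact tag matching and single string values; `pick_value` is a hypothetical helper, not part of any JSON-LD library.

```python
def pick_value(language_map, preferred):
    """Return (language, value) for the first preferred language present.

    Falls back to the "@none" entry, reported with language None.
    Returns None when nothing matches. Hypothetical helper: exact tag
    matching only, and each map value is assumed to be a single string.
    """
    for lang in preferred:
        if lang in language_map:
            return lang, language_map[lang]
    if "@none" in language_map:
        return None, language_map["@none"]
    return None


label = {"en": "Ninja", "ja": "忍者", "@none": "ninja"}
best_for_czech_reader = pick_value(label, ["cs", "ja"])
fallback_only = pick_value(label, ["cs"])
```

A production consumer would likely want BCP 47 range matching (for example treating `en-GB` as acceptable for `en`) and support for array values, both omitted here for brevity.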
As evidenced by reports, there is some confusion about how to use multilingual data values alongside language maps. @pchampin noted that using an alias is a good way to work through this, and @BigBlueHat noted (link to minutes forthcoming) that this is the approach taken by Web Annotations. We should offer some examples of this practice, probably in the context of the (long-promised) Primer.