Commit a433a17

Dataset/LSA-T (#88)

* LSA-T JSON and citation
* CDL: adding a comment
* CDL: requested changes for PR

1 parent 9d3c5ab
File tree

3 files changed: +33 -0 lines


src/datasets/LSA-T.json (+15)

@@ -0,0 +1,15 @@
+{
+    "pub": {
+        "name": "LSA-T",
+        "year": 2022,
+        "publication": "dataset:dal2022lsa",
+        "url": "https://midusi.github.io/LSA-T/"
+    },
+    "features": ["video:RGB", "text:es", "pose:AlphaPose"],
+    "language": "Argentina",
+    "#items": null,
+    "#samples": "14,880 sentences",
+    "#signers": 103,
+    "license": "MIT",
+    "licenseUrl": "https://github.com/midusi/LSA-T/blob/main/LICENCE"
+}
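As a quick sanity check, the dataset card added above can be parsed and validated in a few lines of Python. This is a minimal sketch: the keys come from the diff above, the JSON is inlined here in place of the checked-in file at src/datasets/LSA-T.json, and the specific checks are hypothetical, not the repository's actual validation.

```python
import json

# The dataset card from this commit, inlined for illustration;
# in the repository it lives at src/datasets/LSA-T.json.
LSA_T_ENTRY = """
{
    "pub": {
        "name": "LSA-T",
        "year": 2022,
        "publication": "dataset:dal2022lsa",
        "url": "https://midusi.github.io/LSA-T/"
    },
    "features": ["video:RGB", "text:es", "pose:AlphaPose"],
    "language": "Argentina",
    "#items": null,
    "#samples": "14,880 sentences",
    "#signers": 103,
    "license": "MIT",
    "licenseUrl": "https://github.com/midusi/LSA-T/blob/main/LICENCE"
}
"""

entry = json.loads(LSA_T_ENTRY)

# Minimal field checks (hypothetical; the real site build may differ).
assert entry["pub"]["name"] == "LSA-T"
assert entry["pub"]["publication"] == "dataset:dal2022lsa"
assert "pose:AlphaPose" in entry["features"]
assert entry["#signers"] == 103
print("LSA-T card OK")
```

Note that `"#items"` is `null` here: the card records sentence-level sample and signer counts but leaves the item count unspecified.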

src/index.md (+4)

@@ -1028,6 +1028,10 @@ Research papers which do not necessarily contribute new theory or architectures
 
 @dataset:joshi-etal-2023-isltranslate introduce ISLTranslate, a large translation dataset for Indian Sign Language based on publicly available educational videos intended for hard-of-hearing children, which happen to contain both Indian Sign Language and English audio voiceover conveying the same content. They use a speech-to-text model to transcribe the audio content, which they later manually corrected with the help of accompanying books also containing the same content. They also use MediaPipe to extract pose features, and have a certified ISL signer validate a small portion of the sign-text pairs. They provide a baseline based on the architecture proposed in @camgoz2020sign, and provide code.
 
+<!-- TODO: LSA-T aka dataset:dal2022lsa, they use AlphaPose "with the Halpe full-body keypoints format", a visualizer tool, and a baseline SLT model. Especially might be good to mention FiftyOne https://docs.voxel51.com/, "which
+provides useful features such as allowing to filter samples by label, video, playlist,
+or by the confidence score of the signer inference." -->
+
 ###### Bilingual dictionaries {-}
 for signed language [@dataset:mesch2012meaning;@fenlon2015building;@crasborn2016ngt;@dataset:gutierrez2016lse] map a spoken language word or short phrase to a signed language video.
 One notable dictionary, SpreadTheSign\footnote{\url{https://www.spreadthesign.com/}} is a parallel dictionary containing around 25,000 words with up to 42 different spoken-signed language pairs and more than 600,000 videos in total. Unfortunately, while dictionaries may help create lexical rules between languages, they do not demonstrate the grammar or the usage of signs in context.

src/references.bib (+14)

@@ -3443,3 +3443,17 @@ @misc{SiMAX2020SignLanguage
   url = {https://cordis.europa.eu/project/id/778421},
   urldate = {2024-06-18}
 }
+
+@inproceedings{dataset:dal2022lsa,
+  address = {Berlin, Heidelberg},
+  author = {Dal Bianco, Pedro and R\'{\i}os, Gast\'{o}n and Ronchetti, Franco and Quiroga, Facundo and Stanchi, Oscar and Hasperu\'{e}, Waldo and Rosete, Alejandro},
+  booktitle = {Advances in Artificial Intelligence -- IBERAMIA 2022: 17th Ibero-American Conference on AI, Cartagena de Indias, Colombia, November 23--25, 2022, Proceedings},
+  doi = {10.1007/978-3-031-22419-5_25},
+  isbn = {978-3-031-22418-8},
+  numpages = {12},
+  pages = {293--304},
+  publisher = {Springer-Verlag},
+  title = {{LSA-T}: The First Continuous {Argentinian Sign Language} Dataset for Sign Language Translation},
+  url = {https://doi.org/10.1007/978-3-031-22419-5_25},
+  year = {2023}
+}
