Update BosphorusSign #49

cleong110 · 2024-05-25T12:39:09Z

Updating the BosphorusSign details: fixing a broken link and adding features for Kinect v2.

TODO: a second pull request adding BosphorusSign22k, the updated release.

#48 details.

AmitMY · 2024-05-26T09:41:38Z

src/datasets/BosphorusSign.json

-  "#items": 636,
-  "#samples": "24,161 Samples",
+  "#items": 595,
+  "#samples": "22,670 Samples",


BosphorusSign Turkish Sign Language corpus, which consists of 855 sign and p

https://aclanthology.org/L16-1220.pdf

Where is this info from? This is the number for BosphorusSign22k, no?

Table 3 of the 22k paper lists statistics for both datasets.

Edit: https://arxiv.org/pdf/2004.01283 is the link for 22k

I'm frankly not sure where the original figures of 636 lexicon items and 24,161 clips come from, so I went with the info from the updated citation. Presumably if we went through the dataset access process now and specifically asked for BosphorusSign, not BosphorusSign22k, this is what we'd get?

In the original BosphorusSign citation the number given is 855, not 636 or 595. We have:

"The corpus contains 855 signs" in the conclusion section

Table 2 talks about modalities/features

Table 1 talks about other datasets

"We have collected 855 signs and phrase samples..." in the introduction section

"When completed, the corpus will have at least six repetitions of each sign performed by 10 signers, giving a wide variance to the data."

What I presume happened is that between the two papers they decided to trim down the "publicly available" data to 595 signs.

Edit: and of course 855 is listed in Table 3 of the BosphorusSign22k paper as well, as the overall lexicon size rather than the publicly available subset.

It also seems that, for whatever reason, the planned "when completed... 10 signers" expansion did not happen, as the newer citation lists only 6 and has this to say:

"Our dataset is based on the BosphorusSign (Camgoz et al., 2016c) corpus which was collected with the purpose of helping both linguistic and computer science communities. It contains isolated videos of Turkish Sign Language glosses from three different domains: Health, finance and commonly used everyday signs. Videos in this dataset were performed by six native signers, as shown in Figure 1, which makes this dataset valuable for user independent sign language studies."

I interpreted "this dataset" to mean BosphorusSign, which would mean that both BosphorusSign and BosphorusSign22k have the same number of signers, namely 6.

So the question here is whether to go with overall stats, or stats for the "publicly available" subset I suppose.
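To make the gap between the two options concrete, here is a minimal sketch using only the lexicon sizes quoted in this thread (855 overall, 595 in what I presume is the publicly available subset). The variable names are purely illustrative and not part of the repository.

```python
# Lexicon sizes as quoted in this thread; names are illustrative only.
OVERALL_LEXICON = 855  # "The corpus contains 855 signs" (original paper)
PUBLIC_LEXICON = 595   # figure from Table 3 of the 22k paper, presumed public subset

# Signs that appear in the overall corpus but not in the presumed public subset.
withheld = OVERALL_LEXICON - PUBLIC_LEXICON

# Fraction of the overall lexicon covered by the presumed public subset.
share = PUBLIC_LEXICON / OVERALL_LEXICON

print(f"{withheld} signs outside the public subset; {share:.1%} of the lexicon available")
```

So whichever figure goes into the JSON, the two differ by a substantial part of the lexicon, which is why the choice matters.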

I think overall stats are more "correct" to use. Thanks for checking!

Hmmm, then in that case I'm not sure what to put for "number of clips", because Table 3 only has "-" for that. Looking through both papers, here are the candidates:

1257, the figure directly above in the table, from HospiSign. That seems unlikely. This dataset has way more signs, signers, etc.

22670, the figure directly below. But that's the reduced publicly available set.

855 signs * 6 signers/sign * 4 repetitions/signer = 20520?

I think I will just compromise and list it in the JSON with a little note?
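A minimal sketch of what that could look like, assuming the third candidate's arithmetic. The "#items" and "#samples" keys are the ones in the diff above; the wording of the note inside "#samples" is hypothetical, and the 4 repetitions per signer is the assumption from the estimate, not a figure I can point to in a table.

```python
import json

LEXICON_SIZE = 855  # overall lexicon size given in both papers
SIGNERS = 6         # signers listed in the BosphorusSign22k paper
REPETITIONS = 4     # repetitions per signer: an assumption, not a reported figure

# Estimated clip count for the overall corpus, since Table 3 only has "-".
estimated_clips = LEXICON_SIZE * SIGNERS * REPETITIONS

# Hypothetical JSON entry; the note text is illustrative, not the merged wording.
entry = {
    "#items": LEXICON_SIZE,
    "#samples": f"~{estimated_clips:,} Samples (estimated, not reported in the paper)",
}

print(json.dumps(entry, indent=2))
```

Keeping the estimate flagged inside the string avoids presenting 20,520 as a number taken from either paper.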

cleong110 added 2 commits May 25, 2024 08:34

CDL: updating BosphorusSign details

7446e7d

Merge branch 'master' into dataset/BosphorusSign_update

b2253a9

AmitMY requested changes May 26, 2024


cleong110 added 2 commits May 28, 2024 10:27

Merge branch 'master' into dataset/BosphorusSign_update

19be425

CDL: filling out BosphorusSign items, samples, license, contact

f24c113

AmitMY approved these changes May 29, 2024


AmitMY merged commit 969a923 into sign-language-processing:master May 29, 2024
1 check failed

cleong110 mentioned this pull request Jun 4, 2024

Fill out "TODOs" in List of Datasets cleong110/sign-language-processing.github.io#5

Open

19 tasks

cleong110 deleted the dataset/BosphorusSign_update branch June 7, 2024 18:18
