BSCII

Beijing Sentence Corpus II

BSCII dataset [Yan et al., 2025].

The Beijing Sentence Corpus II (BSCII) is a Traditional Chinese sentence corpus of eye-tracking data, based on the original Beijing Sentence Corpus (BSC) in Simplified Chinese. Data was collected from 60 native Traditional Chinese readers. The corpus enables analyses of the effects of word frequency, visual complexity, and predictability on fixation location and duration.

Since the BSCII sentences are nearly identical to those in the BSC, the two corpora together provide a valuable resource for studying cross-script similarities and differences between Simplified and Traditional Chinese.

Eye movements were recorded with an EyeLink 1000 system at 1000 Hz.
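
At a sampling rate of 1000 Hz, each recorded sample covers 1 ms, so sample counts map directly to durations in milliseconds. The following is a minimal sketch of that conversion; the function names and the convention that a fixation spans its onset and offset samples inclusively are assumptions made for illustration and are not part of the dataset definition.

```python
# Sampling rate stated in the dataset description.
SAMPLING_RATE_HZ = 1000


def samples_to_ms(n_samples: int, sampling_rate_hz: int = SAMPLING_RATE_HZ) -> float:
    """Convert a number of recorded samples to a duration in milliseconds."""
    return n_samples * 1000 / sampling_rate_hz


def fixation_duration_ms(onset_sample: int, offset_sample: int) -> float:
    """Duration of a fixation spanning onset..offset samples.

    Assumption (hypothetical convention): both endpoints are included.
    """
    return samples_to_ms(offset_sample - onset_sample + 1)
```

For example, a fixation covering sample indices 100 through 349 contains 250 samples and therefore lasts 250 ms under this convention.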

See the corresponding paper for details [Yan et al., 2025].

DatasetDefinition class implementation: BSCII