In this paper we analyse the way students tag recorded lectures. We compare their tagging strategy and the tags that they create with tagging done by an expert. We look at the quality of the tags students add, and we introduce a method of measuring how similar the tags are, using vector space modelling and cosine similarity. We show that the quality of tagging by students is high enough to be useful. We also show that there is no generic vocabulary gap between the expert and the students. Our study shows no statistically significant correlation between the tag similarity and the indicated interest in the course, the perceived importance of the course, the number of lectures attended, the indicated difficulty of the course, the number of recorded lectures viewed, the indicated ease of finding the needed parts of a recorded lecture, or the number of tags used by the student.
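The tag comparison described above can be illustrated with a minimal sketch. It assumes that each tagger's tags are turned into a simple term-frequency vector after lowercasing; the abstract does not state the weighting or normalisation that was actually used, so the function names `tag_vector` and `cosine_similarity` and the example tags are hypothetical.

```python
from collections import Counter
from math import sqrt


def tag_vector(tags):
    """Map a list of tags to a term-frequency vector (tag -> count).

    Assumption: tags are compared case-insensitively, with surrounding
    whitespace ignored.
    """
    return Counter(t.strip().lower() for t in tags)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse tag vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty tag set is treated as sharing nothing
    return dot / (norm_a * norm_b)


# Hypothetical tags for one recorded lecture.
expert = tag_vector(["recursion", "stack", "base case"])
student = tag_vector(["Recursion", "stack", "example"])
similarity = cosine_similarity(expert, student)
print(round(similarity, 3))
```

Two of the three tags on each side coincide here, so the similarity is 2/3. A value of 1 would mean that the student and the expert used the same tags in the same proportions; 0 would mean no shared tags.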
This paper describes experiments with a game device that was used for early detection of delays in motor skill development in primary school children. Children play a game by bimanual manipulation of the device, which continuously collects accelerometer data and game state data. Features of the data are used to discriminate between normal children and children with delays. This study focused on feature selection. Three features were compared: mean squared jerk (time domain), power spectral entropy (Fourier domain), and a cosine similarity measure (quality of game play). The discriminatory power of the features was tested in an experiment in which 28 children played games at different levels of difficulty. The results show that jerk and cosine similarity have reasonable discriminatory power to detect fine-grained motor skill development delays, especially when the game level is taken into account. A game level needs to last at least 30 seconds to achieve good classification results.
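Two of the features named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it assumes a single-axis, uniformly sampled acceleration signal, estimates jerk with first-order finite differences, and computes the spectral entropy over the positive-frequency bins of a plain discrete Fourier transform after removing the mean. The function names and test signals are hypothetical.

```python
import cmath
from math import log2, pi, sin


def mean_squared_jerk(accel, dt):
    """Mean of the squared jerk, where jerk is the time derivative of
    acceleration, estimated here by first-order finite differences."""
    if len(accel) < 2:
        raise ValueError("need at least two acceleration samples")
    jerks = [(b - a) / dt for a, b in zip(accel, accel[1:])]
    return sum(j * j for j in jerks) / len(jerks)


def power_spectral_entropy(signal):
    """Shannon entropy (in bits) of the normalised power spectrum.

    Assumption: the mean is removed first and only the positive-frequency
    bins 1..n//2 are used. The naive O(n^2) DFT is fine for short signals.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * pi * k * i / n)
                    for i, x in enumerate(centered))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0  # a constant signal carries no spectral power
    probs = [p / total for p in power]
    return -sum(p * log2(p) for p in probs if p > 0)


# Hypothetical signals: one pure tone, and a mix of two equal tones.
n = 64
one_tone = [sin(2 * pi * 4 * i / n) for i in range(n)]
two_tones = [sin(2 * pi * 4 * i / n) + sin(2 * pi * 8 * i / n)
             for i in range(n)]
print(mean_squared_jerk([0.0, 1.0, 2.0, 3.0], dt=1.0))
print(power_spectral_entropy(one_tone), power_spectral_entropy(two_tones))
```

Intuitively, smooth movement gives a low mean squared jerk, and a signal whose power is concentrated in a few frequencies gives a low spectral entropy. In the sketch, the single tone has an entropy close to 0 bits and the mix of two equally strong tones has an entropy close to 1 bit.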
Preprint submitted to Information Processing & Management
Tags are a convenient way to label resources on the web. An interesting question is whether one can determine the semantic meaning of tags in the absence of a predefined formal structure such as a thesaurus. Many authors have used the usage data for tags to find their emergent semantics. Here, we argue that the semantics of tags can be captured by comparing the contexts in which tags appear. We give an approach to operationalizing this idea by defining what we call paradigmatic similarity: computing co-occurrence distributions of tags with tags in the same context, and comparing tags using information-theoretic similarity measures of these distributions, mostly the Jensen-Shannon divergence. In experiments with three different tagged data collections, we study its behavior and compare it to other distance measures. For some tasks, such as terminology mapping or clustering, paradigmatic similarity seems to give better results than similarity measures based on the co-occurrence of the documents or other resources that the tags are associated with. We argue that paradigmatic similarity is superior to other distance measures if agreement on topics (as opposed to style, register, language, etc.) is the most important criterion and the main differences between the tagged elements in the data set correspond to different topics.
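The idea of comparing tags through their co-occurrence distributions can be sketched as follows. This is a minimal illustration under stated assumptions: the "context" of a tag is taken to be the set of other tags on the same item, a tag is not counted as co-occurring with itself, and the Jensen-Shannon divergence is computed with base-2 logarithms so that it lies between 0 and 1. The paper may define contexts and smoothing differently; the function names and data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def cooccurrence_distributions(tagged_items):
    """For each tag, the distribution over the other tags it shares an
    item with (assumption: one item's tag set is one context)."""
    counts = defaultdict(Counter)
    for tags in tagged_items:
        unique = set(tags)
        for tag in unique:
            for other in unique:
                if other != tag:
                    counts[tag][other] += 1
    dists = {}
    for tag, counter in counts.items():
        total = sum(counter.values())
        dists[tag] = {o: c / total for o, c in counter.items()}
    return dists


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two sparse distributions, in bits."""
    js = 0.0
    for key in p.keys() | q.keys():
        p_k, q_k = p.get(key, 0.0), q.get(key, 0.0)
        m_k = 0.5 * (p_k + q_k)  # mixture of the two distributions
        if p_k > 0:
            js += 0.5 * p_k * log2(p_k / m_k)
        if q_k > 0:
            js += 0.5 * q_k * log2(q_k / m_k)
    return js


# Hypothetical tagged items.
items = [
    ["cat", "pet", "animal"],
    ["kitten", "pet", "animal"],
    ["car", "road"],
]
dists = cooccurrence_distributions(items)
close = jensen_shannon_divergence(dists["cat"], dists["kitten"])
far = jensen_shannon_divergence(dists["cat"], dists["car"])
print(close, far)
```

In this toy data, "cat" and "kitten" never appear on the same item, yet they share exactly the same contexts, so their divergence is 0. "cat" and "car" share no context, so their divergence reaches the maximum of 1. A measure based only on direct co-occurrence of the tags themselves would not relate "cat" and "kitten" at all.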