Objective: Acknowledging study limitations in a scientific publication is a crucial element in scientific transparency and progress. However, limitation reporting is often inadequate. Natural language processing (NLP) methods could support automated reporting checks, improving research transparency. In this study, our objective was to develop a dataset and NLP methods to detect and categorize self-acknowledged limitations (e.g., sample size, blinding) reported in randomized controlled trial (RCT) publications. Methods: We created a data model of limitation types in RCT studies and annotated a corpus of 200 full-text RCT publications using this data model. We fine-tuned BERT-based sentence classification models to recognize the limitation sentences and their types. To address the small size of the annotated corpus, we experimented with data augmentation approaches, including Easy Data Augmentation (EDA) and Prompt-Based Data Augmentation (PromDA). We applied the best-performing model to a set of about 12K RCT publications to characterize self-acknowledged limitations at a larger scale. Results: Our data model consists of 15 categories and 24 sub-categories (e.g., Population and its sub-category DiagnosticCriteria). We annotated 1090 instances of limitation types in 952 sentences (4.8 limitation sentences and 5.5 limitation types per article). A fine-tuned PubMedBERT model for limitation sentence classification improved upon our earlier model by about 1.5 absolute percentage points in F1 score (0.821 vs. 0.8) with statistical significance. Our best-performing limitation type classification model, PubMedBERT fine-tuning with PromDA (Output View), achieved an F1 score of 0.7, improving upon the vanilla PubMedBERT model by 2.7 percentage points, with statistical significance. Conclusion: The model could support automated screening tools that journals can use to draw authors’ attention to reporting issues.
Automatic extraction of limitations from RCT publications could benefit peer review and evidence synthesis, and support advanced methods to search and aggregate the evidence from the clinical trial literature.
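As an illustration of the kind of augmentation mentioned above, the following is a minimal sketch of two of the four EDA operations (random swap and random deletion). The abstract does not state which operations or parameters the authors used, so the function names, the `alpha` rate, and the alternating schedule are assumptions made for this example; synonym-based operations and PromDA are not shown.

```python
import random
from typing import List


def random_swap(words: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Swap two randomly chosen positions, n_swaps times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(sentence: str, n_aug: int = 4, alpha: float = 0.1, seed: int = 0) -> List[str]:
    """Return n_aug noisy variants of a sentence (hypothetical settings).

    Even-numbered variants use random swap, odd-numbered ones random deletion.
    """
    rng = random.Random(seed)
    words = sentence.split()
    n_swaps = max(1, int(alpha * len(words)))
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, alpha, rng)
        variants.append(" ".join(new_words))
    return variants
```

Each variant would keep the label of the original limitation sentence, so a small annotated corpus yields more training examples at the cost of some noise.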
An introduction to the Hybrid Publications group of the research project Going Hybrid. The blog post discusses affiliated researchers and organizations, working questions, expert opinions, and previous work done by partners.
Objective: To annotate a corpus of randomized controlled trial (RCT) publications with the checklist items of CONSORT reporting guidelines and to use the corpus to develop text mining methods for RCT appraisal. Methods: We annotated a corpus of 50 RCT articles at the sentence level using 37 fine-grained CONSORT checklist items. A subset (31 articles) was double-annotated and adjudicated, while 19 were annotated by a single annotator and reconciled by another. We calculated inter-annotator agreement at the article and section level using MASI (Measuring Agreement on Set-Valued Items) and at the CONSORT item level using Krippendorff's α. We experimented with two rule-based methods (phrase-based and section header-based) and two supervised learning approaches (support vector machine and BioBERT-based neural network classifiers) for recognizing 17 methodology-related items in the RCT Methods sections. Results: We created CONSORT-TM consisting of 10,709 sentences, 4,845 (45%) of which were annotated with 5,246 labels. A median of 28 CONSORT items (out of a possible 37) were annotated per article. Agreement was moderate at the article and section levels (average MASI: 0.60 and 0.64, respectively). Agreement varied considerably among individual checklist items (Krippendorff's α = 0.06–0.96). The model based on BioBERT performed best overall for recognizing methodology-related items (micro-precision: 0.82, micro-recall: 0.63, micro-F1: 0.71). Combining models using majority vote and label aggregation further improved precision and recall, respectively. Conclusion: Our annotated corpus, CONSORT-TM, contains more fine-grained information than earlier RCT corpora. Low frequency of some CONSORT items made it difficult to train effective text mining models to recognize them. For the items commonly reported, CONSORT-TM can serve as a testbed for text mining methods that assess RCT transparency, rigor, and reliability, and support methods for peer review and authoring assistance.
Minor modifications to the annotation scheme and a larger corpus could facilitate improved text mining models. CONSORT-TM is publicly available at https://github.com/kilicogluh/CONSORT-TM.
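To make the reported metrics concrete, here is a minimal sketch of how MASI agreement between two annotators' label sets and micro-averaged precision, recall, and F1 over multi-label sentences can be computed. It follows the standard definitions (MASI as Jaccard overlap scaled by a monotonicity weight); the function names and the toy label sets are assumptions for illustration and are not taken from the CONSORT-TM code.

```python
from typing import List, Set, Tuple


def masi_similarity(a: Set[str], b: Set[str]) -> float:
    """MASI similarity: Jaccard overlap times a monotonicity weight."""
    if not a and not b:
        return 1.0  # two empty label sets agree trivially
    inter = a & b
    jaccard = len(inter) / len(a | b)
    if a == b:
        monotonicity = 1.0
    elif a <= b or b <= a:
        monotonicity = 2 / 3  # one set contains the other
    elif inter:
        monotonicity = 1 / 3  # partial overlap, neither contains the other
    else:
        monotonicity = 0.0    # disjoint sets
    return jaccard * monotonicity


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1, pooling label counts over sentences."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from_pr(precision, recall)
```

As a consistency check, `f1_from_pr(0.82, 0.63)` rounds to 0.71, which matches the micro-F1 reported above for the BioBERT-based model.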
Program Director MSc Leisure and Tourism Studies, Senior Researcher and Lecturer
Digital transformation has been recognized for its potential to contribute to sustainability goals. It requires companies to develop their Data Analytic Capability (DAC), defined as their ability to collect, manage, and analyze data effectively. Despite governmental efforts to promote digitalization, there seems to be a knowledge gap on how to proceed, with 37% of Dutch SMEs reporting a lack of knowledge and 33% reporting a lack of support in developing DAC. Participants in the interviews that we organized while preparing this proposal indicated a need for guidance on how to develop DAC within their organization given their unique context (e.g. age and experience of the workforce, presence of legacy systems, high daily workload, lack of knowledge of digitalization). While a lot of attention has been given to the technological aspects of DAC, the people, process, and organizational culture aspects are equally important, requiring a comprehensive approach and thus a bundling of knowledge from different fields of expertise. Therefore, the objective of this KIEM proposal is to identify organizational enablers and inhibitors of DAC through a series of interviews and case studies, and to use these to formulate a preliminary roadmap to DAC. From a structural perspective, the objective of the KIEM proposal is to explore and solidify the partnership between Breda University of Applied Sciences (BUas), Avans University of Applied Sciences (Avans), Logistics Community Brabant (LCB), van Berkel Logistics BV, Smink Group BV, and iValueImprovement BV. This partnership will be used to develop the preliminary roadmap and pre-test it using action methodology. The action research protocol and preliminary roadmap developed in this KIEM project will form the basis for a subsequent RAAK proposal.
Background: News publishers are going through hard times. An economic downturn and increased competition in a pluralistic media landscape force publishers to cut costs while at the same time investing in innovation. Further automation of the newsroom is a challenge in this respect. Outside the industry, techniques are emerging that publishers could use for this purpose. These have not yet been 'translated' into user-friendly systems for editorial processes. The project participants are formulating a practice-oriented study for this unexplored territory. Objective: This research aims to answer the question: How can proven and newly developed techniques from the domain of 'natural language processing' contribute to the automation of a newsroom and of the journalistic product? 'Natural language processing' (the automatic generation of language) is the subject of the research. In the field, this development is known as 'automated journalism' or 'robot journalism'. The research focuses on the development of algorithms ('robots') on the one hand and on the impact of these technological developments on the news field on the other. The impact is examined from the perspective of both the journalist and the news consumer. Within this research, the project participants are developing two prototypes that together form the automated journalism system. This system will be used during and after the project by researchers, journalists, lecturers, and students. Intended results: The concrete result of the project is a prototype of an automated editorial system. The project also provides insight into how such systems can be embedded in a newsroom. The research offers a new perspective on how news consumers value the development of 'automated journalism' in the Netherlands.
The project team shares the research results through presentations for the publishing industry, presentations at scientific conferences, publications in academic and trade journals, reflection meetings with fellow degree programs, and a summarizing white paper.
Physical rehabilitation programs revolve around the repetitive execution of exercises, since repetition has been proven to lead to better rehabilitation results. Although beginning the motor (re)learning process early is paramount to obtaining good recovery outcomes, patients do not normally see or experience any short-term improvement, which takes a toll on their motivation. Patients therefore find it difficult to stay engaged in seemingly mundane exercises, not only in terms of adhering to the rehabilitation program, but also in terms of proper execution of the movements. One way in which this motivation problem has been tackled is to employ games in the rehabilitation process. These games are designed to reward patients for performing the exercises correctly or regularly. The rewards can take many forms, for instance providing an experience that is engaging (fun), one that is aesthetically pleasing (appealing visual and aural feedback), or one that employs gamification elements such as points, badges, or achievements. However, even though some of these serious game systems are designed together with physiotherapists and with the patients’ needs in mind, many of them end up not being used consistently during physical rehabilitation past the first few sessions (i.e. the novelty effect). Thus, in this project, we aim to 1) identify, by means of literature reviews, focus groups, and interviews with the involved stakeholders, why this is happening, 2) develop a set of guidelines for the successful deployment of serious games for rehabilitation, and 3) develop an initial implementation process and ideas for potential serious games. In a follow-up application, we intend to build on this knowledge and apply it in the design of a serious game (or set of serious games) for rehabilitation, to be deployed at one of the partners' centers, and to conduct a longitudinal evaluation to measure the success of applying the deployment guidelines.