Objective: To compare estimates of effect and variability from standard linear regression analysis and hierarchical multilevel analysis with those from cross-classified multilevel analysis under various scenarios.
Study design and setting: We performed a simulation study based on a data structure from an observational study in clinical mental health care. We used a Markov chain Monte Carlo approach to simulate 18 scenarios, varying sample sizes, cluster sizes, effect sizes and between-group variances. For each scenario, we performed standard linear regression, multilevel regression with a random intercept at the patient level, multilevel regression with a random intercept at the nursing team level, and cross-classified multilevel analysis.
Results: Applying cross-classified multilevel analysis had negligible influence on the effect estimates. However, ignoring cross-classification led to underestimation of the standard errors of the covariates at the two cross-classified levels and to invalidly narrow confidence intervals. This may lead to incorrect statistical inference. Varying sample size, cluster size, effect size and variance had no meaningful influence on these findings.
Conclusion: For cross-classified data structures, a cross-classified multilevel model helps to estimate the precision of effects validly and thereby supports correct inferences.
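The underestimation of standard errors reported above can be illustrated with a small simulation. The following is a minimal sketch, not the study's own procedure: it uses plain Monte Carlo replication rather than the Markov chain Monte Carlo approach of the paper, fits only a naive simple linear regression, and all parameter values, the team-level covariate, and the function names `simulate_once` and `compare` are assumptions made for illustration.

```python
import random
import statistics


def simulate_once(rng, n_patients=30, n_teams=10, beta=0.5,
                  sd_patient=1.0, sd_team=1.0, sd_resid=1.0):
    """Simulate one cross-classified data set and fit a naive regression.

    Returns (slope estimate, naive standard error of the slope).
    All default values are illustrative assumptions.
    """
    u = [rng.gauss(0, sd_patient) for _ in range(n_patients)]  # patient effects
    v = [rng.gauss(0, sd_team) for _ in range(n_teams)]        # team effects
    x_team = [rng.gauss(0, 1) for _ in range(n_teams)]         # team-level covariate

    xs, ys = [], []
    for p in range(n_patients):
        # Each patient is treated by 3 randomly chosen teams, so patients
        # and teams are crossed rather than nested.
        for t in rng.sample(range(n_teams), 3):
            xs.append(x_team[t])
            ys.append(beta * x_team[t] + u[p] + v[t] + rng.gauss(0, sd_resid))

    # Ordinary least squares for y = a + b * x, ignoring all clustering.
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    naive_se = (rss / (n - 2) / sxx) ** 0.5
    return slope, naive_se


def compare(n_reps=300, seed=1):
    """Compare the true sampling variability with the naive standard error."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(n_reps)]
    slopes = [s for s, _ in results]
    mean_slope = statistics.fmean(slopes)
    empirical_sd = statistics.stdev(slopes)          # actual spread of estimates
    mean_naive_se = statistics.fmean(se for _, se in results)
    return mean_slope, empirical_sd, mean_naive_se


if __name__ == "__main__":
    mean_slope, empirical_sd, mean_naive_se = compare()
    print(f"mean slope estimate: {mean_slope:.3f}")
    print(f"empirical SD of slope: {empirical_sd:.3f}")
    print(f"mean naive SE: {mean_naive_se:.3f}")
```

Under these assumptions the average slope estimate stays close to the simulated effect, while the spread of the estimates across replications exceeds the standard error that the naive regression reports. This matches the pattern described in the results: the effect estimate is barely affected, but the precision is overstated.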
Machine learning models have proven to be reliable methods for classification tasks. However, little research has been conducted on classifying dwelling characteristics from smart meter and weather data. Insight into dwelling characteristics, which comprise the type of heating system used, the number of inhabitants, and the number of solar panels installed, can help in creating or improving policies for building new dwellings to the nearly zero-energy standard. This paper compares several supervised machine learning algorithms, namely Logistic Regression, Support Vector Machine, K-Nearest Neighbor, and Long Short-Term Memory (LSTM), and the methods used to implement them correctly. These methods include data pre-processing, model validation, and evaluation. The smart meter data used to train the algorithms was provided by Groene Mient. The resulting models were compared on their performance. The LSTM performed best, with 96% accuracy. Cross-validation was used to validate the models, with 80% of the data used for training and 20% for testing. Evaluation metrics were used to produce classification reports, which indicate that the LSTM outperforms the other models on these metrics for this specific problem.
Machine learning models have proven to be reliable methods for classification tasks. However, little research has been done on classifying dwelling characteristics from smart meter and weather data. Insight into dwelling characteristics can help in creating or improving policies for building new dwellings to the nearly zero-energy building (NZEB) standard. This paper compares different machine learning algorithms and the methods used to implement the models correctly. These methods include data pre-processing, model validation and evaluation. The smart meter data used to train the algorithms was provided by Groene Mient. The resulting models were compared on their performance. The Recurrent Neural Network (RNN) performed best, with 96% accuracy. Cross-validation was used to validate the models, with 80% of the data used for training and 20% for testing. Evaluation metrics were used to produce classification reports, which indicate which of the models works best for this specific problem. The models were programmed in Python.
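The validation and evaluation steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the data is synthetic, the two features and the labels `gas_boiler` and `heat_pump` are hypothetical, and a small hand-written k-nearest-neighbour classifier stands in for the models compared in the paper. Note that an 80%/20% division as described is a single hold-out split; k-fold cross-validation would repeat such a split over several folds.

```python
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Synthetic stand-in for smart meter features (hypothetical values)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = "heat_pump" if i % 2 else "gas_boiler"
        centre = 3.0 if label == "heat_pump" else 0.0
        features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        rows.append((features, label))
    return rows


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, features, k=5):
    """Predict the majority label among the k nearest training rows."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classification_report(y_true, y_pred):
    """Return overall accuracy and per-class precision and recall."""
    report = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        report[label] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        }
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, report


if __name__ == "__main__":
    train, test = train_test_split(make_toy_data())
    y_true = [label for _, label in test]
    y_pred = [knn_predict(train, features) for features, _ in test]
    accuracy, report = classification_report(y_true, y_pred)
    print(f"accuracy: {accuracy:.2f}")
    for label, scores in report.items():
        print(label, scores)
```

Reporting precision and recall per class alongside accuracy matters when the classes are unbalanced, since accuracy alone can then look high even if one class is rarely predicted correctly.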