Aim: The purpose of this study was to assess the performance of four mortality prediction systems and their ability to predict outcome in our population. Methods: A total of 200 patients who stayed in the intensive care unit for more than 72 hours were included in the study. Burn patients, brain-dead patients, and those younger than 16 years were excluded. APACHE II, SAPS II, and LODS scores were calculated in accordance with the original methodology, using the worst physiologic values in the first 24 hours. TRIOS was calculated in accordance with the original methodology; TRIOS, SAPS II, and LODS were also calculated on the second and third days of intensive care unit stay, using the worst physiologic values of each day. Predicted mortality was calculated using the original regression formulas. The performance of each scoring system was assessed with calibration and discrimination measures. Results: The calculated standardized mortality ratios were 1.30 for APACHE II, 1.22 for SAPS II, 1.07 for LODS, and 1.13 for TRIOS. For all systems, p values calculated using the Hosmer-Lemeshow goodness-of-fit C test were greater than 0.05, indicating a good fit. The areas under the ROC curve were greater than 0.80 for each scoring system. The level of calibration was acceptable for all systems, but they failed to predict the actual mortality accurately. All the systems showed reasonable discrimination; they were able to distinguish between survivors and non-survivors. According to our results, none of these systems showed superiority over the others. Conclusion: We concluded that any of the four systems could be used to predict outcome in our population, provided that the differences between our population and the original populations in which the scoring systems were developed are taken into account.
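The three performance measures named in the Methods and Results (standardized mortality ratio, Hosmer-Lemeshow C statistic, and area under the ROC curve) can be illustrated with a minimal Python sketch. It is not the analysis code used in the study. The function names, the equal-sized grouping rule, and the toy data are assumptions made for illustration. The step that converts the C statistic into a p value (conventionally via a chi-square distribution) is omitted.

```python
from math import fsum


def standardized_mortality_ratio(outcomes, predicted):
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    return sum(outcomes) / fsum(predicted)


def roc_auc(outcomes, predicted):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen non-survivor has a higher
    predicted risk than a randomly chosen survivor; ties count as one half.
    """
    died = [p for y, p in zip(outcomes, predicted) if y == 1]
    lived = [p for y, p in zip(outcomes, predicted) if y == 0]
    wins = sum((d > s) + 0.5 * (d == s) for d in died for s in lived)
    return wins / (len(died) * len(lived))


def hosmer_lemeshow_c(outcomes, predicted, groups=10):
    """Hosmer-Lemeshow C statistic over groups ordered by predicted risk.

    Assumption: patients are split into `groups` roughly equal-sized bins
    after sorting by predicted risk. Each bin contributes
    (O - E)^2 / (E * (1 - E / n)), where O is observed deaths, E is the
    sum of predicted risks, and n is the bin size.
    """
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = fsum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        denom = expected * (1 - expected / len(chunk))
        if denom > 0:  # skip bins where every risk is exactly 0 or 1
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    # Made-up toy data: 1 = died, 0 = survived; risks are hypothetical.
    toy_outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
    toy_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    print("SMR:", standardized_mortality_ratio(toy_outcomes, toy_risks))
    print("AUC:", roc_auc(toy_outcomes, toy_risks))
    print("HL C:", hosmer_lemeshow_c(toy_outcomes, toy_risks, groups=2))
```

Read this way, a standardized mortality ratio above 1 means that more deaths were observed than the system predicted. A small C statistic (large p value) indicates acceptable calibration, and an area under the ROC curve closer to 1 indicates better separation of survivors from non-survivors.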