OBJECTIVE: Clinical scales evaluating arm function after stroke are weak at detecting quality of movement. A new scale, the Motor Evaluation Scale for Upper Extremity in Stroke Patients (MESUPES), was therefore developed, comprising 22 items pertaining to arm and hand performance. The scale was investigated for validity and unidimensionality using the Rasch measurement model, and for inter-rater reliability. SETTING: Twelve hospitals and rehabilitation centres in Belgium, Germany and Switzerland. PATIENTS: There were 396 patients (average age 63.38 ± 12.89 years) in the Rasch study and 56 patients (average age 65.68 ± 12.75 years) in the reliability study. MAIN MEASURES: The scale was examined for its fit to the Rasch model, thereby evaluating its unidimensionality and validity. Differential item functioning analysis was performed to test the stability of the item hierarchy across several variables. Inter-rater reliability was examined with kappa values, weighted percentage agreement and intraclass correlation coefficients (ICC). RESULTS: Based on the Rasch analysis, five items were removed. The MESUPES was divided into two tests: the MESUPES-arm test (8 items) and the MESUPES-hand test (9 items). Both tests fitted the Rasch model. All items were stable across the subgroups of the sample. ICCs for the total score were 0.95 (95% confidence interval (CI) 0.91-0.97) for the arm test and 0.97 (95% CI 0.95-0.98) for the hand test. The scale was also reliable at item level (weighted kappa 0.62-0.79, weighted percentage agreement 85.71-98.21%). CONCLUSION: The MESUPES-arm and MESUPES-hand tests satisfy the statistical requirements for reliability, validity and unidimensionality. Both tests provide a useful clinical and research tool to qualitatively evaluate arm and hand function during recovery after stroke.