Main Test Standards: A. Test coverage and usage B. Appropriate samples for test validation C. Reliability D. Predictive validity E. Content validity
A. Test coverage and usage: You must state what the particular test is intended to measure and provide a detailed description of its usage and assignment. Above all, the test should be appropriate for the target audience, i.e. for the students.
Questions to ask are: 1. What is the test used for? What types of test need to be created? Have foreseeable inappropriate applications been identified? 2. Who is the target audience? For whom is the particular test designed?
B. Appropriate samples for test validation. The samples used for test validation must be of adequate size to establish appropriate norms, and to support conclusions regarding the use of the instrument for the intended purpose.
Questions to ask are:
1. How were the samples used in pilot testing and validation chosen? How is this sample related to the population of students? Were participation rates appropriate? Can you draw meaningful comparisons between your students and these students? 2. Was the number of test participants sufficient to produce stable estimates with minimal variation due to sampling error? Where statements are made concerning subgroups, is the number of test-takers in each subgroup adequate? 3. Do the difficulty levels of the test and criterion measures (if any) provide an adequate basis for validating the instrument? Is there sufficient variation in test scores? 4. Is the test made according to the publishing norms?
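To illustrate question 2, here is a minimal Python sketch of how sampling error shrinks as the number of test participants grows. It uses the standard error of the mean as one possible measure of stability. The function name and the pilot scores are hypothetical and serve only as an illustration, not as a prescribed procedure.

```python
import math
import statistics


def standard_error_of_mean(scores):
    """Return the standard error of the mean: sample SD / sqrt(n)."""
    if len(scores) < 2:
        raise ValueError("at least two scores are needed")
    return statistics.stdev(scores) / math.sqrt(len(scores))


# Hypothetical pilot scores (assumed data, for illustration only).
small_sample = [52, 61, 70, 79]
# The same score pattern with four times as many test-takers.
large_sample = small_sample * 4

se_small = standard_error_of_mean(small_sample)
se_large = standard_error_of_mean(large_sample)

print(f"n={len(small_sample)}: standard error = {se_small:.2f}")
print(f"n={len(large_sample)}: standard error = {se_large:.2f}")
```

With the same spread of scores, the larger sample gives a noticeably smaller standard error, which is why subgroup statements based on only a handful of test-takers deserve caution.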
C. Reliability
The test must accurately reflect the ability being tested. You should also determine the point beyond which emotional factors begin to influence test-taker behavior and test results.
Different types of reliability estimates should be used to estimate the contributions of different sources of measurement error. Inter-rater reliability coefficients provide estimates of errors due to inconsistencies in judgment between raters.
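As one example of an inter-rater reliability coefficient, the sketch below computes Cohen's kappa for two raters who each assign a category to the same set of answers. Kappa is only one of several possible coefficients and is chosen here as an assumption, since the text does not name a specific statistic. The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)

    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten answers by two raters (assumed data).
rater_a = ["pass", "pass", "fail", "fail", "pass",
           "fail", "pass", "pass", "fail", "pass"]
rater_b = ["pass", "pass", "fail", "pass", "pass",
           "fail", "pass", "fail", "fail", "pass"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"Cohen's kappa = {kappa:.2f}")
```

Here the raters agree on 8 of 10 answers, but because some agreement would occur by chance alone, kappa is lower than the raw agreement rate of 0.8.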
Questions to ask are: 1. Have appropriate types of reliability estimates been calculated? Have appropriate statistics been used to compute these estimates? 2. What are the reliabilities of the test for different target audience groups? How were they calculated? 3. Is the reliability sufficiently high to warrant the use of the test as a basis for making decisions concerning the particular students?