Miller Analogies Test Report (MAT)

Psychological Assessment I, LAP 501

Dr. Karen Jaffe

Spring 1997

Presentation

Developed by W. S. Miller at the University of Minnesota in 1926, the MAT is a mental-ability test used to differentiate between high-ability applicants (Ivens). One hundred multiple-choice items are presented in the form of analogies with 4 possible responses (Ivens). Subject matter is classified into 9 categories: language usage, math, physical science, history, biological sciences, social sciences, literature-philosophy, fine arts, and general information (Ivens). It purports to measure adequacy of educational background and problem-solving ability (Frary).

A.    Describe Standardization sample.

       The norms are limited: they lack specificity, especially in disciplines such as the natural sciences or humanities; doctoral and master's candidates were combined; and the norms relate only to applicants who elected to take the MAT, not to all those anticipating graduate study (Frary). 

New norming was based on all domestic examinees who took the test for the first time in 1991-1992 (Frary). Categories were then developed according to projected major (Frary). Scores are presented with percentile ranks in 5-point intervals (Frary).

Compared to the 1981 edition, the 1994 manual contains considerably more discussion of score interpretation, much of which is appropriately cautionary: there is no fixed cut-off point for admission, a variety of evidence should be used to establish a candidate's adequacy, the standard error of measurement should be considered, and the norms were based on self-selected examinees (Frary).

Summary statistics on the 1991-1992 examinees could not be ascertained from the data the authors provided, making one wonder whether group-by-form analyses of gender and ethnicity were done, which would have been useful (Frary).

In regard to the 1991-1992 norm-establishment procedure (as already described by Frary), Ivens says these reference groups are not technically norms. Although the publisher cautions users that the percentiles may not be representative of all graduate applicants, similar cautions are NOT clearly stated in the materials for examinees (Ivens).

In the Candidate Information Booklet of 1991, examinees are encouraged to eliminate options before guessing and to guess even if baffled (Frary). However, in the actual test instructions they're told, “If you are not sure of an answer mark the response you think is correct” (Frary).

The Guide for Controlled Testing Centers and Directions for Examiners offer clear, prescriptive guidance on seating arrangements and test forms, leaving nothing to chance and ensuring a level playing field (Ivens).

B.    What is the Reliability? Is it test/retest, split half or other?

       Frary says reliability is good, but notes that previous reviewers found problems with the norming of scores and with the methodology for equating the test's forms (Frary). Seven forms of the test allow for retesting candidates (Frary). Alternate-form reliability is utilized.

A new section of the 1994 technical manual contains data concerning score changes on retaking the test (Frary). A substantial portion of retestees are retaking because they received a lower score the first time, and this motivation may account for the 5-6 point average increase in retake scores (Frary). This increase remained consistent across the time between initial and subsequent testing, presumably with a different form of the test (Frary). The correlation between the 2 scores, a sort of hybrid test-retest/parallel-form reliability coefficient, was only 0.73, which is much lower than the KR-20s of over 0.9 reported on various forms of the MAT (Frary). The discrepancy no doubt reflects variation in the time between testings as well as content differences between forms (Frary). The effect of coaching was not mentioned (Frary). In both the 1981 and the 1994 technical manuals, Form V21 appears to be the test to take because it appears to be much easier than the others. One should expect better from the publishers (Frary).
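The KR-20 coefficients cited above are internal-consistency reliability estimates computed from dichotomously scored (right/wrong) items. A minimal sketch of the computation follows, using a tiny hypothetical data set — nothing here comes from actual MAT data:

```python
# KR-20 (Kuder-Richardson Formula 20): internal-consistency reliability
# for dichotomously scored (0/1) items. The data set below is invented
# purely for illustration.

def kr20(responses):
    """responses: one list of 0/1 item scores per examinee."""
    n = len(responses)        # number of examinees
    k = len(responses[0])     # number of items
    totals = [sum(r) for r in responses]
    mean = sum(totals) / n
    # population variance of total scores
    var_total = sum((t - mean) ** 2 for t in totals) / n
    pq_sum = 0.0
    for i in range(k):
        p = sum(r[i] for r in responses) / n  # proportion correct on item i
        pq_sum += p * (1 - p)                 # item-level variance p*q
    return (k / (k - 1)) * (1 - pq_sum / var_total)

# Hypothetical data: 5 examinees, 4 items
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(kr20(data), 3))  # 0.8
```

Values near 1.0, like the 0.9+ figures the manual reports, indicate that the items hang together well within a single form — which, as Frary notes, says nothing about comparability across forms.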

Ivens criticizes a publication entitled Miller Analogies Test Technical Manual—A Guide to Interpretation, which is supposed to help university deans, faculty, and administrators with their decision-making (Ivens). The publisher says the various test forms are similar and thus comparable; however, comparable does NOT mean equivalent, and the publisher is remiss in not providing empirical evidence (Ivens). Surprisingly, the manual does not address racial, ethnic, or gender bias except through uninterpreted tables (Ivens).

C.    What is the Validity? If there is none, what explanation is given?

Frary concluded that the MAT's content is well constructed, but notes that other reviewers found the validity evidence presented in the technical manual inadequate (Frary).

The 1981 technical manual provided over 40 correlations between MAT scores and graduate GPAs (Frary). This was criticized for failure to specify the circumstances underlying each coefficient (homogeneity of the groups, difficulty of courses, etc.) (Frary). The new technical manual simply aggregates data collected in 1992 from over 50 graduate departments that had at least 8 students with MAT scores, undergraduate GPAs, and first-year graduate GPAs (Frary). The reader is left in the dark about the number and characteristics of the universities involved, the subject areas of the departments, the total number of students involved, and the time period represented by the data (Frary). The correlation between MAT scores and first-year graduate GPA is only 0.23 (Frary). The multiple correlation using undergraduate GPA and MAT scores to predict first-year graduate GPA is 0.37, whereas the correlation between undergraduate and first-year graduate GPAs is 0.29 (Frary).

The technical manual contains a warning that the data may not be representative of all graduate departments and points to likely restriction of range in both predictor and criterion due to selection and graduate-school grading practices (Frary). In other words, accepting the validity of the MAT for informing admissions decisions is more or less an act of faith (Frary).

Given the MAT's low predictive power, it is likely that its continued use will be justified as a handy means of checking whether someone has the mental capacities judged adequate for the subject matter at hand (Frary). One could quickly eliminate extremely low scorers and spend less time on the qualifications of candidates with high scores (Frary). Purists would denounce this, but if used conservatively it is very unlikely to cause inappropriate or unfair admissions decisions (Frary).

The concept of validity is discussed, but evidence of content validity is not presented (Ivens). One correlation matrix based on aggregated data from over 50 graduate schools/departments is presented as evidence of predictive validity (Ivens). The correlations (already stated by Frary) mean that the resulting incremental validity of the MAT over undergraduate GPA is 0.08 (Ivens). Grade inflation, restriction of range, and inconsistencies in grading practices within and across schools and departments make GPA a less-than-ideal criterion variable to predict (Ivens). The MAT would have been better served had the publisher disaggregated the data and displayed predictive validity coefficients separately by school/department (Ivens). Ivens warns that the onus of confirming the test's assumptions rests with each school/department that requires it (Ivens).
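The incremental validity figure Ivens cites is simply the gain in the multiple correlation when MAT scores are added to undergraduate GPA as a predictor. A quick sketch using the coefficients reported in the technical manual (per Frary and Ivens) — this reuses the published summary numbers, not any raw data:

```python
# Incremental validity of the MAT, computed from the coefficients
# the reviewers quote from the technical manual.

r_multiple = 0.37  # UGPA + MAT jointly predicting first-year grad GPA
r_ugpa     = 0.29  # UGPA alone
r_mat      = 0.23  # MAT alone

incremental_validity = r_multiple - r_ugpa
print(round(incremental_validity, 2))  # 0.08

# The gain in variance accounted for (R^2) is even less impressive:
print(round(r_multiple**2 - r_ugpa**2, 3))  # 0.053
```

In other words, adding the MAT to undergraduate GPA explains only about 5% more of the variance in first-year graduate grades, which underlines why both reviewers consider the predictive case weak.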

The publisher says, “An applicant to graduate school will typically have been exposed to much, if not all, of the information necessary to complete each analogy” (manual p. 4) (Ivens). However, as many examinees will attest, exposure to the necessary information is not a sufficient condition for successful completion (Ivens). The cognitive complexity of the items is due in large part to the nature of the relationships among the word pairs (Ivens). The 50-minute time limit encourages people to select the first plausible answer, which, if not chosen carefully, will be wrong (Ivens).

The Candidate Information Booklet says that fluency with the English language, broad knowledge of the topics (listed above), and the ability to reason out relationships may contribute to one's results (Ivens). Although a minor point, the auxiliary verb “may” is most curious, because if the aforementioned factors don't help with performance, what does (Ivens)?

D.    Give Revision information.

       The MAT was first published in 1926 and underwent revisions up until 1992. The technical manual was completely revised and published in 1994, with little change in the problems discussed above except for some improved norming (Frary).

E.    Who may Administer this test and under what circumstances?

       The test's distribution is restricted, and it is administered to groups in licensed test centers (Frary). The test can be scored on site to facilitate rapid results and decision making.

Frary, Robert B. Buros, pp. 617-619.

Ivens, Stephen H. Buros, pp. 619-620.

F.    Your Thoughts on this test.

       I immediately picked up on the test's lack of face validity.

Despite the fact that I had already done very well in actual graduate-school courses, I failed the test dreadfully. My real-life experience with the MAT validates the assessment of both reviewers. It is not a worthwhile tool for predicting graduate-school performance.

Published by

trishandersonlcpc@yahoo.com

