Thursday, March 31, 2005

One article about MAT

I found this one in the ProQuest database.

Multidimensional adaptive testing using the weighted likelihood estimation.
by Tseng, Fen-Lan, Ph.D., University of Pittsburgh, 2000, 156 pages; AAT 9998631
Advisor: Hsu, Tse-chi
School: University of Pittsburgh
School Location: United States -- Pennsylvania
Index terms(keywords): Multidimensional adaptive testing, Weighted likelihood estimation, Item response theory, Estimation
Source: DAI-A 61/12, p. 4746, Jun 2001
Source type: DISSERTATION
Subjects: Educational evaluation, Educational psychology, Psychological tests
Publication Number: AAT 9998631
ISBN: 0493077332
Document URL: http://proquest.umi.com/pqdweb?did=727870381&sid=18&Fmt=2&clientId=43390&RQT=309&VName=PQD
ProQuest document ID: 727870381

Abstract (Document Summary)

This study extended Warm's (1989) weighted likelihood estimation (WLE) to a multidimensional computerized adaptive test (MCAT) setting. WLE was compared with the maximum likelihood estimation (MLE), expected a posteriori (EAP), and maximum a posteriori (MAP) using a three-dimensional 3PL IRT model under a variety of computerized adaptive testing conditions. The dependent variables included bias, standard error of ability estimates (SE), square root of mean square error (RMSE), and test information. The independent variables were ability estimation methods, intercorrelation levels between dimensions, multidimensional structures, and ability combinations. Simulation results were presented in terms of descriptive statistics, such as figures and tables. In addition, inferential procedures were used to analyze bias by conceptualizing this Monte Carlo study as a statistical sampling experiment.

The results of this study indicate that WLE and the other three estimation methods yield significantly more accurate ability estimates under an approximate simple test structure with one dominant dimension and several secondary dimensions. All four estimation methods, especially WLE, yield very large SEs when a three equally dominant multidimensional structure was employed. Consistent with previous findings based on unidimensional IRT model, MLE and WLE are less biased in the extreme of the ability scale; MLE and WLE yield larger SEs than the Bayesian methods; test information-based SEs underestimate actual SEs for both MLE and WLE in MCAT situations, especially at shorter test lengths; WLE reduced the bias of MLE under the approximate simple structure; test information-based SEs underestimates the actual SEs of MLE and WLE estimators in the MCAT conditions, similar to the findings of Warm (1989) in the unidimensional case.

The results from the MCAT simulations did show some advantages of WLE in reducing the bias of MLE under the approximate simple structure with a fixed test length of 50 items, which was consistent with the previous research findings based on different unidimensional models. It is clear from the current results that all four methods perform very poorly when the multidimensional structures with multiple dominant factors were employed. More research efforts are urged to investigate systematically how different multidimensional structures affect the accuracy and reliability of ability estimation. Based on the simulated results in this study, there is no significant effect found on the ability estimation from the intercorrelation between dimensions.

Thursday, March 24, 2005

Simulation in CAT research

Two simulation approaches are common in CAT research: 1) real-data (post hoc) simulation; 2) Monte Carlo simulation.

Real-data or post hoc simulation is used when CAT is to be used to reduce the length of a test that has been administered conventionally. The "item bank" used in this case would be all the items in a conventional test. The objective of applying CAT procedures is to determine how much reduction in test length can be achieved by "re-administering" the items adaptively, without significant changes in the psychometric properties of the test scores. The data are the item responses of a group of examinees on the conventional test that is being analyzed.
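A minimal sketch of a post hoc simulation for one examinee may help here. Everything specific in it is an assumption made for illustration and is not taken from any source: the 2PL model, the N(0,1) prior, EAP scoring on a fixed grid, maximum-information item selection, and the hypothetical `BANK` and `RECORDED` values. The key point is that responses are looked up from the recorded data rather than generated.

```python
import math

# Hypothetical calibrated parameters (a, b) for a 10-item conventional test.
BANK = [(1.2, -1.5), (0.9, -1.0), (1.5, -0.5), (1.1, -0.2), (1.8, 0.0),
        (1.0, 0.3), (1.4, 0.6), (0.8, 1.0), (1.6, 1.3), (1.1, 1.8)]
# Hypothetical recorded 0/1 responses of one examinee to those 10 items.
RECORDED = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

GRID = [-4.0 + 0.1 * k for k in range(81)]  # theta grid for EAP scoring


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap(bank, answered):
    """EAP ability estimate with a N(0,1) prior; answered maps item index -> 0/1."""
    num = den = 0.0
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized prior weight
        for i, u in answered.items():
            p = p_2pl(t, *bank[i])
            w *= p if u == 1 else 1.0 - p
        num += t * w
        den += w
    return num / den


def post_hoc_cat(bank, recorded, test_length, entry_theta=0.0):
    """Re-administer recorded items adaptively; needs test_length <= len(bank)."""
    answered = {}
    est = entry_theta

    def info(i):
        # 2PL Fisher information of item i at the current estimate.
        a, b = bank[i]
        p = p_2pl(est, a, b)
        return a * a * p * (1.0 - p)

    for _ in range(test_length):
        remaining = [i for i in range(len(bank)) if i not in answered]
        best = max(remaining, key=info)
        answered[best] = recorded[best]  # look up the real response
        est = eap(bank, answered)
    return est, sorted(answered)
```

Comparing `post_hoc_cat(BANK, RECORDED, 4)` with the full-length estimate `eap(BANK, dict(enumerate(RECORDED)))` across a group of examinees is one way to see how much the test can be shortened before the scores change noticeably.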

Monte Carlo simulation can be used to evaluate the potential performance of various approaches to CAT with various populations, and to evaluate the potential performance of CAT using real or hypothetical item banks. This approach is typically used before a live CAT program is implemented, to evaluate the performance of a calibrated item bank and to determine operational CAT parameters, such as appropriate entry theta values, termination criteria, and item exposure controls.

Monte Carlo simulation differs from real-data simulation in the following characteristics:

  1. "Examinees" (generally referred to as "simulees") are generated by the simulation process to have specified distributions of theta.
  2. Item parameters can either be generated to have specified values and distributions, or item parameter estimates from real items can be used as the CAT item bank.
  3. Item responses of the simulees are generated from an appropriate IRT model.
  4. CATs are then administered using prespecified CAT algorithm values to answer a specific research question or set of research questions.
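The four characteristics above can be sketched as a small Monte Carlo study. This is only an illustration under stated assumptions, not a reference implementation: the unidimensional 3PL model, the parameter distributions, the N(0,1) simulee distribution, EAP scoring on a grid, maximum-information selection, a fixed test length as the termination criterion, and all function names are hypothetical choices.

```python
import math
import random

GRID = [-4.0 + 0.1 * k for k in range(81)]  # theta grid for EAP scoring


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b, c):
    """Fisher information of a 3PL item at theta."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def eap(responses):
    """EAP estimate with a N(0,1) prior; responses is a list of ((a, b, c), u)."""
    num = den = 0.0
    for t in GRID:
        w = math.exp(-0.5 * t * t)
        for (a, b, c), u in responses:
            p = p_3pl(t, a, b, c)
            w *= p if u == 1 else 1.0 - p
        num += t * w
        den += w
    return num / den


def simulate_cat(true_theta, bank, test_length, rng, entry_theta=0.0):
    """Step 4: administer one fixed-length CAT; needs test_length <= len(bank)."""
    est = entry_theta
    available = list(range(len(bank)))
    responses = []
    for _ in range(test_length):
        best = max(available, key=lambda i: item_information(est, *bank[i]))
        available.remove(best)
        # Step 3: generate the simulee's response from the IRT model.
        u = 1 if rng.random() < p_3pl(true_theta, *bank[best]) else 0
        responses.append((bank[best], u))
        est = eap(responses)
    return est


def run_study(n_simulees=100, bank_size=100, test_length=15, seed=1):
    rng = random.Random(seed)
    # Step 2: generate item parameters (a, b, c) from assumed distributions.
    bank = [(rng.uniform(0.8, 2.0), rng.gauss(0.0, 1.0), rng.uniform(0.1, 0.25))
            for _ in range(bank_size)]
    # Step 1: generate simulees with a specified theta distribution.
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_simulees)]
    errors = [simulate_cat(t, bank, test_length, rng) - t for t in thetas]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse
```

Calling `run_study()` with different `test_length` or `entry_theta` values is the kind of comparison used to choose operational settings before a live program. Because true theta is known for every simulee, bias and RMSE can be computed directly, which is not possible with real examinees.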

Saturday, March 05, 2005

Excellent Labs of Learning

I will search for excellent labs, especially their interesting and meaningful projects, and put them here for frequent reference. I hope you enjoy them and add more.

Today, it is NASA's Automated Learning Group (http://ic-www.arc.nasa.gov/ic/projects/bayes-group/index.html)
"The goal of the Automated Learning Group (previously known as the Bayesian Model-Based Learning or Bayes Group) is to develop the basic theory and associated algorithms for various kinds of general data analysis techniques. The Automated Learning Group is a research and development effort within the Computational Sciences Division at NASA Ames Research Center."

Friday, March 04, 2005

Bayesian Networks-based CAT

Bayesian networks can be used in various fields to help people make decisions based on observed evidence and a priori information. In educational testing, a student model (SM) can be built on Bayesian network theory and used to measure students' latent abilities, unidimensional as well as multidimensional, from their responses to designed items.
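As a rough illustration, the simplest such student model is a network with one discrete latent ability node whose children are the item nodes. The sketch below assumes that structure; the three ability levels, the conditional probability values in `CPT`, and the entropy-based item selection rule are all hypothetical and chosen only to show the mechanics.

```python
import math

# Hypothetical prior over a single discrete latent ability node.
PRIOR = {"low": 1 / 3, "medium": 1 / 3, "high": 1 / 3}

# Hypothetical conditional probability tables: P(correct | ability level).
CPT = {
    "easy": {"low": 0.5, "medium": 0.9, "high": 0.95},
    "mid": {"low": 0.2, "medium": 0.5, "high": 0.8},
    "hard": {"low": 0.05, "medium": 0.2, "high": 0.6},
}


def update(belief, cpt_item, u):
    """Posterior over ability after observing response u (1 correct, 0 wrong)."""
    post = {lvl: belief[lvl] * (cpt_item[lvl] if u == 1 else 1.0 - cpt_item[lvl])
            for lvl in belief}
    total = sum(post.values())
    return {lvl: v / total for lvl, v in post.items()}


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)


def expected_entropy(belief, cpt_item):
    """Expected posterior entropy if this item were administered next."""
    p_correct = sum(belief[lvl] * cpt_item[lvl] for lvl in belief)
    return (p_correct * entropy(update(belief, cpt_item, 1))
            + (1.0 - p_correct) * entropy(update(belief, cpt_item, 0)))


def next_item(belief, cpt, used):
    """Pick the unused item expected to reduce uncertainty the most."""
    candidates = [i for i in cpt if i not in used]
    return min(candidates, key=lambda i: expected_entropy(belief, cpt[i]))
```

An adaptive test built this way alternates `next_item` and `update` until the belief is concentrated enough on one level. A multidimensional student model would add more latent nodes and let each item depend on one or more of them, which this single-node sketch does not cover.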