Computerized Adaptive Testing: Dimensionality?!

Monday, February 07, 2005

Dimensionality?!

I don't know when we began talking about unidimensionality in measurement. I guess the direct cause is that we simply thought one dimension was easier to process. Over time, we noticed multidimensionality hidden in applications. But how can we determine which of the two a given problem belongs to? For example, a mathematics exam written in English will be unidimensional for English-speaking examinees, but will have at least two dimensions for Chinese students. Here, I prefer the view of Mark Reckase and Terry Ackerman, who stressed that dimensionality lies in the interaction between the test structure (given by the item response functions, i.e., IRFs) and the latent ability structure (given by the distribution of latent ability in the examinee population).
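
To make the mathematics-exam example concrete, here is a minimal sketch that assumes a compensatory two-dimensional logistic IRF. The function name, the parameters (a_math, a_english, d) and the ability values are illustrative assumptions, not something taken from a real exam. The same item is applied to two populations: one where English ability is effectively constant, and one where it varies.

```python
import math

def p_correct(theta_math, theta_english, a_math=1.0, a_english=1.0, d=0.0):
    """Compensatory two-dimensional logistic IRF (illustrative parameters)."""
    z = a_math * theta_math + a_english * theta_english + d
    return 1.0 / (1.0 + math.exp(-z))

# Population A (assumed): everyone reads English fluently, so the English
# ability is the same constant for all; only math ability separates examinees.
fluent = [p_correct(m, 3.0) for m in (-1.0, 0.0, 1.0)]

# Population B (assumed): examinees share the same math ability, but their
# English ability varies, so a second dimension drives the responses.
varied = [p_correct(0.0, e) for e in (-1.0, 0.0, 1.0)]

print("fluent readers, math varies:", [round(p, 3) for p in fluent])
print("same math, English varies:  ", [round(p, 3) for p in varied])
```

The item itself never changes; whether the second dimension shows up in the data depends on whether the population varies along it.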

7 Comments:

At 6:48 PM, Shunkai Fu said...: Assessing dimensionality is the fundamental question when we step from the simple world to the complex one. Distinguishing one enemy from another is a confusing process, but we have to do it even if we hate them all, since both of them are enemies.
At 2:47 PM, Anonymous said...: lg, Ip always supports you
At 3:18 AM, Anonymous said...: Hi Shunkai Fu,

I am a student doing a project based on CAT. Could you please give us some idea of how to decide on item banks and how item banks are maintained?
At 7:28 AM, Shunkai Fu said...: Hi, thanks for visiting my blog.

For a research project on CAT, we need at least the following components: 1) a calibrated item bank; 2) a selection rule; 3) a stopping rule.

How you choose them depends on your choice of model. My teammates use Bayesian networks, and I use Bayesian decision theory. All of us use non-IRT models.

As for the item bank you mentioned, it comes from traditional testing. You can borrow data from a traditional paper-based test. For example, I got data from the French department, since intensive testing in language teaching gives them more data. Note that the data you get is just a 2D table: each row is one examinee's responses (1 for right, 0 for wrong), and each column is an item. So you only know a population's response vectors for a list of items. Your calibration will give you information about the population and the items.
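
As a rough illustration of that 2D table, the sketch below computes two classical summaries from a made-up response matrix: the proportion of correct answers per item and the raw score per examinee. This is plain classical test theory, used here only as a stand-in; it is not the Bayesian calibration mentioned above, and the data and function names are hypothetical.

```python
# Hypothetical toy data: one row per examinee, one column per item
# (1 = right, 0 = wrong).
responses = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]

def item_p_values(matrix):
    """Proportion of examinees who answered each item correctly."""
    n_examinees = len(matrix)
    n_items = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_examinees for j in range(n_items)]

def raw_scores(matrix):
    """Number of right answers per examinee (1 credit per right response)."""
    return [sum(row) for row in matrix]

p_values = item_p_values(responses)   # information about the items
scores = raw_scores(responses)        # information about the population

print("item proportion correct:", p_values)
print("examinee raw scores:", scores)
```

A low proportion correct suggests a hard item, so even this simple summary gives a first ordering of the bank by difficulty.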

Once you have a calibrated item bank, you can simulate examinees on your CAT model. You present an item, wait for the response (in fact it is already available: 0 or 1), and select a new item based on whether the response was right or wrong. It is an iterative procedure.
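
The iterative procedure can be sketched as follows. This is a deliberately simple toy, not the Bayesian decision theory or Bayesian network models mentioned above. It assumes item difficulties on a 0 to 1 scale, picks the unused item closest to the current estimate (the selection rule), moves the estimate up or down with a shrinking step, and stops after a fixed number of items (the stopping rule). The function name run_cat and all numbers are hypothetical.

```python
def run_cat(difficulties, recorded, max_items=3, start=0.5):
    """Replay one examinee's stored 0/1 responses through a toy adaptive loop.

    difficulties: difficulty of each item on a 0..1 scale (assumed calibration).
    recorded: the examinee's stored response to each item (1 right, 0 wrong).
    """
    estimate, step = start, 0.25
    remaining = set(range(len(difficulties)))
    administered = []
    # Stopping rule: fixed test length (or the bank runs out).
    while remaining and len(administered) < max_items:
        # Selection rule: unused item whose difficulty is closest to the
        # current estimate; ties go to the lower item index.
        item = min(remaining, key=lambda j: (abs(difficulties[j] - estimate), j))
        remaining.remove(item)
        answer = recorded[item]          # the response is already available
        administered.append((item, answer))
        # Move up after a right answer, down after a wrong one.
        estimate += step if answer == 1 else -step
        step /= 2
    return estimate, administered

difficulties = [0.2, 0.4, 0.5, 0.6, 0.8]   # hypothetical calibrated bank
recorded = [1, 1, 1, 0, 0]                 # hypothetical stored responses
final_estimate, path = run_cat(difficulties, recorded)
print(final_estimate, path)
```

Swapping in a different model only changes the selection line and the update lines; the loop itself stays the same.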

You can compare your CAT performance with the paper-based test results, assuming the scoring systems are the same (for example, 1 credit for each right response).
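
One simple way to make that comparison, assuming both tests report scores on the same scale, is the Pearson correlation between the two score lists. The scores below are invented purely for illustration, and correlation is only one possible agreement measure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for the same five examinees.
paper_scores = [12, 25, 31, 18, 40]   # full paper-based test
cat_scores = [14, 24, 33, 17, 38]     # scores projected from the CAT run

r = pearson(paper_scores, cat_scores)
print("correlation between paper and CAT scores:", round(r, 3))
```

A value close to 1 means the CAT ranks examinees much like the full test does, ideally while using fewer items.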

I don't know if this is clear enough. If not, please leave your question, or contact me at: shukai.fu@polymtl.ca
At 7:32 AM, Shunkai Fu said...: Sorry, I forgot about item bank maintenance.

As we all know, one section of the GRE is not scored. Its purpose is to get items calibrated for free (hate ETS?). Each item in a CAT bank needs to be calibrated in advance, since we need the corresponding parameters for selection. During your research work, just forget about that. In fact, from my point of view, the ideal item bank would be updated periodically, and its difficulty distribution should follow a bell shape.
At 9:01 AM, Anonymous said...: Hi Shunkai Fu,

Thanks a lot for your reply.
Actually, we are only doing the item bank creation for CAT.

We have decided to maintain a pool of calibrated items, from which items can be selected and placed in the item bank (we want a hierarchical bank structure). Is it OK to do it this way, or is there a better way of maintaining the items?

My question is whether the items should be stored directly in the bank or in the pool.
Thank you once again.
At 2:00 PM, Shunkai Fu said...: Hi, I think a hierarchical bank structure is a reasonable approach. As we know, ETS updates its item bank regularly (or periodically) to prevent item exposure, and I believe they have corresponding rules for doing that.

I don't know if your hierarchical structure is based on chronological order, difficulty level, or some other index. I also don't know whether you care about the item exposure problem. All of these depend on the purpose of your CAT.
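
As one possible reading of the pool-and-bank idea, the sketch below keeps every calibrated item in a pool, groups the pool by difficulty level, and builds the active bank from the least-exposed items at each level. The field names (id, level, exposures), the function build_bank and the data are all hypothetical. It is an illustration of the structure, not a description of how ETS or any real program does it.

```python
from collections import defaultdict

def build_bank(pool, per_level):
    """Pick the `per_level` least-exposed items from each difficulty level.

    pool: list of dicts with hypothetical keys 'id', 'level', 'exposures'.
    Returns a dict mapping each level to the chosen item ids.
    """
    by_level = defaultdict(list)
    for item in pool:
        by_level[item["level"]].append(item)
    bank = {}
    for level, items in by_level.items():
        # Least-exposed first; item id breaks ties so the result is stable.
        ranked = sorted(items, key=lambda it: (it["exposures"], it["id"]))
        bank[level] = [it["id"] for it in ranked[:per_level]]
    return bank

# Hypothetical pool of calibrated items.
pool = [
    {"id": "q1", "level": "easy", "exposures": 40},
    {"id": "q2", "level": "easy", "exposures": 5},
    {"id": "q3", "level": "easy", "exposures": 12},
    {"id": "q4", "level": "hard", "exposures": 30},
    {"id": "q5", "level": "hard", "exposures": 2},
]

bank = build_bank(pool, per_level=2)
print(bank)
```

Rebuilding the bank periodically from the pool rotates heavily used items out, which is one simple way to limit item exposure.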
