Performance Assessment: Literacy Assessment (Research Report for the Seminar on Instructional Assessment)

Advisor: Professor 余民寧
Presenter: 謝智如

Outline
I. Literacy assessment: related issues and applications
II. Journal article report: Assessment literacy and student learning: the case for explicitly developing students 'assessment literacy'

I. Literacy assessment: related issues and applications

What is literacy?
Literacy is traditionally understood as the ability to read and write. The term's meaning has been expanded to include the ability to use language, numbers, images and other means to understand and use the dominant symbol systems of a culture. The concept of literacy is expanding in OECD countries to include skills to access knowledge through technology and the ability to assess complex contexts.

What is literacy? (UNESCO definition)
The United Nations Educational, Scientific and Cultural Organization (UNESCO) defines literacy as the "ability to identify, understand, interpret, create, communicate and compute, using printed and written materials associated with varying contexts. Literacy involves a continuum of learning in enabling individuals to achieve their goals, to develop their knowledge and potential, and to participate fully in their community and wider society".

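Literacy versus competence
There is no consensus on exactly what literacy or competence refers to. Literacy, however, includes having basic subject knowledge and being able to apply it to solve problems in life and at work. Beyond ability, it also covers the habits and attitudes of using subject knowledge to solve problems. Literacy therefore has two levels, ability and attitude, and places more emphasis on integrated knowledge and on the habits and attitudes needed to solve real problems.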

The many new literacies
Reading literacy
Scientific literacy
Mathematical literacy
Information literacy and media literacy
Environmental literacy
Health literacy
Value literacy
Multicultural literacy

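Literacy assessment
Given the definition of literacy, literacy assessment falls within the scope of performance assessment.
Performance assessment emphasizes actual performance. The teacher judges (or scores) the effectiveness of the student's process, the quality of the finished product, or both, to determine the student's level of achievement in that area. One example is concept application: using learned concepts and knowledge to solve practical problems met in daily life.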

Literacy assessments
IEA: Trends in International Mathematics and Science Study (TIMSS)
OECD: Programme for International Student Assessment (PISA), which assesses literacy in reading, mathematics, and science
IEA: Progress in International Reading Literacy Study (PIRLS)

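PISA
PISA is conducted every three years. Its purpose is to provide cross-national comparisons of how well 15-year-old students have acquired knowledge and skills for life, together with analyses of the effectiveness of each country's education system, and in this way to define what national literacy comprises.
Items are designed around applied contexts and are not limited to curriculum content. Students must understand the information given, draw flexibly on their ability to integrate, evaluate, and reflect, and construct their own answers to the problem situation. The focus is on whether students can use the knowledge and skills they have acquired to meet real-world challenges, not merely on mastery of the school curriculum. The assessment covers literacy in three domains: reading, mathematics, and science. This renewed concept of literacy is tied to the idea of lifelong learning and centres on the key knowledge and skills needed in adult life, drawing on both formal and informal settings such as the regular curriculum, extracurricular clubs, the home environment, and school climate.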

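PISA assessment cycle by domain
2000: reading as the major domain; science and mathematics as minor domains
2003: mathematics as the major domain; reading and science as minor domains
2006: science as the major domain; reading and mathematics as minor domains
2009: reading again as the major domain; science and mathematics as minor domains
2012: mathematics as the major domain; reading and science as minor domains, plus an additional online assessment of problem solving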
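The rotation in the cycles listed above can be written as a small lookup. This is only a sketch: the function names are hypothetical, and the rule is assumed to hold only for the five cycles named on this slide (2000 to 2012), not for any later cycle.

```python
# Domains in the order they served as the major domain, starting with PISA 2000.
MAJOR_DOMAINS = ("reading", "mathematics", "science")


def pisa_major_domain(year: int) -> str:
    """Return the major domain for one of the cycles listed on the slide."""
    if year < 2000 or year > 2012 or (year - 2000) % 3 != 0:
        raise ValueError(f"{year} is not one of the cycles listed on the slide")
    # One cycle every three years; the major domain repeats every three cycles.
    return MAJOR_DOMAINS[((year - 2000) // 3) % 3]


def pisa_minor_domains(year: int) -> tuple:
    """Return the two domains that are not the major domain in that cycle."""
    major = pisa_major_domain(year)
    return tuple(d for d in MAJOR_DOMAINS if d != major)


if __name__ == "__main__":
    for cycle in range(2000, 2013, 3):
        print(cycle, pisa_major_domain(cycle), pisa_minor_domains(cycle))
```

The additional problem-solving assessment in 2012 is not modelled here, since it sits outside the three rotating domains.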

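PISA 2006 scientific literacy assessment
PISA 2006 describes scientific literacy through four dimensions: context, competencies, knowledge, and attitudes. It mainly tests students' ability to apply scientific knowledge. Physics, chemistry, biology, and earth science are built into the items, which ask students to acquire new scientific knowledge, explain scientific phenomena, use evidence to interpret science-related issues, and consider the relationship between science and technology.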

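Assessing attitudes: attitudinal items
Most PISA 2006 science units include a new attitudinal item that links the unit to students' attitudes towards the science issue involved. There are two main forms: one gauges students' interest in learning science, and the other surveys their support for (agreement with) the science in question. These items appear in grey boxes. Students simply tick the option that reflects their own view. The questions have no correct answer and do not count towards the student's total test score. They ask students to indicate how far they agree with statements about a particular issue, and for each statement students tick the one answer that best represents their opinion.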

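Assessing attitudes
PISA 2006 collected attitude data through two channels: a separate attitude questionnaire, and questions embedded in the competency and knowledge items, which gather and integrate attitude data while students answer. These contextualized questions let PISA 2006 learn about students' attitudes towards specific science tasks. Unlike results from earlier general questions about attitudes, this approach makes it possible to examine (1) whether students' attitudes change with the context, and (2) whether students' attitudinal responses relate more to individual questions or to groups of questions.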

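PIRLS reading literacy assessment
Reading is at the core of education. It lets us acquire knowledge through text and symbols, and almost all subject knowledge in school is learned through reading. In theory, third grade is the critical period in reading development. Before it, children are building the basic ability of "learning to read", which includes word recognition and a basic grasp of text genres and comprehension. After third grade, they "read to learn". PIRLS therefore takes fourth-grade pupils as its main target population and examines whether they have the basic reading abilities and are moving on to the stage of acquiring new knowledge through reading.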

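PIRLS reading literacy assessment
The Progress in International Reading Literacy Study (PIRLS) sheds light on the reading ability of Taiwan's students.
How does our students' reading ability compare with that of students in other countries? Have reading scores improved? Do our fourth graders value reading? Do they enjoy reading? Do our students have homes that foster literacy development? How do our schools plan reading instruction? How do our teachers' teaching practices compare with those in other countries?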

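How PIRLS defines "reading literacy"
1. The ability to understand and use written language
2. The ability to construct meaning from a wide variety of texts
3. The ability to learn new things through reading
4. Taking part in communities of readers in school and everyday life
5. Reading for enjoyment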

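Content of the reading literacy assessment
PIRLS instruments: passages of between 1,200 and 1,600 characters each that pupils must read, together with the questions they answer.
Five questionnaires, covering students, parents, schools, teachers, and the curriculum, were designed to give a thorough picture of the environmental factors that affect reading.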

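Content of the reading literacy assessment
1. Students' ability to comprehend informational and narrative texts. Comprehension is assessed for both the content and the form of the text. An example of a form question: "How do the numbered boxes help the reader understand the text? Write one way."
2. Students' reading behaviour and interests.
3. The reading environment parents provide for students, for example how often parents and children read together and the books in the home.
4. The instruction and library arrangements that schools and teachers design for reading.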

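PIRLS levels of reading comprehension
(Process / Content / Description / Strategies)

Direct processes

Retrieval: the reader locates information that is explicitly stated in the text.
1. Find information relevant to the reading goal
2. Find specific ideas
3. Search for definitions of words or phrases
4. Identify the setting of a story (for example time and place)
5. Find the topic sentence or main idea when the text states it explicitly

Straightforward inference: the reader connects two or more pieces of information in the text.
1. Infer that one event caused another
2. Conclude the main point after a series of arguments
3. Identify the relationship between a pronoun and its referent
4. Summarize the main idea of the text
5. Describe the relationships between characters

Interpretive processes

Interpret and integrate: the reader draws on prior knowledge to connect information that the text does not state explicitly.
1. Discern the overall message or theme of the text
2. Consider alternative actions the characters could have taken
3. Compare and contrast information in the text
4. Infer the mood or atmosphere of a story
5. Interpret how information in the text applies to the real world

Examine and evaluate: the reader considers the information in the text critically.
1. Evaluate how likely it is that the events described could really happen
2. Describe how the author devised a surprise ending
3. Judge the completeness of the information in the text
4. Identify the author's point of view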

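PIRLS assessment example: An Unbelievable Night
Students read the passage and answer the questions.

PIRLS scoring example: An Unbelievable Night
Scoring: points are awarded according to descriptions of performance levels drawn up in advance.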

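Performance of Taiwanese students in PIRLS 2006
In 2006, students from 46 countries took part. The international average was 500 points, and Taiwanese students averaged 535 points, ranking 22nd.
Students scored markedly better on the direct processes (541, pass rate 73%) than on the interpretive processes (530, pass rate 49%). In terms of reading depth, most of our children remain at the literal or surface level of the text.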

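Analysing the full set of relationships from the results
The questionnaires also collect basic demographic data and examine how reading conditions and environments (students' reading attitudes, the reading environment provided by parents, teachers' reading instruction, schools' reading policies, and the overall reading curriculum) relate to students' reading achievement. From this detailed survey, participating countries can see the state of reading education in each country and use it to improve or adjust their own reading policy, instruction, and curriculum. When a country takes part in PIRLS repeatedly and accumulates several years of assessment data, it can also see trends in how students' reading ability changes as instruction or policy changes.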

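Taiwan: the Ministry of Education's Project Office for Enhancing National Literacy (教育部提升國民素養專案辦公室)
The office asks how literate the students produced by 12-year basic education are.
Five literacies: language, mathematics, science, digital, and cultivation and aesthetic literacy.
Teachers' professional literacy: a green-glazed leather-pouch ewer is used as a symbol of literacy, the ability a person can carry away after being shaped like clay.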

II. Journal article report: Assessment literacy and student learning: the case for explicitly developing students 'assessment literacy'

Introduction
Nicol (2009) identified an important academic issue for first-year students: "How to enable students to feel part of their programmes' academic culture while encouraging them to take responsibility for their own learning."

To become self-regulated learners, students need to be able to judge their work, identify its merits, locate its weaknesses and determine ways to improve it.
Note: judgement here includes assessing whether one's response to the assessment task is appropriate and does what was asked.

It also requires them to judge how good their response is in relation to the relevant academic achievement standards (Sadler 2009).
Students' understanding of the purposes of assessment and the processes surrounding assessment is part of the context within which they learn to make those judgements and become effectively self-regulating.

Francis (2008) argues, however, that first-year students in particular are likely to over-rate their understanding of the assessment process and that there is a disjuncture between what they think they are being assessed on and what the marking criteria and achievement standards require of them.

By helping to clarify the meaning of learning goals and criteria, and through the provision of feedback, formative assessment encourages students to keep realigning their work to what is required. Nicol consequently applied a framework based on task structure, learner-regulation and an associated set of assessment principles to inform the redesign of formative assessment in two first-year courses.

He drew on Gibbs and Simpson's (2004) 11 assessment conditions and Nicol and Macfarlane-Dick's (2006) 7 principles of good feedback practice. For students to have a sense of control over their own learning, formative assessment practices must help them develop the skills needed to monitor, judge and manage their learning (Nicol 2009, 338).

What is "assessment literacy"?
The literature here tends to suggest that students' capacity to become successful self-regulated learners can be affected by various aspects of the assessment process.

1. Students need to understand the purpose of assessment and how it connects with their learning trajectory.

2. They need to be aware of the processes of assessment and how they might affect students' capacity to submit responses that are on-task, on-time and completed with appropriate academic integrity.

3. Opportunities for them to practise judging their own responses to assessment tasks need to be provided so that students can learn to identify what is good about their work and what could be improved.

We therefore conceptualised students' capacity to develop these aspects of assessment as assessment literacy, and defined this as students' understanding of the rules surrounding assessment in their course context, their use of assessment tasks to monitor or further their learning, and their ability to work with the guidelines on standards in their context to produce work of a predictable standard.

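Previous related research
A few earlier questionnaires examined students' experience of assessment, but none focused on the concept of assessment literacy. Through its literature review, this study offers some information on the gap between the two.
The work of O'Donovan, Price and Rust (2004) and Rust, Price and O'Donovan (2003) on improving students' understanding of assessment comes close to the concept of assessment literacy.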

O'Donovan, Price and Rust (2004) found that explicit statements of assessment criteria alone do not help students understand how assessors perceive work or what the assessment expects of them.
The authors have been developing a growing body of work on criterion-based assessment methods including the development and use of assessment rubrics (grids), grade descriptors and benchmark statements. The aim is for students to grasp accurately what the task means and what the teacher expects.

Knowledge has both explicit and tacit dimensions, and learners need to construct that knowledge from experience for themselves for it to have meaning for them. This holds for knowledge of assessment standards as much as for subject knowledge.

Their approach was to aim to develop, through structured activities, students' knowledge of how assessment responses would be marked, and in turn their understanding of how their own responses would be judged.

The intervention was based on the notion that once the students started making judgements about the quality of the work in front of them they could apply that evaluative way of thinking to their own work to help them self-monitor it during its production and identify ways to improve its quality.

A 90-minute marking workshop
They were given two exemplar pieces of work that they had to mark and provide feedback for. The assignments were similar in nature and format to the next piece of assessment that the participating students were about to commence for their own coursework, but covered different topics with different instructions.

During the workshop students discussed their marking and rationales in small groups before reporting to the whole class the marks they awarded and their justifications.

At this point, the lecturer led a discussion of the students' rationales and related them to the application of the marking criteria.

The small student groups then had a chance to reconsider their marks and rationale, and finally, the lecturer provided the whole class his/her annotated assignment exemplars showing the feedback, mark and rationale.

Students who participated in the workshop showed significant improvement in subsequent assessment pieces compared with students who did not participate. The authors found this across three years of repeated studies.

Their research shows that relying only on the explicit expression of assessment criteria, standards, and processes as a method of transferring knowledge about assessment does not work.

The provision of explicit criteria and summarised standards descriptors needs to be complemented by opportunities for students and staff to share the experience of judging the quality of responses in order to build tacit knowledge into the students' repertoire, improve their assessment literacy, and hence, their assessment outcomes.

Purpose of the present study
In the present study, we aimed to test this assertion by quantifying the impact of developing students' assessment literacy on their assessment literacy levels and on their learning outcomes.

We set a stringent high-risk testing scenario in which the assessment literacy-developing intervention was much briefer (50 minutes) than those used in the O'Donovan, Price and Rust (2004; Rust, Price, and O'Donovan 2003) studies.

We first developed a questionnaire to operationalise some key concepts in the assessment literacy arena; we then implemented these measures in a pre- and post-test framework, such that, between the measurement episodes, the students in the experimental cohort were exposed to an assessment literacy-building intervention.

A control group in the same programme of study, but at a different campus location, received only the pre-test instrument and no intervention. This paper reports on the results of this intervention.

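Method
Participants
The sample comprised 369 students in Queensland, Australia, drawn from two campuses (Campus A and Campus B). All were volunteers studying business (56% females, 44% males, with a mean age of 19.1 years). After excluding students who gave no ID number or did not complete the pre- and post-tests, 349 valid cases remained for the subsequent analyses.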

Materials
Assessment literacy
Assessment literacy was measured with the Assessment Literacy Survey developed by the authors of the study. It has 30 items on a five-point Likert scale, ranging from 1 = 'strongly disagree' to 5 = 'strongly agree', and covers:

1. Students' understanding of the local protocols and performance standards (6 items) (e.g. 'I understand the criteria against which my work will be assessed');

2. Students' use of assessment tasks for enhancing or monitoring their learning, including assessment for learning (6 items) (e.g. 'I use assessment to figure out what is important to learn') and assessment for grading (4 items) (e.g. 'I think the University makes me do assessment to: produce work that can be judged for the University's marking and grading purposes');

3. Students' orientation to putting into the production of assessable work the minimum amount of effort necessary merely to pass the course requirements (6 items) (e.g. 'My aim is to pass the course with as little work as possible');

4. Students' ability to judge their own and others' responses to assessment tasks (8 items) (e.g. 'I feel confident that I could judge my peer's work accurately using my knowledge of the criteria and achievement standards provided').
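As an illustration of how a 30-item, five-point survey like this might be turned into scale scores, here is a minimal sketch. It assumes each scale score is the mean of its items and that items are stored in the order listed above. Both are assumptions for illustration: the slides give only the item counts, and the scale label `AG` (assessment for grading) is a hypothetical name.

```python
from statistics import mean

# Hypothetical item layout: the slides give only the number of items per scale,
# so the column positions below are assumed for illustration.
SCALES = {
    "AU": range(0, 6),     # understanding of protocols and standards (6 items)
    "AL": range(6, 12),    # assessment for learning (6 items)
    "AG": range(12, 16),   # assessment for grading (4 items), hypothetical label
    "MEO": range(16, 22),  # minimum effort orientation (6 items)
    "AJ": range(22, 30),   # judging own and others' responses (8 items)
}


def score_survey(responses):
    """Return the mean item score per scale for one respondent."""
    if len(responses) != 30:
        raise ValueError("expected 30 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be on the 1-5 Likert scale")
    return {name: mean(responses[i] for i in items) for name, items in SCALES.items()}


if __name__ == "__main__":
    # A made-up respondent, not data from the study.
    example = [4] * 6 + [5] * 6 + [3] * 4 + [2] * 6 + [4] * 8
    print(score_survey(example))
```

Using the mean rather than the sum keeps every scale on the same 1 to 5 range even though the scales have different numbers of items.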

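Assessment performance
Course assessment comprised a quiz, a report, and a final exam.
The multiple-choice quiz served as the measure of pre-intervention academic ability.
The main dependent variable was student achievement (mark/grade) on the assessment task tied to the intervention, which took the form of a report. The report was marked against an assessment rubric (grid) with a performance description for each criterion at each performance standard.
The final exam tested participants' key knowledge from lectures, tutorials, and the textbook. It included multiple-choice questions, written responses, and short-answer questions based on case scenarios.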

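Procedure
The study used a quasi-experimental design. The experimental group (Campus A) completed the pre- and post-tests and received the intervention to build assessment literacy. The control group (Campus B) completed only the pre-test.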

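Phase 1: Assessment rubric phase
Participants were told how to use the assessment rubric when completing the task. The rubric explains the assessment criteria and their content.
Assessors base their judgement on how the response matches the assessment criteria, and calibrate it against the academic standards briefly described in the instructions.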

Phase 2: Pre-test assessment literacy survey phase
The pre-test measure allowed comparison between the two campuses to detect any initial group differences in assessment literacy and establish baseline levels. An initial factor analysis was also run on the pre-test data.

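Phase 3: Intervention phase (Campus A only), lasting about 45 minutes
1. "Think, pair, and share" exercise: participants considered, judged, and practised marking two exemplars (real examples of student work) to decide their quality: excellent, good, satisfactory, or poor. The aim was to help participants learn to judge a piece of work, identify the criteria they used in judging, and use those criteria to distinguish different achievement standards.

(i) Participants made practice judgments on the exemplars ('think'), then explained and justified their judgments to the person next to them ('pair and share'). (ii) Randomly selected pairs shared their decisions with the whole class. (iii) Out of this conversation emerged a list of criteria expressed by the participants in their own language.

2. To bring out the differences between the exemplars, participants explained the marks they had given and voted by a show of hands on where within the standards their judgement fell. Finally, the course convenor explained the mark he or she had chosen and why.
3. Participants referred to the assessment rubric and stated which academic standard their judgement corresponded to.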

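Phase 4: Post-test assessment literacy survey phase
The experimental group completed the assessment literacy survey again as a post-test, to evaluate the change in assessment literacy levels.

Phase 5: Assessment outcome phase
Three weeks later, participants completed a 1,500-word report involving literature analysis (part of the course's assessment), using the same rubric as before. Subject tutors at both campuses marked the reports on a criterion-referenced basis, and each participant's report mark served as the dependent (outcome) variable in the subsequent statistical analyses.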

Results
Preliminary analyses
Factor analysis was undertaken to confirm the underlying structure of the survey items. The four factors represented the following constructs, respectively: Assessment Literacy (Understanding) (AU); Assessment for Learning (AL); Minimum Effort Orientation (MEO); and Assessment Literacy (Judgment) (AJ). The factor loading matrix for this final solution is presented in Table 1 (pre-test and post-test responses). As evidenced by the following table, the four-factor model was found to be reliable across both the pre- and post-tests.

As anticipated, the MEO scale scores were negatively correlated with AL (r = -.26, p < .001), AU (r = -.30, p < .001) and AJ (r = -.27, p < .001). The assessment for learning scale (AL) was positively correlated with AU (r = .42, p < .001) and AJ (r = .33, p < .001). The other two assessment literacy scales, understanding (AU) and judgement (AJ), were also positively correlated (r = .50, p < .001). This is evidence of appropriate and predicted patterns of convergent and discriminant validity (Campbell and Fiske 1959).

Assessment of group differences (pre-test)
Campus A participants scored significantly lower assessment for learning scores (AL_T1 M = 3.71) than Campus B participants (M = 3.92, t(334) = 3.02, p = .003, d = .33). These results point towards a general trend of slightly poorer baseline motivation and assessment literacy levels for Campus A (intervention) students, especially in terms of using assessment for learning. Consideration must be paid to these baseline differences in any later comparison of average report marks between campus groups, in addition to determining the intervention's impact on assessment literacy levels and any associated improvement in report mark for Campus A. All means, standard deviations and effect sizes are shown in Table 2.
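The baseline comparison above reports an independent-samples t statistic and Cohen's d. The sketch below shows how the two are related when a pooled standard deviation is used. It is an illustration only: the function names are hypothetical, the sample values are made up, and the slides do not state the group sizes or which formula the authors used for d.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def cohens_d(a, b):
    """Standardised mean difference (b minus a) using the pooled SD."""
    return (mean(b) - mean(a)) / pooled_sd(a, b)


def independent_t(a, b):
    """Student's t statistic (equal variances assumed) and its degrees of freedom."""
    na, nb = len(a), len(b)
    standard_error = pooled_sd(a, b) * sqrt(1 / na + 1 / nb)
    return (mean(b) - mean(a)) / standard_error, na + nb - 2


def d_from_t(t, na, nb):
    """Recover pooled-SD Cohen's d from a reported t and the two group sizes."""
    return t * sqrt(1 / na + 1 / nb)


if __name__ == "__main__":
    # Made-up scores for illustration, not data from the study.
    campus_a = [3, 4, 5]
    campus_b = [5, 6, 7]
    print(cohens_d(campus_a, campus_b), independent_t(campus_a, campus_b))
    # Assumption: two equal groups of 168, which would give df = 334.
    print(round(d_from_t(3.02, 168, 168), 2))
```

Under the unverified assumption of two equal groups of 168, the reported t(334) = 3.02 corresponds to d of roughly 0.33, which is in line with the value quoted above.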

Intervention impact upon assessment literacy
Paired sample t tests were conducted to investigate the impact of the intervention on participants' effort, use of assessment and assessment literacy levels. The intervention was effective in producing positive and significant change across the three assessment literacy factors: AU t(153) = , p = .00, d = .73, AJ t(157) = 6.51, p = .00, d = .52, and AL t(155) = 6.03, p = .00, d = .39. These effect sizes reveal the intervention resulted in medium to large changes in the three assessment literacy levels (see Table 3).

Considering the brevity of the intervention, the magnitude of this impact may therefore be considered a hefty return on a small investment. No significant change occurred in the attitudinal measure of MEO, t(157) = 1.67, p = .10, d = .08. All means, standard deviations and effect sizes are shown in Table 3 as follows.
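A paired-samples t test of the kind described above works on each participant's post-test minus pre-test difference. The following is a minimal sketch with made-up scores. The effect size shown is the mean difference divided by the standard deviation of the differences, which is only one common definition for paired designs and is assumed here; the paper may have computed d differently.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, degrees of freedom, effect size) for paired pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same participants")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    sd_diff = stdev(diffs)
    t = mean(diffs) / (sd_diff / sqrt(n))
    # Assumed effect size: mean difference over the SD of the differences.
    effect = mean(diffs) / sd_diff
    return t, n - 1, effect


if __name__ == "__main__":
    # Made-up scale scores for four participants, not data from the study.
    pre_scores = [3, 3, 4, 2]
    post_scores = [4, 5, 5, 3]
    print(paired_t(pre_scores, post_scores))
```

Because each participant serves as their own control, the degrees of freedom are the number of pairs minus one, which is why the reported tests have slightly different df for each scale.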

Impact of assessment literacy on assessment results
Correlational analysis and a regression model were developed to examine Campus A (intervention) students' report mark as a function of changes in their motivation or assessment literacy levels due to the intervention.

No significant relationship was found between changes in MEO and report mark. This is most likely due to the intervention not leading to any significant change in MEO levels (refer to Table 3). Table 4 describes outcomes from the correlational analysis.

As the change in MEO for Campus A was not significantly correlated with report mark, this variable was not entered into the multiple regression model which comprised the change in scores for: AL_Ch, AU_Ch and AJ_Ch, along with the report mark.

Thus, of the three assessment literacy factors, improving students' ability to judge the standards of their own and others' work appears to be the most critical to enhanced learning outcomes. Unstandardised (B) and standardised (β) regression coefficients, and squared part correlations for each predictor in the model are shown in Table 5.

Unique contribution of improved assessment literacy
To determine whether the improved assessment literacy levels predicted report marks over and above participants' post-test motivation (MEO_T2) and pre-existing academic ability (pre-intervention quiz), a hierarchical multiple regression analysis was conducted for the intervention group.

This finding is practically significant given the brevity of the intervention; it seems that from an intervention of just 50 minutes duration, designed to develop students' judgement abilities, their marks on a related task can be significantly increased. A summary of the hierarchical regression for predicting report mark is shown in Table 6.
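The idea behind a hierarchical multiple regression can be sketched as follows: fit the control variables first, then add the predictors of interest and look at the change in R squared. This is a pure-Python illustration with made-up numbers. The function names are hypothetical, and it does not reproduce the study's model, data, or significance tests.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def r_squared(predictors, outcome):
    """R squared of an ordinary least squares fit with an intercept."""
    rows = [[1.0] + list(r) for r in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in outcome)
    return 1 - ss_res / ss_tot


def hierarchical_r2(step1, step2, outcome):
    """Return (R2 of step 1, R2 of steps 1+2, change in R2)."""
    r2_controls = r_squared(step1, outcome)
    r2_full = r_squared([list(a) + list(b) for a, b in zip(step1, step2)], outcome)
    return r2_controls, r2_full, r2_full - r2_controls


if __name__ == "__main__":
    # Made-up values: one control variable, one added predictor, one outcome.
    controls = [[1], [2], [3], [4], [5]]
    added = [[1], [0], [2], [1], [0]]
    marks = [5, 4, 12, 11, 10]
    print(hierarchical_r2(controls, added, marks))
```

The change in R squared is the share of variance in the outcome that the added predictors explain beyond the controls, which is the quantity a finding of "over and above" refers to.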

Between group differences in report mark as a function of assessment literacy
for the pre- and post-test data. For both groups' pre-test data, the three assessment literacy factors (AU_T1, AJ_T1, AL_T1) did not significantly correlate with report mark. As hypothesised, these improved assessment literacy factors of AU_T2, AJ_T2, AL_T2 as measured at post-test, were more strongly (and significantly) related to report mark, such that higher assessment literacy levels were related to higher report marks.

While the intervention did not significantly alter Campus A levels of MEO, this attitudinal factor as measured at both time one and time two, did significantly relate to report mark, with a higher propensity towards using minimal effort related to lower report marks. Table 7 presents these relationships.

Discussion and conclusions
This study theorised the notion of assessment literacy as multi-dimensional, and has shown how the dimensions of assessment literacy differentially contribute to the educational gains derived from this pedagogical intervention. Specifically, after controlling for prior academic ability and motivational attitude, one dimension of assessment literacy stands out as the 'high-leverage' dimension: the ability to judge actual works against criteria and standards.

The importance of this finding is that it was the nature of the intervention (i.e. getting students to look at and judge actual examples of student work) that created the gains in this dimension of assessment literacy. This implies that interventions aimed at garnering enhanced learning from assessment should target the development of assessment literacy.

This in turn means creating an emphasis on a meta-dialogue about assessment, its purposes and how it functions. A further implication is that gains typically attributable to formative feedback could be enhanced not by a more detailed explication of the feedback by lecturers but rather by deploying assessment literacy (judgement)-enhancing protocols at the formative feedback points during the semester.

These findings support the view that helping students to develop their ability to judge their own and others' work will likely enhance their learning outcomes. Interventions which give students practice in judging work against standards develop the judgement dimension of assessment literacy, which in turn allows them to perform better themselves on similar tasks.

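(Compared with O'Donovan, Price and Rust 2004, this study focused on assessment literacy and was able to shorten the intervention.)
Beyond assessment literacy, the study offers a new way of thinking: the "return on teaching investment". This matters especially in higher education, where workloads are heavy, resources and administrative support are shrinking, and students are more diverse. Teaching practice can change accordingly, and teachers can ask what kind of change they may expect from students. Is it worth building this kind of teaching into regular teaching practice in future, and extending it to other courses? Students' abilities may grow along a curve rather than a straight line, and the gains may even persist into their later years of study.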

Student comments such as '... now I understand what it's all about ...' and 'I think I'm starting to get this ...' and 'Now I know what's expected of me' indicated that students had learned from the experience. Comments from teaching staff included 'What a fabulous activity ...' and '... really useful', which indicated that not only students but also teaching staff perceived benefits in the intervention.

References
PISA assessment cycles by year. Retrieved 9 October 2015 from the Taiwan PISA National Center (臺灣PISA國家研究中心).
PIRLS comprehensive assessment of reading literacy. Retrieved from the Taiwan PIRLS team website.
Smith, C. D., Worsfold, K., Davies, L., Fisher, R., & McPhail, R. (2013). Assessment literacy and student learning: The case for explicitly developing students 'assessment literacy'. Assessment & Evaluation in Higher Education, 38(1).

Thank you for listening!
