Performance Assessment: Literacy Assessment. Seminar report for Special Topics in Educational Assessment. Advisor: Professor 余民寧. Presenter: 謝智如. 15 October 2015 (104.10.15)
Outline. 1. Literacy assessment: related issues and applications. 2. Journal article report: Assessment literacy and student learning: the case for explicitly developing students' 'assessment literacy'
1. Literacy Assessment: Related Issues and Applications
What is literacy? Literacy is traditionally understood as the ability to read and write. The term's meaning has been expanded to include the ability to use language, numbers, images and other means to understand and use the dominant symbol systems of a culture. The concept of literacy is expanding in OECD countries to include skills to access knowledge through technology and the ability to assess complex contexts.
What is literacy? The United Nations Educational, Scientific and Cultural Organization (UNESCO) defines literacy as the "ability to identify, understand, interpret, create, communicate and compute, using printed and written materials associated with varying contexts. Literacy involves a continuum of learning in enabling individuals to achieve their goals, to develop their knowledge and potential, and to participate fully in their community and wider society."
The difference between literacy and competence. Definitions of literacy and competence vary widely, but literacy includes basic disciplinary knowledge together with the ability to apply it to solve problems in daily life and at work. Beyond ability alone, it also covers the habits and attitudes involved in using disciplinary knowledge to solve problems. Literacy therefore spans both ability and attitude, with an emphasis on integrated knowledge and on the habits and attitudes needed to solve real-world problems.
Many other new literacies: reading literacy, scientific literacy, mathematical literacy, information and media literacy, environmental literacy, health literacy, value literacy, multicultural literacy.
Literacy assessment. By the definition of literacy, literacy assessment falls within the scope of performance assessment. Performance assessment emphasises actual performance: the teacher judges (or scores) the validity of the student's performance process and/or the quality of the finished product, separately or in combination, to determine the student's level of achievement. Example: concept application, i.e. applying learned concepts and knowledge to solve practical problems encountered in daily life.
Literacy assessment programmes. IEA: Trends in International Mathematics and Science Study (TIMSS). OECD: Programme for International Student Assessment (PISA), which assesses literacy in reading, mathematics and science. IEA: Progress in International Reading Literacy Study (PIRLS).
PISA. PISA is conducted every three years. Its purpose is to provide cross-national comparisons of how well fifteen-year-old students have learned the knowledge and skills needed for life, and to analyse the effectiveness of each country's education system, thereby defining what citizens' literacy should include. Items are designed around applications in context and are not restricted to curriculum content; students must understand information and flexibly integrate, evaluate and reflect on it to construct their own answers to problem situations. The focus is on whether students can use what they have learned to meet real-world challenges, not merely on mastery of the school curriculum. The assessment covers literacy in three domains: reading, mathematics and science. This renewed concept of literacy is combined with the idea of lifelong learning and centres on the key knowledge and skills needed for adult life, covering formal and informal settings such as the formal curriculum, extracurricular clubs, the family environment and school climate.
PISA assessment cycle. 2000: reading as the major domain, science and mathematics as minor domains. 2003: mathematics major, reading and science minor. 2006: science major, reading and mathematics minor. 2009: reading major again, science and mathematics minor. 2012: mathematics major, reading and science minor, with an additional computer-based assessment of problem solving.
PISA 2006 Scientific Literacy Assessment. PISA 2006 reflects scientific literacy through four dimensions (context, competencies, knowledge, attitudes). It mainly tests students' ability to apply scientific knowledge, drawing on physics, chemistry, biology and earth science across the items to acquire new scientific knowledge, explain scientific phenomena, use evidence to interpret science-related issues, and address the relationship between science and technology.
PISA 2006 Scientific Literacy Assessment (example slides)
Assessing attitudes: attitudinal items. Most PISA 2006 science units included a new attitudinal item linking the unit to students' attitudes toward the scientific issue concerned. These items took two main forms: one probed students' interest in learning about the science involved, the other surveyed students' support for (agreement with) the science in question. Such items appeared in a grey box; students simply ticked the option matching their own view. There were no correct answers, and responses did not count toward students' total test scores. The items asked students to indicate their degree of agreement with statements about a specific issue, ticking for each statement the option that best represented their opinion.
Assessing attitudes (example slides)
Assessing attitudes. PISA 2006 collected attitudinal data through two channels: a separate attitude questionnaire, and items embedded in the test itself, so that attitude data were gathered and integrated while students answered the competency/knowledge questions. These contextualised items allowed PISA 2006 to capture students' attitudes toward specific science tasks. Unlike results obtained from general questions about attitudes, this approach let PISA 2006 examine (1) whether students' attitudes vary across contexts, and (2) whether students' attitude responses relate more closely to a particular question or to groups of questions.
PIRLS reading literacy assessment. Reading is at the heart of education: it lets us acquire knowledge through words and symbols, and almost every school subject is learned through reading. In theory, Grade 3 is a critical period in reading development: before it, children "learn to read", acquiring basic skills such as word recognition and a basic grasp of text genres and comprehension; after Grade 3 they "read to learn". PIRLS therefore assesses Grade 4 students, examining whether they have the basic reading skills and are moving on toward acquiring new knowledge through reading.
PIRLS reading literacy assessment. The Progress in International Reading Literacy Study (PIRLS) shows how well our students read. Compared with students in other countries, how strong is their reading ability, and have reading scores improved? Do our Grade 4 students value reading, and do they enjoy it? Do our students have homes that foster the development of reading and writing? How do our schools plan reading instruction? How do our teachers' instructional practices compare with those of other countries?
PIRLS's definition of reading literacy: 1. the ability to understand and use written language; 2. the ability to construct meaning from a variety of texts; 3. the ability to learn new things through reading; 4. participation in communities of readers at school and in everyday life; 5. reading for enjoyment.
What the reading literacy assessment contains. PIRLS instruments: passages of 1,200 to 1,600 words each that students read and answer questions about, plus five questionnaires, for students, parents, schools, teachers and the curriculum, to give a rounded picture of the environmental factors that influence reading.
What the reading literacy assessment contains: 1. It measures students' ability to understand informational and narrative texts; comprehension is assessed for both the content and the form of the text. An example of a form question: "How do the numbered boxes help the reader understand the article? Write down one way." 2. Students' reading behaviour and interest. 3. The reading environment parents provide, e.g. how often parents and children read together and the books available at home. 4. The reading instruction and library arrangements that schools and teachers provide.
PIRLS levels of reading comprehension (process, description, strategies):
Direct processes
- Focus on and retrieve explicitly stated information: the reader locates information clearly stated in the text. Strategies: 1. finding information relevant to the reading goal; 2. finding specific ideas; 3. searching for definitions of words or phrases; 4. identifying the setting of a story (e.g. time, place); 5. finding the topic sentence or main idea when it is explicitly stated.
- Make straightforward inferences: connecting two or more pieces of information in the text. Strategies: 1. inferring that one event caused another; 2. concluding the main point of a series of arguments; 3. identifying the referent of a pronoun; 4. inferring the main idea of the text; 5. describing the relationship between characters.
Interpretation processes
- Interpret and integrate ideas and information: the reader draws on prior knowledge to connect information not explicitly stated in the text. Strategies: 1. discerning the overall message or theme; 2. considering alternative actions a character could have taken; 3. comparing and contrasting information in the text; 4. inferring the mood or tone of a story; 5. interpreting how the information applies to the real world.
- Examine and evaluate the text: critiquing and reflecting on the information in the text. Strategies: 1. evaluating the likelihood that the events described could really happen; 2. describing how the author devised a surprise ending; 3. judging the completeness of the information in the text; 4. identifying the author's point of view.
PIRLS sample item: "An Unbelievable Night". Students read the passage and answer the questions.
PIRLS scoring example: "An Unbelievable Night". Scoring follows performance-level descriptions drawn up in advance.
Taiwanese students' performance in PIRLS 2006. In 2006, students from 46 countries/regions took part; against an international average of 500, Taiwanese students averaged 535, ranking 22nd. Students performed noticeably better on direct processes (541, 73% correct) than on interpretation processes (530, 49% correct). In terms of depth of reading, most of our children remain at the literal, surface level of the text.
Analysing the broader relationships in the results. The questionnaires also cover basic demographic data, students' reading attitudes, the home reading environment, teachers' reading instruction, school reading policies and the overall reading curriculum, so that the relationships between these reading conditions and environments and students' reading achievement can be examined. With such a detailed survey, participating countries can see what reading education looks like in each country and use this to improve or adjust their reading-education policies, teaching and curricula. When a country participates in PIRLS repeatedly and accumulates assessment data over many years, it can also track how students' reading ability changes as teaching or policy changes.
Taiwan: the Ministry of Education's programme office for enhancing citizens' core literacy. To understand the literacy of students educated under the 12-year basic education system, it addresses five literacies (language, mathematics, science, digital, and character/aesthetic literacy) as well as teachers' professional literacy. http://literacytw.naer.edu.tw/five.php A green-glazed leather-bag-shaped pot is used as the symbol of literacy: the capability you carry with you after the process of being shaped and fired.
2. Journal article report: Assessment literacy and student learning: the case for explicitly developing students' 'assessment literacy'
Introduction. Nicol (2009) raised a key issue for first-year students: "How to enable students to feel part of their programmes' academic culture while encouraging them to take responsibility for their own learning."
To become self-regulated learners, students need to be able to judge their work, identify its merits, locate its weaknesses and determine ways to improve it. Judgement here includes assessing whether one's own response to an assessment task is appropriate and does what was required.
It also requires them to judge how good their response is in relation to the relevant academic achievement standards (Sadler 2009). Students' understanding of the purposes of assessment and the processes surrounding assessment is part of the context within which they learn to make those judgements and become effectively self-regulating.
Francis (2008) argues, however, that first-year students in particular are likely to over-rate their understanding of the assessment process, and that there is a disjuncture between what they think they are being assessed on and what the marking criteria and achievement standards require of them.
By helping to clarify the meaning of learning goals and criteria, and through the provision of feedback, formative assessment encourages students to keep realigning their work to what is required. Nicol consequently applied a framework based on task structure, learner regulation and an associated set of assessment principles to inform the redesign of formative assessment in two first-year courses.
The redesign drew on Gibbs and Simpson's (2004) 11 assessment conditions and Nicol and Macfarlane-Dick's (2006) 7 principles of good feedback practice. For students to have a sense of control over their own learning, formative assessment practices must help them develop the skills needed to monitor, judge and manage their learning (Nicol 2009, 338).
What is "assessment literacy"? The literature suggests that students' capacity to become successful self-regulated learners can be affected by various aspects of the assessment process.
1. Students need to understand the purpose of assessment and how it connects with their learning trajectory.
2. They need to be aware of the processes of assessment and how these might affect their capacity to submit responses that are on task, on time and completed with appropriate academic integrity.
3. Opportunities need to be provided for them to practise judging their own responses to assessment tasks, so that they can learn to identify what is good about their work and what could be improved.
The authors therefore conceptualised students' capacity to develop these aspects of assessment as "assessment literacy", defined as students' understanding of the rules surrounding assessment in their course context, their use of assessment tasks to monitor or further their learning, and their ability to work with the guidelines on standards in their context to produce work of a predictable standard.
Previous related research. A few earlier questionnaires have measured students' experience of assessment, but none has focused on the concept of assessment literacy; the study's literature review provides some information on this gap. The work of O'Donovan, Price and Rust (2004) and Rust, Price and O'Donovan (2003) on improving students' understanding of assessment comes closest to the concept of assessment literacy.
O'Donovan, Price and Rust (2004) found that relying only on explicit statements of the assessment criteria does not, by itself, help students understand what assessors are looking for or what the assessment expects of them. The authors have been developing a growing body of work on criterion-based assessment methods, including the development and use of assessment rubrics (grids), grade descriptors and benchmark statements, to build students' assessment capability; through this information students can grasp more precisely the meaning of the task and the expectations the teacher is expressing.
Knowledge has both explicit and tacit dimensions, and learners need to construct that knowledge from experience for themselves for it to have meaning for them; this applies to knowledge of assessment criteria and standards as much as to disciplinary knowledge.
Their approach was to develop, through structured activities, students' knowledge of how assessment responses would be marked and, in turn, their understanding of how their own responses would be judged.
The intervention was based on the notion that once students started making judgements about the quality of the work in front of them, they could apply that evaluative way of thinking to their own work, helping them to self-monitor it during its production and to identify ways to improve its quality.
A 90-minute marking workshop. Students were given two exemplar pieces of work that they had to mark and provide feedback on. The assignments were similar in nature and format to the next piece of assessment that the participating students were about to begin for their own coursework, but covered different topics with different instructions.
During the workshop, students discussed their marking and rationales in small groups before reporting to the whole class the marks they awarded and their justifications.
At that point, the lecturer led a discussion of the students' rationales and related them to the application of the marking criteria.
The small student groups then had a chance to reconsider their marks and rationales, and finally the lecturer showed the whole class his or her annotated exemplars with the feedback, mark and rationale.
Across three years of repeated studies, students who participated in the workshop showed significant improvement in subsequent assessment pieces compared with students who did not participate.
Their research shows that relying only on the explicit expression of assessment criteria, standards and processes as a method of transferring knowledge about assessment does not work.
The provision of explicit criteria and summarised standards descriptors needs to be complemented by opportunities for students and staff to share the experience of judging the quality of responses, in order to build tacit knowledge into students' repertoire, improve their assessment literacy and, hence, their assessment outcomes.
Purpose of the present study. The authors aimed to test this assertion by quantifying the impact of developing students' assessment literacy on their assessment literacy levels and on their learning outcomes.
They set a stringent, high-risk testing scenario in which the assessment-literacy-developing intervention was much briefer (50 minutes) than those used in the O'Donovan, Price and Rust (2004; Rust, Price and O'Donovan 2003) studies.
They first developed a questionnaire to operationalise some key concepts in the assessment literacy arena, then implemented these measures in a pre- and post-test framework such that, between the two measurement episodes, the students in the experimental cohort were exposed to an assessment-literacy-building intervention.
A control group in the same programme of study, but at a different campus location, received only the pre-test instrument and no intervention. The paper reports the results of this intervention.
Method: Participants. The sample comprised 369 volunteer business students from two campuses (Campus A and Campus B) in Queensland, Australia (56% female, 44% male, mean age 19.1 years). After excluding those who did not record an identification number or did not complete both administrations of the survey, 349 valid cases remained for analysis.
Materials: Assessment literacy. Measured with the Assessment Literacy Survey developed by the study's authors: 30 items on a five-point Likert scale ranging from 1 = 'strongly disagree' to 5 = 'strongly agree', covering the four subscales listed below (a scoring sketch follows the list):
1. Students' understanding of the local assessment protocols and performance standards (6 items), e.g. 'I understand the criteria against which my work will be assessed'.
2. Students' use of assessment tasks for enhancing or monitoring their learning, including assessment for learning (6 items), e.g. 'I use assessment to figure out what is important to learn', and assessment for grading (4 items), e.g. 'I think the University makes me do assessment to produce work that can be judged for the University's marking and grading purposes'.
3. Students' orientation toward putting into the production of assessable work only the minimum effort needed merely to pass the course requirements (6 items), e.g. 'My aim is to pass the course with as little work as possible'.
4. Students' ability to judge their own and others' responses to assessment tasks (8 items), e.g. 'I feel confident that I could judge my peer's work accurately using my knowledge of the criteria and achievement standards provided'.
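As a concrete illustration of how responses to a Likert instrument of this kind might be scored, here is a minimal Python sketch. The item names (au1..au6, al1..al6, meo1..meo6, aj1..aj8) are hypothetical, not taken from the paper, and each subscale score is simply the mean of its items.

```python
# Minimal sketch (hypothetical item names): score the four Assessment Literacy
# Survey subscales as mean item responses on the 1-5 Likert scale.
import pandas as pd

subscales = {
    "AU":  [f"au{i}" for i in range(1, 7)],   # understanding of protocols/standards
    "AL":  [f"al{i}" for i in range(1, 7)],   # assessment for learning
    "MEO": [f"meo{i}" for i in range(1, 7)],  # minimum effort orientation
    "AJ":  [f"aj{i}" for i in range(1, 9)],   # judgement of own/others' work
}

def score_survey(responses: pd.DataFrame) -> pd.DataFrame:
    """Return one mean score per subscale for each respondent."""
    return pd.DataFrame(
        {name: responses[items].mean(axis=1) for name, items in subscales.items()}
    )
```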
Assessment performance. The course assessment consisted of a quiz, a report and a final exam. A multiple-choice quiz served as a measure of pre-intervention academic ability. The outcome of main interest was students' achievement (mark/grade) on the assessment task related to the intervention, a written report, marked against a rubric with a performance descriptor for each criterion at each performance standard. The final exam tested key knowledge from lectures, tutorials and the textbook, and comprised multiple-choice items, written items and short-answer case-scenario questions.
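The paper does not publish the report rubric itself, so the following is a purely hypothetical sketch of how a criterion-by-standard rubric with performance descriptors could be represented and turned into a mark; all criteria, descriptors and point values here are invented for illustration.

```python
# Hypothetical rubric sketch: criterion -> performance standard -> descriptor.
from typing import Dict

Rubric = Dict[str, Dict[str, str]]

report_rubric: Rubric = {
    "Argument": {
        "excellent": "Clear, well-supported thesis",
        "satisfactory": "Thesis present but thinly supported",
    },
    "Use of evidence": {
        "excellent": "Sources integrated and evaluated",
        "satisfactory": "Sources listed with little evaluation",
    },
}

# Assumed scoring scheme: map each standard to points and sum over criteria.
points = {"excellent": 2, "satisfactory": 1}

def mark_report(judgements: Dict[str, str]) -> int:
    """judgements maps each criterion to the standard the marker selected."""
    return sum(points[standard] for standard in judgements.values())
```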
Procedure. The study was quasi-experimental: the intervention group (Campus A) completed the pre-test and post-test and received the assessment-literacy-building intervention, while the control group (Campus B) completed only the pre-test.
1. Assessment rubric phase. Participants were told how to use the assessment rubric when completing the task; the assessment criteria and their content were explained; markers' judgements would be based on matching responses to the assessment criteria; and judgements would be moderated against the academic standards briefly described in the instructions.
2. Pre-test assessment literacy survey phase. The pre-test allowed comparison between the two campuses to detect any initial group differences in assessment literacy and to establish baseline levels; an initial factor analysis was also conducted on the pre-test data.
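A baseline comparison of this kind is typically an independent-samples t-test with an effect size. The sketch below is not the authors' code; it assumes two arrays of pre-test subscale scores, one per campus, and shows one common way to compute the test and a pooled-SD Cohen's d.

```python
# Sketch: baseline check of group differences between campuses on a subscale.
import numpy as np
from scipy import stats

def baseline_comparison(campus_a: np.ndarray, campus_b: np.ndarray):
    """Independent-samples t-test plus pooled-SD Cohen's d."""
    t, p = stats.ttest_ind(campus_a, campus_b)
    n_a, n_b = len(campus_a), len(campus_b)
    pooled_sd = np.sqrt(((n_a - 1) * campus_a.var(ddof=1) +
                         (n_b - 1) * campus_b.var(ddof=1)) / (n_a + n_b - 2))
    d = (campus_a.mean() - campus_b.mean()) / pooled_sd
    return t, p, d
```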
3. Intervention phase (Campus A only): a session of about 45 minutes. (1) A 'think, pair and share' exercise: participants considered, judged and practised marking two exemplars (real examples of student work) to decide their quality (excellent, good, satisfactory or poor). The aim was to help participants learn to judge a piece of work, identify the criteria they used to judge it, and use those criteria to recognise different standards of achievement.
(i) Participants made practice judgements on the exemplars ('think'), then explained and justified their judgements to the person next to them ('pair and share'). (ii) Randomly selected pairs shared their decisions with the whole class. (iii) Out of this conversation emerged a list of criteria expressed by the participants in their own language.
(2) Where judgements of the exemplars differed, the marking of the exemplars was explained and participants voted by show of hands on where their judgements fell against the standards; finally, the course convenor explained the mark he had chosen and his reasons. (3) Participants were then directed to the assessment rubric and shown how their judgements corresponded to the academic standards.
4. Post-test assessment literacy survey phase. The intervention group completed the assessment literacy survey again as a post-test, to gauge changes in their assessment literacy levels.
5. Assessment outcome phase. Three weeks later, participants completed a literature analysis and a 1,500-word report (part of the course assessment), using the same rubric as before. Subject tutors at both campuses marked the reports against the criteria, and participants' report marks were used as the outcome measure in the subsequent statistical analyses.
Results: Preliminary analyses. Factor analysis was undertaken to confirm the underlying structure of the survey items. The four factors represented the following constructs: Assessment Literacy (Understanding) (AU); Assessment for Learning (AL); Minimum Effort Orientation (MEO); and Assessment Literacy (Judgement) (AJ). The factor loading matrix for this final solution is presented in Table 1 (pre-test and post-test responses). As the table shows, the four-factor model was found to be reliable across both the pre- and post-tests.
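For readers who want to run this kind of preliminary analysis on their own survey data, the sketch below shows one way to obtain a varimax-rotated four-factor solution and a per-subscale Cronbach's alpha. It is an illustrative reconstruction, not the authors' analysis script, and the exact estimation method used in the paper may differ.

```python
# Sketch: four-factor solution on the 30 items plus subscale reliability.
import pandas as pd
from sklearn.decomposition import FactorAnalysis

def four_factor_loadings(items: pd.DataFrame) -> pd.DataFrame:
    """Varimax-rotated loadings for a 4-factor solution (respondents x 30 items)."""
    fa = FactorAnalysis(n_components=4, rotation="varimax", random_state=0)
    fa.fit(items.values)
    return pd.DataFrame(fa.components_.T, index=items.columns,
                        columns=["F1", "F2", "F3", "F4"])

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for the items of one subscale."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)
```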
As anticipated, the MEO scale scores were negatively correlated with AL (r = -.26, p < .001), AU (r = -.30, p < .001) and AJ (r = -.27, p < .001). The assessment for learning scale (AL) was positively correlated with AU (r = .42, p < .001) and AJ (r = .33, p < .001). The other two assessment literacy scales, understanding (AU) and judgement (AJ), were also positively correlated (r = .50, p < .001). This is evidence of appropriate and predicted patterns of convergent and discriminant validity (Campbell and Fiske 1959).
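A convergent/discriminant validity check of this kind reduces to a Pearson correlation matrix over the four subscale scores; a minimal sketch (assuming a data frame with columns AU, AL, MEO and AJ, one row per respondent) follows. Per-pair p-values could be obtained with scipy.stats.pearsonr.

```python
# Sketch: Pearson correlations among the four subscale scores
# (MEO is expected to correlate negatively with the other three).
import pandas as pd

def scale_correlations(scores: pd.DataFrame) -> pd.DataFrame:
    return scores[["AU", "AL", "MEO", "AJ"]].corr(method="pearson")
```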
Assessment of group differences (pre-test). Campus A participants scored significantly lower on assessment for learning (AL T1 M = 3.71) than Campus B participants (M = 3.92), t(334) = 3.02, p = .003, d = .33. These results point towards a general trend of slightly poorer baseline motivation and assessment literacy for Campus A (intervention) students, especially in their use of assessment for learning. These baseline differences must be considered in any later comparison of average report marks between the campus groups, in addition to determining the intervention's impact on assessment literacy levels and any associated improvement in report mark for Campus A. All means, standard deviations and effect sizes are shown in Table 2.
Intervention impact upon assessment literacy. Paired-sample t-tests were conducted to investigate the impact of the intervention on participants' effort, use of assessment and assessment literacy levels. The intervention was effective in producing positive and significant change across the three assessment literacy factors: AU t(153) = 10.21, p = .00, d = .73; AJ t(157) = 6.51, p = .00, d = .52; and AL t(155) = 6.03, p = .00, d = .39. These effect sizes reveal that the intervention resulted in medium to large changes in the three assessment literacy levels (see Table 3). Considering the brevity of the intervention, the magnitude of this impact may be considered a hefty return on a small investment. No significant change occurred in the attitudinal measure of MEO, t(157) = 1.67, p = .10, d = .08. All means, standard deviations and effect sizes are shown in Table 3.
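The pre/post comparison described above amounts to a paired-samples t-test with an effect size for the change. The following sketch assumes matched arrays of pre- and post-test scores for the intervention group, and computes Cohen's d on the difference scores, which may differ slightly from the exact formula used in the paper.

```python
# Sketch: paired pre/post comparison for one assessment literacy subscale.
import numpy as np
from scipy import stats

def paired_change(pre: np.ndarray, post: np.ndarray):
    """Paired-samples t-test plus Cohen's d for the pre-to-post change."""
    t, p = stats.ttest_rel(post, pre)
    diff = post - pre
    d = diff.mean() / diff.std(ddof=1)  # d computed on the difference scores
    return t, p, d
```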
Impact of assessment literacy on assessment results Correlational analysis and a regression model were developed to examine Campus A (intervention) students ’ report mark as a function of changes in their motivation or assessment literacy levels due to the intervention.
No significant relationship was found between changes in MEO and report mark. This is most likely due to the intervention not leading to any significant change in MEO levels (refer to Table 3). Table 4 describes outcomes from the correlational analysis.
As the change in MEO for Campus A was not significantly correlated with report mark, this variable was not entered into the multiple regression model, which comprised the change scores for AL (ALCh), AU (AUCh) and AJ (AJCh) along with the report mark.
Thus, of the three assessment literacy factors, improving students' ability to judge the standards of their own and others' work appears to be the most critical to enhanced learning outcomes. Unstandardised (B) and standardised (β) regression coefficients, and squared part correlations for each predictor in the model, are shown in Table 5.
Unique contribution of improved assessment literacy. To determine whether the improved assessment literacy levels predicted report marks over and above participants' post-test motivation (MEO T2) and pre-existing academic ability (the pre-intervention quiz), a hierarchical multiple regression analysis was conducted for the intervention group.
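A hierarchical regression of this kind enters the control variables first and the assessment literacy change scores second, reading the increase in R-squared as their unique contribution. The sketch below uses assumed column names (report_mark, meo_t2, quiz, au_change, aj_change, al_change) that are not taken from the paper.

```python
# Sketch: two-block hierarchical regression predicting report mark.
import pandas as pd
import statsmodels.api as sm

def hierarchical_regression(df: pd.DataFrame):
    y = df["report_mark"]
    # Block 1: controls (post-test MEO and pre-intervention quiz score)
    block1 = sm.add_constant(df[["meo_t2", "quiz"]])
    m1 = sm.OLS(y, block1).fit()
    # Block 2: add the three assessment literacy change scores
    block2 = sm.add_constant(
        df[["meo_t2", "quiz", "au_change", "aj_change", "al_change"]]
    )
    m2 = sm.OLS(y, block2).fit()
    r2_change = m2.rsquared - m1.rsquared  # unique contribution of block 2
    return m1.rsquared, m2.rsquared, r2_change, m2.params
```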
This finding is practically significant given the brevity of the intervention; it seems that from an intervention of just 50 minutes duration, designed to develop students ’ judgement abilities, their marks on a related task can be significantly increased. A summary of the hierarchical regression for predicting report mark is shown in Table 6.
Between-group differences in report mark as a function of assessment literacy for the pre- and post-test data. For both groups' pre-test data, the three assessment literacy factors (AU T1, AJ T1, AL T1) did not significantly correlate with report mark. As hypothesised, the improved assessment literacy factors AU T2, AJ T2 and AL T2, as measured at post-test, were more strongly (and significantly) related to report mark, such that higher assessment literacy levels were related to higher report marks.
While the intervention did not significantly alter Campus A's levels of MEO, this attitudinal factor, as measured at both time one and time two, did significantly relate to report mark, with a higher propensity towards using minimal effort related to lower report marks. Table 7 presents these relationships.
Discussion and conclusions. This study theorised the notion of assessment literacy as multi-dimensional, and has shown how the dimensions of assessment literacy differentially contribute to the educational gains derived from this pedagogical intervention. Specifically, after controlling for prior academic ability and motivational attitude, one dimension of assessment literacy stands out as the 'high-leverage' dimension: the ability to judge actual works against criteria and standards.
The importance of this finding is that it was the nature of the intervention (i.e. getting students to look at and judge actual examples of student work) that created the gains in this dimension of assessment literacy. This implies that interventions aimed at garnering enhanced learning from assessment should target the development of assessment literacy.
This in turn means creating an emphasis on a meta-dialogue about assessment, its purposes and how it functions. A further implication is that the gains typically attributable to formative feedback could be enhanced not by a more detailed explication of the feedback by lecturers, but rather by deploying assessment literacy (judgement)-enhancing protocols at the formative feedback points during the semester.
These findings support the view that helping students to develop their ability to judge their own and others' work will likely enhance their learning outcomes. Interventions that give students practice in judging work against standards develop the judgement dimension of assessment literacy, which in turn allows them to perform better themselves on similar tasks.
Compared with O'Donovan, Price and Rust (2004), focusing directly on assessment literacy allowed the intervention to be much shorter. Beyond assessment literacy itself, the study raises a further idea: the return on pedagogical investment. In a higher education environment of heavy workloads, shrinking resources and administrative support, and increasingly diverse students, it matters what change in students teachers can expect from a given change in teaching practice. Is it worth building this kind of activity into regular teaching practice and extending it to other courses, especially since students' gains may grow along a curve rather than a straight line, and may even persist into the later stages of their studies?
Student comments such as '... now I understand what it's all about ...', 'I think I'm starting to get this ...' and 'Now I know what's expected of me' indicated that students had learned from the experience. Comments from teaching staff such as 'What a fabulous activity ...' and '... really useful' indicated that not only students but also teaching staff perceived benefits in the intervention.
References
PISA assessment cycles. Retrieved 9 October 2015 from the Taiwan PISA National Research Center, http://pisa.nutn.edu.tw/pisa_tw_03.htm
PIRLS reading literacy assessment. Retrieved from the Taiwan PIRLS team website, http://lrn.ncu.edu.tw/Teacher%20web/hwawei/PIRLS_home.htm
Smith, C. D., Worsfold, K., Davies, L., Fisher, R., & McPhail, R. (2013). Assessment literacy and student learning: The case for explicitly developing students' 'assessment literacy'. Assessment & Evaluation in Higher Education, 38(1), 44-60.
Thank you for listening!