中国科学技术大学计算机系 陈香兰 xlanchen@ustc.edu.cn Fall2013 第十讲 文件管理 中国科学技术大学计算机系 陈香兰 xlanchen@ustc.edu.cn Fall2013
内容提要 文件定义、属性、操作和类型 文件的逻辑结构及其访问模式 文件系统的组织 外存分配算法和空闲空间的管理 目录结构 文件保护 文件系统的层次结构 外存分配算法和空闲空间的管理
内容提要 文件定义、属性、操作和类型 文件的逻辑结构及其访问模式 文件系统的组织 外存分配算法和空闲空间的管理 目录结构 文件保护 文件系统的层次结构 外存分配算法和空闲空间的管理
File-system Files + 逻辑视图 Directory 逻辑结构和组织 文件结构 目录结构 文件系统结构 文件的盘块组织 盘块分配 空闲盘块组织 物理结构和组织
文件的定义 文件系统在物理存储系统中建立了一个统一的逻辑上的视图 文件是最小的逻辑存储单元 A named collection of related information that is recorded on secondary storage. A sequence of bits, bytes, lines, or records whose meaning is defined by creator and user. usually mapped to storage on a nonvolatile physical device by OS 文件内形成一个:Contiguous logical address space
A file may have a certain defined structure according to its type: What is file? (cont.) A file can be Data file (numeric, character, binary); program file; … A file may have a certain defined structure according to its type: .txt; .c, .s, .S, .h, …; .o; .out, .exe, .elf, .com, … How to describe a file Attribute 文件属性 Operation 文件操作 Type 文件类型 Structure文件结构
File Attributes文件属性 Name Type Location Size The only information kept in human-readable form Type needed for systems that support different types Location pointer to file location on device Size current file size
Time(s), date, and user identification Protection Access controls who can read, write, execute the file. Time(s), date, and user identification data for protection, security, and usage monitoring. These information are kept in the directory structure, which is maintained on the disk.
文件操作 Basic file operations – minimal set Create: file space + a directory entry Write: write pointer Read: read pointer Seek: reposition within file Delete: Truncate Requires read and write pointers, or a current-file-position pointer.
Combine these basic operations to perform other operations Append: Seek to the end, and write Copy: Create a new file, read old file and write into the new file other Rename Get/set file attributes Etc.
Opening a file Open(filename1) finds the file and stores a pointer to it in an open file table (OFT). 从目录中搜索filename1对应的目录项; 将目录项中的内容拷贝到内存中; 在OFT中保存指向目录项的指针; 返回在OFT中的序号 Open-file table ID = index to the table I/O operations use the pointer rather than the file name so there is no need to search for the file each time.
Multi-user Variation 多个用户会同时使用同一个文件 Use two levels of internal open-file tables: Per-process table contains some process-dependent file information E.g. Access rights, file pointers System-wide table contains some process-independent file information E.g. location on disk, size, open count, lock(s)
Two level open-file tables system OFT process P1 OFT process P2
Close a file close (filename2) Removes the content of the file from memory and removes its entry in the open-file table
Typical file types
内容提要 文件定义、属性、操作和类型 文件的逻辑结构及其访问模式 文件系统的组织 外存分配算法和空闲空间的管理 目录结构 文件保护 文件系统的层次结构 外存分配算法和空闲空间的管理
文件的逻辑结构 文件的物理结构 文件的逻辑结构 文件的逻辑结构: 有结构文件 无结构,流式文件 有结构,记录式文件。 在逻辑上,文件由若干个记录组成 有结构文件 记录:定长 vs. 不定长 记录的组织:顺序(按某种规则排序)、索引、索引顺序
文件的访问模式 顺序访问:一个记录接着一个记录的访问 直接访问(随机访问) 索引访问 适用于磁带 文件读写指针渐变 根据要访问的记录的序号,计算出要访问的偏移 设置文件读写指针 索引访问 根据索引,查询要访问的偏移
内容提要 文件定义、属性、操作和类型 文件的逻辑结构及其访问模式 文件系统的组织 外存分配算法和空闲空间的管理 目录结构 文件保护 文件系统的层次结构 外存分配算法和空闲空间的管理
文件系统的组织 分区:Partition (mini-disks, volumes),可以是 Files + directories 一个磁盘 磁盘的一个部分 provide separate logical spaces on one disk 多个磁盘 group several disks into a single logical space Files + directories Directory 目录 holds file information (e.g. name, location, size, type) for all files in that partition
目录结构 A collection of nodes containing information about all files. A symbol table Both the directory structure and the files reside on disk. Backups of these two structures are kept on tapes. F 1 F 2 F 3 F 4 F n Directory Files
Information in a Device Directory Name Type Address Current length Maximum length Date last accessed (for archival) Date last updated (for dump) Owner ID (who pays) Protection information (discuss later) DOS FCB (file control block) 32 bytes each May cost many I/O operations to search for an entry UNIX Inode索引结点 Store most of file attributes in inode Directory entry contains file name and a pointer to the inode. 16 bytes
Operations Performed on Directory Search for a file Create a file Delete a file List a directory Rename a file Traverse the file system Search in the table for an entry Insert an entry Delete an entry Modify an entry …
Organize the Directory (Logically) to Obtain Efficiency – locating a file quickly. Naming – convenient to users. Two users can have same name for different files. The same file can have several different names. Grouping – human convention logical grouping of files by properties, (e.g., all Pascal programs, all games, …)
Directory Structures Single-level directory(1级目录) Two-level directory(2级目录) Tree-structured directory(树型) Acyclic-graph directory(无环图)
Single-Level Directory 1级目录 Easy to support and understand. A single directory for all Very low searching speed, O(N)
Problems start when there are large numbers of files and/or users Naming problem Small naming space Name collision
Two-Level Directory 2级目录 Separate directory for each user UFD, User File Directory Each entry owns information for a user’s file MFD, Master file directory Each entry contains: User name, Pointer to his UFD
Can have the same file name for different user Efficient searching Easy management:Add/delete a user Security VS. Sharing MFD, system administrator UFD, isolated from other users How to share? Directory tree (inverted tree) & path name Search path A UFD may be very large, then …
树型目录结构
树型目录结构(Cont.) Root directory & directory & subdirectory Regular file VS. subdirectory Treat a subdirectory like another file use a special bit in the directory entry to distinguish a file (0) from a subdirectory (1) Absolute vs. relative path names? e.g. /spell/words/rade ../spell/words/rade Current directory (working/searching directory)
Acyclic-Graph Directories 引入文件和目录的共享 Two different names (aliasing)
Acyclic-Graph Directories (Cont.) Implementation Symbolic links 符号链接 A special directory entry (link) The content of such file is the path name of the real file/directory How to traverse a directory contains symbolic links Duplicates directory entries Hard to maintain consistency
The problems existed 遍历问题 删除问题 How to ensure there are no cycles Different names, actual only one file 删除问题 Dangling pointer Another implementation: hard link File-reference list reference count How to ensure there are no cycles
Protection Reliability可靠性 Protection 保护 Guarding against physical damage File systems can be damaged by Hardware problems, power surges or failures, head crashed, dirt, temperature extremes, or Vandalism Generally provided by duplicate copies of files (disk tape, …) Protection 保护 Guarding against improper access
Protection in multi-user system Types of access Read Write Execute Append Delete List File owner/creator should be able to control: what can be done, by whom
Access Lists and Groups make access dependent on the ID of the user Access list Associate with each file and directory an access list. Access list specifies for each listed user name and the types of access allowed. Stored in each directory entry Length problem, Solution RWX a) owner access 7 1 1 1 RWX b) groups access 6 1 1 0 c) public access 1 0 0 1
Access lists and groups Ask manager to create a group (unique name), say G, and add some users to the group. For a particular file (say game) or subdirectory, define an appropriate access. Attach a group to a file chgrp G game 感受: 在Linux中,使用 ls -la命令可以看到 owner group public chmod 761 game
File-System Structure 逻辑存储单元 Collection of related information File system resides on secondary storage (disks) 文件系统的组织 How the file system should look to the user How to map the logical file system onto the physical secondary-storage devices File system organized into layers
Layered file system
内容提要 文件定义、属性、操作和类型 文件的逻辑结构及其访问模式 文件系统的组织 磁盘调度算法、外存分配算法和空闲空间的管理 目录结构 文件保护 文件系统的层次结构 磁盘调度算法、外存分配算法和空闲空间的管理
磁盘调度算法 先来先服务 最短寻道时间优先 扫描算法SCAN 循环扫描算法CSCAN C-LOOK 假设请求队列为: (0-199). 98, 183, 37, 122, 14, 124, 65, 67 起初磁头位置在53
先来先服务
最短寻道时间优先
扫描算法SCAN
循环扫描算法CSCAN
C-LOOK
外存分配算法 连续分配 链接分配 索引分配 Reading: 隐式 VS. 显式 单级索引分配、多级索引分配、混合分配方式 计算机操作系统,汤子瀛,二版,9.2节 或三版的6.3节 Operating System Concepts, 7th Edition, section 11.4, P421-
连续分配(contiguous allocation) 为每个文件分配一组连续相邻的盘块 简单。目录只需存储 <starting block # , length (number of blocks) > 逻辑地址物理地址的转换简单 (Logical Address)/512 = Q ……R Block # = Q + starting block; 缺点:存在外部碎片;文件增长受限
链接分配(linked allocation) 离散的分配方式: 将文件装入多个离散的盘块中,盘块与盘块之间使用链接指针链接起来,形成链接文件 两种实现 隐式链接 显示链接
隐式链接 文件目录中包含指向第一个盘块和最后一个盘块的指针 每个盘块中都包含一个指向下一个盘块的指针 缺点: 解决方案1: 访问速度差 只适合顺序访问 可靠性差 解决方案1: 引入簇 缺点:内部碎片增大
显式链接 解决方案2:显式链接 例如FAT系列文件系统 (File Allocation Table) 缺点: 把链接文件的链接指针显式的存放 在一张链接表中。 一个磁盘一个链接表 例如FAT系列文件系统 (File Allocation Table) 缺点: 仍然是顺序的 需要将整个FAT表装入内存
索引分配(indexed allocation) 一、单级索引方式(linked scheme) 为每个文件分配一个索引表,记录分配给这个文件的盘块号 存放在专门的索引块中 目录中存放索引块的指针 可以支持直接访问 缺点: 索引块将占用磁盘空间 索引块利用率低 index table
二、多级索引方式(multi-level index) 考虑1个大文件 1个索引块容纳不下所有的索引记录 引入多级索引 缺点: 索引层次多 索引块利用率低 目录 outer-index index table file
三、混合分配方式(combined scheme) 文件有大有小,需要一种可扩展、适应性较强的组织方式 引入:混合分配方式 直接地址 一次间接地址 多次间接地址 10 entries 例如:UNIX中的索引结点
空闲存储空间的管理 Free space management 算法 To keep track of free disk space Free-space list 算法 空闲表法 用于连续分配方式 空闲链表法 空闲盘块链 空闲盘区链 位图法 使用1个bit表示一个盘块 成组链接法(Reading 汤子瀛,三版 6.5.3)
回顾 文件定义、属性、操作和类型 文件的逻辑结构及其访问模式 文件系统的组织 外存分配算法和空闲空间的管理 目录结构 文件保护 文件系统的层次结构 外存分配算法和空闲空间的管理
作业