专利摘要:
PURPOSE: A large-capacity file storage system and a method of adding and deleting data blocks of dynamic multi-level inode are provided to effectively manage large-capacity files. CONSTITUTION: A large-capacity file storage system includes a pointer having a level different from the level of an inode information area(101A) storing information about root inode. The pointer includes a double indirect inner pointer in which a data block exists through double indirect pointer nodes, a single indirect inner pointer in which a data block exists through a single indirect pointer node, and a direct point that directly points a data block. The pointer of the root inode is allocated according to the direct pointer, single indirect pointer node and double indirect pointer node by increasing and decreasing the level in accordance with the size of the data block.
公开号:KR20020092550A
申请号:KR1020010031213
申请日:2001-06-04
公开日:2002-12-12
发明作者:김경배;김창수;신범주
申请人:한국전자통신연구원;
IPC主号:
专利说明:

Large file storage system and method for adding and deleting data blocks of dynamic multi-level inode using it {Method for deleting and adding of dynamic multi-level inode for Huge File Storage System}
[8] The present invention relates to a large file storage system and a method for adding and deleting data blocks of a dynamic multi-level inode using the same. More particularly, the present invention relates to a pointer to an extension in which an inode collects a data block or a data block. A large file storage system for creating and reducing the number of null pointer nodes and minimizing the number of blocks, and a method for adding and deleting data blocks of a dynamic multi-level inode using the same, and a program for realizing the method. The present invention relates to a computer-readable recording medium.
[9] Research on file storage systems has been ongoing since the computer was first developed and many products have been released. Typical Unix filesystems (UFS), Linux filesystems (Extfs), system filesystems (SFS), Veritas filesystems (VXFS), remote file shares (RFS), network filesystems (NFS) and global file systems at the University of Minnesota (GFS: global file system).
[10] In particular, the inode structure of Unix and Linux systems is the structure used in most computer systems, that is, the fixed inode method of the Unix file system (UFS) and the Linux file system (EXT2FS). The first 10 entries of the inode store the addresses of the data blocks for direct access at the inode, and the 11th entry is a pointer for the indirect access. The 12th entry stores a pointer for dual interference access, and the 13th entry stores a pointer for triple interference access and uses the data block to access the data block.
[11] Here, the inode structure of the global file system is called a dinode and is proposed for storing a large file. The feature of this dinosaur is that all data blocks are of the same level flat file structure. The first time you create a file, all pointers point to blocks of data, but if the data block grows and you can no longer use the direct pointer, you can increment all pointers to interfering pointers.
[12] However, due to its structural characteristics, existing file systems such as Unix and Linux cannot handle large files in a conventional manner. Assuming a Unix system writes 1 KB of data in a block, one block can have 256 block numbers. The capacity of the file that can be saved is as follows.
[13] Direct block access: 1K × 10 = 10 KB
[14] Indirect block access: with 256 direct blocks = 256 KB
[15] Double indirect block access: 256 × 256 = 64 MB
[16] Triple indirect block access: 256 × 256 × 256 = 16 GB
[17] That is, the maximum size of a file that can be created and managed by an existing file system does not exceed about 16.7 GB. Therefore, the GFS of the University of Minnesota, which is proposed to store large files, has the following problems because all inode levels are the same horizontal structure.
[18] First, all the indirect nodes have to be created at the time the level diverges, causing significant performance degradation. In other words, when the level is increased, the amount of intermediate nodes generated is rapidly increased, and a lot of time is required because the structure must be changed to add an indirect node to all existing nodes.
[19] Secondly, an empty intermediate node occurs. In other words, GFS generates an unused node each time the level is increased to create a flat structure. Thus, as the level increases, a large number of empty intermediate nodes are created on disk because most intermediate nodes are not used and are created just to create a flat structure.
[20] Third, the null pointer node increases the time to access the data block. In other words, when the flat file structure is used, the number of intermediate pointers that are not actually used increases rapidly when the level is increased, thereby increasing the number of intermediate nodes that are not used. In particular, when an operation occurs on the last data block of a file, the data block can be accessed only through the same number of intermediate nodes. In the case of multimedia data, new data is frequently added to an existing file. In this case, the performance of the system is degraded.
[21] Accordingly, the present invention is proposed to solve the above problems, and can handle large files that are not limited by the size of the file, thereby increasing the utilization efficiency of the disk storage device to efficiently manage large files It is an object of the present invention to provide a file storage system and a method of adding and deleting data blocks of a dynamic multi-stage inode using the same, and a computer-readable recording medium having recorded thereon a program for realizing the method.
[1] 1 is an embodiment diagram of a dynamic multi-stage inode structure in a large file system in accordance with the present invention.
[2] FIG. 2 is an embodiment of a structure in which the root inode shown in FIG. 1 has only a direct pointer; FIG.
[3] FIG. 3 is an embodiment of a structure in which the root inode shown in FIG. 1 has an indirect node at an increased level in a direct pointer.
[4] 4 is a diagram illustrating an embodiment of a structure in which all direct pointers are used except for the direct pointer in which the root inode illustrated in FIG. 1 has an indirect node;
[5] FIG. 5 is an embodiment of the root inode shown in FIG. 1 adding an indirect node by increasing the level. FIG.
[6] 6A and 6B are a flowchart illustrating a method of adding a data block of a dynamic multi-stage inode using a large file storage system according to the present invention.
[7] 7 is a flowchart illustrating a method of deleting a data block of a dynamic multi-stage inode using a large file storage system according to the present invention.
[22] The present invention for achieving the above object, in a file storage system for implementing a dynamic multi-stage inode structure, comprising a pointer having a different level than the inode information area for storing information about the root inode The pointer includes a double indirect internal pointer having a data block through a double indirect pointer node, a single indirect internal pointer having a data block through a single indirect pointer node, and a direct pointer directly pointing to the data block. It is characterized by allocating the pointer of the root inode through the direct pointer, the single indirect pointer node and the double indirect pointer node according to the increase and decrease of the level according to the size of the block.
[23] In addition, the method of the present invention, in the method of adding a data block of a dynamic multi-stage inode applied to a large file storage system, if an empty data block pointer exists in the root inode being used, a data block for storing a file is allocated. A first step of setting an address of a block of the allocated data to an empty data block pointer; If there are no empty data block pointers on the root inode, and all the pointers of the parent node are in use, and the current node is the root node and all nodes have the same level, increase the level to increase the current root node. A second step of copying a pointer value stored in the inode and generating a pointer for assigning an address value of a new data block; If there are no empty data block pointers in the root inode, the pointers of the parent nodes are not in use, the current node is not the root node, or all nodes are not at the same level. A third step of generating an intermediate node in the upper node and assigning the intermediate node to the first pointer; And a fourth step of setting the data block pointer by reducing the number of data blocks at the root inode with null as the data block pointer and reconstructing the file when it is out of the threshold.
[24] In addition, another method of the present invention is a method for deleting a data block of a dynamic multi-stage inode applied to a large file storage system, wherein the first step of setting a pointer value indicating a data block to null without reconfiguring an inode of a file ; Reducing the number of data blocks in the inode information area of the root inode; And reconstructing the data block and the inode of the file if the total number of data blocks of the current inode is less than or equal to a threshold value that is a ratio of the number of lowest data blocks that the current level can have.
[25] On the other hand, the present invention, if there is an empty data block pointer in the root inode in use in a large file storage system having a processor for the addition of the data block of the dynamic multi-stage inode, the data block for file storage is allocated, A first function of setting an address of a block of the allocated data to an empty data block pointer; If there are no empty data block pointers on the root inode, and all the pointers of the parent node are in use, and the current node is the root node and all nodes have the same level, increase the level to increase the current root node. A second function of copying a pointer value stored in the inode and generating a pointer assigning an address value of a new data block; The data block for saving a file when there are no empty data block pointers in the root inode, and all pointers of the parent node are not in use, the current node is not the root node, or all nodes are not at the same level. A third function of generating an intermediate node from an upper node and setting the intermediate pointer to the first pointer in the intermediate node; And a program for realizing a fourth function of reducing the number of data blocks at the root inode by setting the data block pointer to null and reconstructing the file when it is out of the threshold range. Provide a recording medium.
[26] In addition, the present invention provides a first function of setting a pointer value indicating a data block to null in a large file storage system having a processor for deleting a data block of a dynamic multi-stage inode without reconfiguring the inode of the file. ; A second function of reducing the number of data blocks in the inode information area of the root inode; And a program for realizing a third function of reconstructing the data block and the inode of the file if the total number of data blocks of the current inode is less than or equal to a threshold which is a ratio of the lowest number of data blocks that the current level can have. Provide a readable recording medium.
[27] The objects, features and advantages described above will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.
[28] 1 is a diagram illustrating an embodiment of a multi-stage inode structure in a large file storage system according to the present invention.
[29] As shown in FIG. 1, the structure of the multi-stage inode 100 is for dynamically converting to a size of a data block and includes a root inode 101 and a plurality of indirect point nodes 102 and 103. The root inode 101 is a data block (or extent: used as a data block below) pointer 101B, 101C, which has a different level from that of the inode information area 101A for storing information on the child node. 101D, 101E ...). 101A contains information of the corresponding inode such as inode number, total number of data blocks, 101B is an inner block pointer with two levels, 101C is an inner block pointer with one level, and 101E is a data block. Pointer is a direct pointer, and 101D is not used yet.
[30] That is, the root inode 101 has K (K is a natural number) a plurality of data block pointers 101B, 101C, 101D, 101E .., and these data block pointers 101B, 101C, 101D, 101E. .) Have a multi-level structure to point to each data block 104.
[31] In one embodiment of FIG. 1, the total of n data sums of m * i data blocks indicated through 101B, i (j to j + i-1) data blocks indicated through 101C, and the data blocks indicated by the last 101E. One embodiment points to a block.
[32] That is, the size FS of the file that the inode structure of FIG. 1 can store is determined by Equation 1 below.
[33] FS = n * DS
[34] Where FS is the size of the file the inode can store, n is the total number of data block (extent) pointers, and DS is the size of the data block (extent).
[35] Here, the data block pointer is a double interference internal pointer 101B in which the data block 104 exists through the double interference pointer node 102, and a single interference in which the data block 104 exists through the single interference pointer node 103. It is divided into an internal point 101C and a data block pointer 101E that directly points to the data block.
[36] Here, look at the structure of the data block using the dynamic increase of the level in detail as follows.
[37] FIG. 2 is a diagram illustrating an embodiment of a structure in which the root inode illustrated in FIG. 1 has only a direct pointer.
[38] As shown in FIG. 2, the root inode 101 basically includes a data block pointer 101E in which a plurality of block pointers directly point to a data block together with the inode information area 101A.
[39] Here, assuming that the root inode 101 has a data block pointer of maximum K (natural number), the K value is determined by the following equation (2).
[40] K = (RS-IM) / (PS + PM)
[41] Where RS is the root inode size, IM is the inode information, PS is the block pointer size, and PM is the pointer information.
[42] In other words, the K value is the number of pointers that can be stored in the root node when the portion of the root inode 101 except for the region 101A storing the inode information is divided into the block pointer size and the information for the pointer. Can be obtained.
[43] 2 shows an example in which each data block pointer 101E is used as a direct pointer to indicate a data block 104 when the value of K is 8, and has up to eight data blocks 104 from pointers 0 to 7. Can be.
[44] For example, RS, which is the total size of the inode, is 4096 bytes (4 KB), IM, an area for storing information of the inode, is 100 bytes, the size PS of one pointer is 8 bytes, the information for the pointer is 1 byte, If the size of one data block (extent) is 4096 bytes (4 KB),
[45] K = (4096-100) / (8 + 1) = 444
[46] FS = K (n) * 4,096 Byte
[47] = 444 * 4,096Byte
[48] = 1,818,624Byte
[49] = 1,776 KB
[50] The total number of data block pointers possible without increasing the level is 444, and the total size of the file is 1,776 KB. In other words, if the total size of the file is less than 1,776 KB, all the inode pointers can be used as direct pointers to data blocks without increasing the level. If the file size exceeds this, the level must be increased. .
[51] 3 is a diagram illustrating an embodiment of a structure in which the root inode illustrated in FIG. 1 has a single indirect point node by increasing a level at a direct pointer.
[52] 3 increases the level from 0 to 1 since only the data block 104, which is a direct pointer, cannot be processed when the size of the file is increased and one data block 104C is added. Through 103, a single indirect internal pointer 101C where data block 104 is present is added.
[53] That is, when the ninth data block 104C is added so that all data blocks can no longer be directly processed by the data block pointer 101E, the data block pointer 101E having a level of 0 has a level of 1; Copied to the single indirect pointer node 103, the address value of the newly allocated ninth data block 104C is stored in the data block pointer 101E, which is the second block of the root inode 101.
[54] FIG. 4 is a diagram illustrating an embodiment in which a data block pointer is used except for a single indirect internal pointer in which the root inode illustrated in FIG. 1 has a single indirect pointer node.
[55] As shown in FIG. 4, the data blocks 104 and 104B are continuously added in FIG. 3 to use the data block pointers 101D and 101E of the root inode 101.
[56] That is, FIG. 4 is an inode 101 having level 0 and level 1, where pointer 0 is level 1 and level 1 indirect pointer node 103 and pointer 1 101E to point to eight data blocks 104 as level 1. FIG. Has a direct pointer node at level 0 from pointer 7 (101D).
[57] FIG. 5 is a diagram illustrating an embodiment in which the root inode illustrated in FIG. 1 increases a level to add a single indirect pointer node.
[58] FIG. 5 shows that a single indirect pointer node 103A is used by a single indirect internal pointer 101C of the root inode 101 to again create a second indirect pointer node 103B, and this second single The value of the last data block pointer 101D is copied from the data block pointer 101E to the indirect pointer node 103B.
[59] That is, the data block pointers 101D and 101E of the root inode 101 are used to the pointer of the fifteenth data block 104B shown in FIG. 4, but the sixteenth data block 104D is added. The first single indirect pointer node 103B is created. At this time, the data block pointer value from the second data block pointer 101E of the root inode 101 to the last data block pointer 101D is copied to the generated second indirect pointer node 103B, The address value of the 16th data block 104D allocated for the remaining files is stored in an empty area of the newly allocated single indirect pointer node 103B.
[60] 6A and 6B are a flowchart illustrating a method of adding a data block of a dynamic multi-stage inode using a large file storage system according to the present invention.
[61] 6A and 6B are flowcharts of adding a data block by increasing the level by one step. First, it is determined whether an empty data block pointer exists in a root inode in use (610).
[62] As a result of the determination in step 610, if there is an empty data block pointer of the root inode in use, a new data block is allocated and an address of the allocated data block is set in the empty data block pointer (611, 612).
[63] As a result of the determination in step 610, if there is no empty data block pointer of the root inode being used, it is determined whether all the pointers of the upper node are in use (620).
[64] In operation 620, if the pointers of the upper nodes are all in use, it is determined whether the corresponding nodes are the root inodes 101 and all nodes have the same level (630).
[65] As a result of the determination in step 630, if the corresponding node is the root inode 101 and all nodes have the same level, all possible pointers in the current level are in use, and then a new intermediate node is created after increasing the level ( In operation 631, the pointer value stored in the current root node is copied to the generated intermediate node.
[66] A new data block is allocated from the file storage system (633), an address value of the allocated data block is allocated, and inserted into an empty area of the intermediate node (634).
[67] On the other hand, if it is determined in step 620 that all of the pointers of the upper node are not in use, or as the result of step 630, the current node is the root inode 101 and the level of all nodes is not the same. There is no pointer available at the current level, so the level is not increased.
[68] That is, the data block is allocated from the file storage system (640), the intermediate node is created at the upper node, and then the first pointer is allocated to set the pre-allocated data blocks to the allocated first pointer (641, 642).
[69] 7 is a flowchart illustrating a method of deleting a data block of a dynamic multi-stage inode using a large file storage system according to the present invention.
[70] As shown in FIG. 7, without reconfiguring the inode of the file, first, a pointer value indicating a corresponding data block is set to null (710). In operation 720, the number of data blocks is reduced in the inode information region of the root inode.
[71] In this case, it is determined whether the total number of data blocks of the current inode is less than or equal to a threshold, which is a ratio of the number of lowest data blocks that the current level can have (730).
[72] As a result of the determination in step 730, if the total number of data blocks of the current inode is less than or equal to a threshold value that is the ratio of the number of lowest data blocks that the current level may have, the data block and the inode of the file are reconfigured (740). .
[73] That is, if the number of data blocks decreases by α% or less than the number of initial pointers that the current level may have, the reconstruction is performed. If α is 60%, the lowest value is N. Therefore, the number of data blocks can be reconstructed to a number of N * 0.6 or less.
[74] The method of the present invention as described above may be implemented as a program and stored in a computer-readable recording medium (CD-ROM, RAM, ROM, floppy disk, hard disk, magneto-optical disk, etc.).
[75] The present invention described above is not limited to the stated embodiments and the accompanying drawings, and it is common in the art that various substitutions, modifications, and changes can be made without departing from the technical spirit of the present invention. It will be evident to those who have knowledge of.
[76] As described above, the present invention can effectively solve an empty intermediate node that may occur as the level is increased by using a dynamic allocation technique of processing a pointer for the data block only when the data block (or extent) is increased. Assigning a pointer value from the top of a large file is effective for large files that continue to grow in size by reducing access time to the data blocks (or extents) behind the file. In addition, even in the method of deleting a data block (or extent), the inode can be effectively processed by considering the utilization rate instead of reconfiguring the inode at the time of deletion.
[77] In addition, the present invention minimizes the number of blocks inside the inode that is changed when the size of a large file changes, thereby dramatically reducing the number of blocks to be recovered in the event of a file system failure, thereby ensuring fast recovery. In addition, it can be effectively used for mass storage devices such as Storage Area Network (SAN) or Network Attached Storage (NAS) .In particular, in large file systems that require a large number of data block pointers, unlike conventional techniques such as GFS, null pointer nodes Because the number of files is minimized, the storage device can be used efficiently when storing large files.
权利要求:
Claims (6)
[1" claim-type="Currently amended] In a file storage system for implementing a dynamic multi-level inode structure,
An inode information area for storing information about the root inode and a pointer having a different level,
The pointer includes a double indirect internal pointer having a data block through a double indirect pointer node, a single indirect internal pointer having a data block through a single indirect pointer node, and a direct pointer directly pointing to the data block.
And a pointer of a root inode through a direct pointer, a single indirect pointer node, and a double indirect pointer node according to the increase and decrease of the level according to the size of the data block.
[2" claim-type="Currently amended] In the data block addition method of a dynamic multi-level inode applied to a large file storage system,
A first step of receiving a data block for storing a file if an empty data block pointer exists in the root inode being used, and setting an address of the allocated block of data to the empty data block pointer;
If there is no empty data block pointer in the root inode, the pointers of the parent node are in use. If the current node is the root node and all nodes have the same level, the current node is created in the intermediate node created by increasing the level. A second step of copying a pointer value stored in the inode and generating a pointer for assigning an address value of a new data block;
The data block for saving a file when there are no empty data block pointers in the root inode, and all pointers of the parent node are not in use, the current node is not the root node, or all nodes are not at the same level. A third step of generating an intermediate node from the upper node and setting the intermediate node to the first pointer in the intermediate node; And
The fourth step of setting the data block pointer by reducing the number of data blocks at the root inode with null as the data block pointer and reconstructing the file if it is outside the threshold
How to add a data block of a dynamic multi-level inode using a large file storage system comprising a.
[3" claim-type="Currently amended] The method of claim 2,
The second step,
A fifth step of creating a new intermediate node after increasing the level if there are no empty data block pointers of the node being used and all pointers of the parent node are in use, and the current node is the root node and the levels of all nodes are the same;
A sixth step of copying a pointer value stored at the current root node to the created new intermediate node; And
A seventh step of allocating a new data block and inserting an address value of the allocated data block into an empty area of an intermediate node;
How to add a data block of a dynamic multi-level inode using a large file storage system comprising a.
[4" claim-type="Currently amended] In the data block deletion method of a dynamic multi-level inode applied to a large file storage system,
A first step of setting a pointer value pointing to a corresponding data block to null without reconstructing the inode of the file;
Reducing the number of data blocks in the inode information area of the root inode; And
A third step of reconstructing the data block and the inode of the file if the total number of data blocks of the current inode is less than or equal to a threshold that is a ratio of the lowest number of data blocks that the current level can have
Data block deletion method of a dynamic multi-level inode using a large file storage system comprising a.
[5" claim-type="Currently amended] In a large file storage system with a processor for the addition of data blocks of dynamic multilevel inodes,
A first function of allocating a data block for storing a file if an empty data block pointer exists in the root inode being used, and setting an address of the allocated block of data to the empty data block pointer;
If there are no empty data block pointers on the root inode, and all the pointers of the parent node are in use, and the current node is the root node and all nodes have the same level, increase the level to increase the current root node. A second function of copying a pointer value stored in the inode and generating a pointer assigning an address value of a new data block;
The data block for saving a file when there are no empty data block pointers in the root inode, and all pointers of the parent node are not in use, the current node is not the root node, or all nodes are not at the same level. A third function of generating an intermediate node from an upper node and setting the intermediate pointer to the first pointer in the intermediate node; And
A fourth function of setting the data block pointer to reduce the number of data blocks at the root inode by nulling the data block pointer and reconstructing the file if it is outside the threshold
A computer-readable recording medium having recorded thereon a program for realizing this.
[6" claim-type="Currently amended] In a large file storage system with a processor for deletion of data blocks of a dynamic multilevel inode,
A first function of setting a pointer value pointing to a corresponding data block to null without reconstructing the inode of the file;
A second function of reducing the number of data blocks in the inode information area of the root inode; And
A third function of reconstructing the data blocks and inodes of the file if the total number of data blocks of the current inode is less than or equal to a threshold, the ratio of the number of lowest data blocks that the current level can have
A computer-readable recording medium having recorded thereon a program for realizing this.
类似技术:
公开号 | 公开日 | 专利标题
US9262434B1|2016-02-16|Preferential selection of candidates for delta compression
US20200175070A1|2020-06-04|Low ram space, high-throughput persistent key-value store using secondary memory
US9767154B1|2017-09-19|System and method for improving data compression of a storage system in an online manner
US9965483B2|2018-05-08|File system
US10296219B2|2019-05-21|Data deduplication in a block-based storage system
US9268783B1|2016-02-23|Preferential selection of candidates for delta compression
US9928250B2|2018-03-27|System and method for managing deduplication using checkpoints in a file storage system
US9563654B2|2017-02-07|Dense tree volume metadata organization
US9727573B1|2017-08-08|Out-of core similarity matching
US8918478B2|2014-12-23|Erasure coded storage aggregation in data centers
US10346363B2|2019-07-09|Deduplicated file system
US10031675B1|2018-07-24|Method and system for tiering data
US8892531B2|2014-11-18|Scalable file management for a shared file system
JP2018152126A|2018-09-27|Method and system for storing and retrieving data
JP4975882B2|2012-07-11|Partial movement of objects to another storage location in a computer system
Dong et al.2011|Tradeoffs in Scalable Data Routing for Deduplication Clusters.
US10394757B2|2019-08-27|Scalable chunk store for data deduplication
US20160261694A1|2016-09-08|Method and Apparatus for Tiered Storage
US8447864B2|2013-05-21|Distributed differential store with non-distributed objects and compression-enhancing data-object routing
Guy et al.1990|Implementation of the Ficus Replicated File System.
EP2488950B1|2015-12-02|A tiered data management method and system for high performance data monitoring
US6988124B2|2006-01-17|Locating potentially identical objects across multiple computers based on stochastic partitioning of workload
DE69838367T2|2008-05-29|Parallel file system and method for independent recording of metadata
US6530035B1|2003-03-04|Method and system for managing storage systems containing redundancy data
US5802599A|1998-09-01|System and method for allocating storage in a fragmented storage space
同族专利:
公开号 | 公开日
KR100422801B1|2004-03-12|
引用文献:
公开号 | 申请日 | 公开日 | 申请人 | 专利标题
法律状态:
2001-06-04|Application filed by 한국전자통신연구원
2001-06-04|Priority to KR20010031213A
2002-12-12|Publication of KR20020092550A
2004-03-12|Application granted
2004-03-12|Publication of KR100422801B1
优先权:
申请号 | 申请日 | 专利标题
KR20010031213A|KR100422801B1|2001-06-04|2001-06-04|Method for deleting and adding of dynamic multi-level inode for Huge File Storage System|
[返回顶部]