比利时专利BE1024144B1 IMPROVED CONSTRUCTION OF DATABASE SCHEME MODELS FOR DATABASE SYSTEMS AND REST APIs

专利PDF首页>>比利时专利

专利附录

专利说明

权利要求

类似技术

同族专利

引用文献

法律状态

优先权

专利摘要:
The present invention relates to a method for building a database schema model by a user by means of a computer, comprising the following steps: providing a set of collections and / or optionally one or more relationships that have at least two of said set connect collections; editing one or more of said set of sets, each of which is associated with a schematic definition shown by a single table on an entity relationship diagram on a graphical interface on said computer and comprising at least one object and / or one field, said scheme definition can be edited via a tree structure on said graphical user interface on said computer; automatically generating, by means of said computer, said database schema model for a database system or a REST API; characterized in that said set of sets comprises at least one set comprising a nested object that can be edited via said two-level or more tree structure.
公开号:BE1024144B1
申请号:E2016/5905
申请日:2016-12-06
公开日:2017-11-22
发明作者:Pascal Desmarets
申请人:Integrit Sa/Nv；
IPC主号:

专利说明:

IMPROVED CONSTRUCTION OF DATABASE SCHEME MODELS FOR DATABASE SYSTEMS AND REST APIs
TECHNICAL FIELD
The invention relates to the technical field of database systems, in particular to the construction of database scheme models for database systems and REST APIs.
BACKGROUND
Although many contemporary data storage and data management systems are still exclusively based on relational database systems and the related specific programming language SQL (Structured Query Language), many application developers are switching to database systems that are "Not Only SQL", ie NoSQL (originally: "Non SQL" ) database systems such as MongoDB, Couchbase, CouchDB, Amazon DynamoDB, RavenDB, Microsoft Azure DocumentDB, MarkLogic, and EnterpriseDB. NoSQL databases offer major benefits for application developers: flexibility, fast and easy evolution, as well as the ability to store and access data with minimal effort and setup. This also applies to SQON-based database for storing documents, a class of NoSQL databases relating to the present invention that is organized as a document repository (as opposed to relational database) and is based on JSON (JavaScript Object Notation), a format that is also often used in the implementation of REST APIs (Application Programming Interfaces) and web services (where REST stands for Representational State Transfer, and a system that meets the so-called limitations of REST is called RESTful).
As with traditional databases, planning and setting up a NoSQL database requires a database model (or folder) to help evaluate design options prior to actual implementation. The aforementioned database model helps to think about the implications of different alternatives, to identify potential problems before starting significant amounts of development and to plan ahead to minimize the need for a new job. Ultimately, a modeling process aided by a good database model accelerates developments, increases the quality of the application and reduces implementation risks.
Furthermore, said database model is particularly important in the case of a NoSQL database, since it can handle the increased complexity associated with the data and the (usually, very large) scale of NoSQL databases. Furthermore, the NoSQL-related problem circumscribes that data structures are only implicitly described in the application code, meaning that anyone involved in database management must investigate the code, which is of course not the most productive way for a fruitful dialogue between analysts, architects, designers, developers and DBAs.
Despite the popularity of NoSQL databases, there remains a need in the art for a good database model for this type of databases. The existing traditional methods and tools of traditional relational databases cannot be applied to NoSQL and REST APIs. This argument is actually sometimes used as a justification against the use and implementation of NoSQL solutions.
A previously tried solution tried to model a JSON data repository, but did not provide a graphical way to represent the schemas of the collections. Another tried-and-tested solution offered a tree-like structured diagram for a single collection, but offered no way to create a graphical model of the different collections that make up a database. Another tried and tested solution defines "visual design" of a REST API as "described in a human-readable format - in other words, plain text" and therefore failed to provide a diagram. EP 0 910 024 describes a method, and associated system, for converting object-oriented models into a real-life database. The method and system automatically convert object-oriented models into inputs that are compatible with an intended product data management platform. A problem with EP 0 910 024 is that it is limited to relational database schema models, without means to deal with typical features of NoSQL environments, such as nested objects. US 5,596,746 describes an algorithm for transforming database schema information into object model notation with the aid of meta models of the database tables and target object model notation for bridging the gap between the database schema representation and the object model representation. Similar to EP 0 910 024, US 5,596,746 is limited to relational database schema models, and therefore has no mechanisms for dealing with No-SQL related attributes such as nested objects.
The present invention aims at solving at least some of the above problems. It is therefore an object of the invention to provide a method and system for graphing a model of schemas for NoSQL document databases and REST APIs that are both functional and clear, thereby enhancing the work of, and the dialogue between, analysts, architects, designers, facilitates developers, testers and operators of systems based on such technologies.
SUMMARY OF THE INVENTION
In the database community there is an increasing need for database management systems (DMBSs) that support a combination of data models, relational and non-relational, in a single DBMS platform. Convergence between RDBMSes and other data models (e.g., a NoSQL document database) is therefore taking place, creating the need for multi-model database schema models, and new methods suitable for the construction of a wide range of database schema models, including classic and multi- model database schema models. The present invention is directed to such methods and systems for building a database schema model. The present invention particularly relates to a GUI-supported construction method of database schema models that extends the typical, table-like representation of collection, known e.g. from both RDBMSes and NoSQL document databases, to the concept of nesting one or more objects within a single collection with tabular representation, which is not possible with methods according to the prior art. Processing of said single set is enabled by a related tree structure that represents the schematic definition of said single set. In this regard, the present invention offers a very general solution to a known problem, such as the basic problem example presented below in the "Detailed Description" section (Example 2), which provides a clear and accessible introduction to the general concept.
In a first aspect, the present invention relates to a method for building a database schema model by a user by means of a computer, comprising the following steps: (a) providing a set of collections and optionally one or more relationships that have at least one connects two of the said series of collections; (b) editing one or more of said set of collections, each of which is associated with a scheme definition represented by a single table on an entity relationship diagram on a graphical interface on said computer and comprising at least one object and / or one field, said schematic definition being editable via a tree structure on said graphical user interface on said computer; (c) automatically generating through said computer the said database schema model for a database system or a REST API; characterized in that said set of sets comprises at least one set comprising a nested object that can be edited via said two-level or more tree structure.
The said "series" can herein comprise one or more sets. In a preferred embodiment of the method according to the present invention, said set of sets comprises at least two sets and step (b) comprises generating and / or processing one or more relationships connecting at least two sets belonging to said set collections through said entity relationship diagram on said user interface on said computer; and at least one of said one or more relationships connects a first field that belongs to a set of sets that belongs to said set of sets to a second field that belongs to a second set that belongs to said set of sets.
In the context of the present invention, the term "collection" includes any tabular representation of data within the context of databases and database management systems (DBMSes), such as a collection in a NoSQL document database, a REST API object, or a "classical" table in a relational database management system (RDBMS). Regardless of whether said collections are generated by the user via said graphical user interface or whether said collections are imported from another format with respect to e.g. a NoSQL document database, a REST API or an RDBMS is the graphical interactive representation on a graphical user interface of said collections and their interrelationships, an integral part of the present invention.
A tree structure is used for the aforementioned graphic interactive representation of the aforementioned collections. The term "tree structure" here includes a tree-like structured diagram, the said diagram comprising nodes and branches. Here, one of the mentioned nodes is a main node which is at the so-called highest level, connected to different nodes at a lower level by branches, which nodes, in turn, can be connected to branches at even lower levels. Here the number of levels is tested excluding the top level. A node that is not connected to lower-level nodes is called a leaf node. Each node can comprise a field or an object, an object at a certain level comprising different nodes at a level lower than the said determined level. A nested object is therefore an object that is enclosed as a node within a larger collection, the presence of said nested object resulting in two levels or more for the tree structure of said larger collection. In prior art methods, said nested object is not found in RDBMSes, since they assume "flat tables", limited to a top-level table title, and one or more first-level attributes. This is a result of the fact that said nodes in said tree structure, in RDBMSes, all relate to fields and not objects, where fields are standard leaf nodes. Similarly, the said tree structure with the said two levels is found in NoSQL document DBMSes.
The user's interaction with said graphical interactive representation of said collections allows a variety of interactive manipulations, such as the editing of properties of each node, the rearrangement of nodes within a collection, and the movement of nodes from one collection to another collection.
An entity relationship diagram is used for the aforementioned graphical interactive representation of the mentioned relationships. Here, the term "entity relationship diagram" must be interpreted in the context of DBMSes databases, with entities corresponding to collections, and relationships between a first field in a first collection and a second field in a second collection. This type of graphic interactive representation is partially found in RDBMS and NoSQL methods according to the state of the art. A related graphical representation is found in many RDBMSes, where each collection (a 'flat table') is a single entity that relates to one or more other collections through a foreign key relationship, including the concepts of a foreign key and a primary key known to those skilled in the art. In a preferred embodiment of the present invention, two sets can be linked by more than one relation. Hereby the mentioned relationships can relate to foreign-key relationships, but they can also relate to foreign-master relationships. Linking collections through more than one relationship is desirable in view of the presence of nested objects, which usually exhibit a plurality of relationships, for example due to the application of so-called denormalization as described in this document. Since RDBMSes does not allow nesting of objects, there is less need for more than one relationship between two sets. Moreover, the reason that said graphical interactive representation of said relationships is not found in prior art NoSQL methods is in the fact that NoSQL models inherently have no scheme, where collections are mainly addressed as isolated entities, without a useful way for defining relationships between a first field of a first set and a second field of a second set.
The user's interaction with said graphical interactive representation of said relationships also allows a variety of interactive manipulations, such as the editing of properties of each relationship and the rearrangement of collections to generate new relationships or to edit existing relationships .
In general, the present invention has the advantage that a very general structure can be maintained, such as a JSON structure, through improved structure (fewer collections, only "real" relationships) (especially in the ERD), sequence retention (both in the ERD and the tree structure), interaction (as in the ERD as the tree structure), and indentation (representing nesting) (especially in the ERD), while methods according to the prior art do not. The result of the present method is not only true to the corresponding general structure, such as a JSON document definition, but is also more compact in terms of ERD, allowing a clearer and better overall overview. In addition, the interactive editing of the database schema model is also more flexible, with richer interaction thanks to the choice between the ERD and the tree structure. This is discussed in more detail in Example 2. A related advantage is the ability to handle polymorphisms better, as discussed in more detail in Example 4.
An important advantage of an embodiment of the present invention is that the user is no longer limited by the limitations of prior art NoSQL and RDMBS methods, allowing both nesting of objects and the generation of versatile relationships. In a preferred embodiment, individual collections (which may relate to NoSQL collections, but may also relate to REST API objects) can be edited in JSON schema definition via a tree structure, while relationships between collections can be edited simultaneously via said entity relationship diagram. In a further preferred embodiment, this leads to a single integrated interactive representation as illustrated, e.g. by Figs. 60, 61, 63, 65, 66 and 68, which greatly improves the efficiency with which the said database schema model can be built and edited. By providing such an explicit representation, the present invention also solves the problem with known NoSQL methods, where data structures are only tacitly described in the application code.
A related advantage of the present invention is the ability to model relationships. As mentioned above, NoSQL models inherently have no scheme, whereby collections are primarily addressed as isolated entities, without the ability to define relationships between a first field of a first set and a second field of a second set. The latter is particularly problematic when a user wants to import a relational database schema model into a NoSQL environment, or, for whatever reason, wants to follow an RDBMS-like approach by linking collections. While the foreign-key relationships included in the relational database scheme usually play an important role in the organization of data, there is no concept available in NoSQL to create a model for these or other relationships. In a preferred embodiment of the present invention, the database schema model allows to generate and process foreign-key relationships, and, related, to keep foreign-key relationships intact when relational database schema models are imported.
In a further preferred embodiment of the method according to the present invention, said one or more objects comprise an array and / or a subdocument. This makes it possible to incorporate structures that are known from NoSQL. This also plays an important role in the denormalization described below.
In a further preferred embodiment, said database schema model is formatted according to any or any combination of the following formats: JSON Schema, YAML, Mongoose, RAML, Swagger, Apache AVRO or Parquet. This has the advantage that it offers the user a format that matches his or her specific needs.
According to another preferred embodiment, said automatic generation in step (d) automatically generates a human-readable manual for said database schema model based at least in part on a logical link defined by said one or more relationships. This has the advantage that it facilitates the automatic generation of manuals by including the information inherent in the relationships in the generated manual.
In another preferred embodiment, said first set and said second set are linked by more than one relationship generated and / or edited in step (c). This has the advantage that limitations imposed by traditional methods according to the state of the art can be further removed. Unlike NoSQL DBMSes, the database schema model of the present invention explicitly supports relationships between collections; Unlike RDBMSes, two collections can be related by more than one relationship. In a related further embodiment, said first field belonging to said first set is linked to both said second field and a third field belonging to said second set respectively by a first and second relationship. By allowing multiple relationships associated with a single field, which are preferably shown in a single integrated graphic interactive representation, the flexibility for the end user is increased.
In a further preferred embodiment, said provision in step (a) comprises generating one or more of said set of sets on said graphical user interface on said computer. The user can therefore choose to generate one or more collections directly on the screen, and build a database schema model based on that. This has the advantage that the user can manually intervene in the aforementioned construction, or even build a database schema model from the start.
According to a further embodiment, said provisioning in step (a) comprises denormalizing a relational database schema model provided by the user, said denormalizing comprising extracting said set of sets from a set of tables belonging to said relational database schema model, as well as any foreign-key relationships that belong to the said relational database schema model, for inclusion in the said database schema model. This has the practical advantage that a user who wants to build a database with NoSQL functions such as the said database schema model can take an existing available relational database schema model as a starting point, as is often desirable in practice. By "importing" both sets and foreign-key relationships, all vital aspects of the relational database are transferred to the aforementioned database schema model to be built. According to a related further embodiment, said denormalizing in step (a) further comprises processing a foreign key relationship between an upper set comprising a primary key and an underlying set comprising a foreign key, said processing comprising any or any combination of the the following comprises: including a nested subdocument in said child collection, wherein said nested child document includes one or more relationships in said parent collection; including a nested array in said parent set, said nested array comprising one or more relationships in said child set. This has the advantage that the functions of the relational database are translated "without loss" into a context with nested objects, which is not possible in an RDBMS. This can be done according to different strategies that are included above and which are further explained in the description below. In one sense, this translation includes an "upgrade" of the original relational database schema model, because the original collections and relationships are preserved, while the possibilities of interacting with said database are extended to include NoSQL functionality. In a further related embodiment, said foreign key relationship processing comprises selecting one or more denormalization preferences on said graphical user interface on said computer, said denormalization preferences comprising any or any combination of the following: a table selection; an embedding preference with respect to said incorporation of said nested subdocument into said child collection and / or said nested array in said parent collection; a maximum number of stacked levels; a maximum number of recursion levels. This has the advantage that the user can adjust the way in which the original relational database is translated into the said database schema model, with a view to a more concise result (e.g. fewer tables, fewer nesting, fewer recursions) or a more extensive result (more tables, more nesting, more recursions).
In a second aspect, the present invention provides a computer-implemented method for building a database schema model upon user input according to a method according to the present invention, comprising the following steps: (i) receiving a series of collections and / or optionally one or more relationships that link at least two of the said set of collections, optionally upon entry of said user via a graphical user interface; (ii) receiving processing instructions for each of said set of sets regarding a scheme definition upon input of said user via a tree structure on said graphical interface; (iii) generating, upon request of said user, said database schema model for a database system or a REST API; characterized in that said set of sets comprises at least one set comprising a nested object that can be edited via said two-level or more tree structure.
In a preferred embodiment of said computer-implemented method, said set of sets comprises at least two sets, and step (ii) receives generation and / or processing instructions relating to one or more relationships comprising at least two sets links belonging to said set of collections upon entry of said user via said entity relationship diagram on said graphical user interface. Furthermore, at least one of said one or more relationships links a first field that belongs to a first set that belongs to said set of sets with a second field that belongs to a second set that belongs to said set of sets.
In a third aspect, the present invention provides for the use of a database schema model built with a method according to any of the present invention in a database system or a REST API. In a preferred embodiment thereof, said database system is any or only combination of the following: MongoDB, Couchbase, CouchDB, Amazon DynamoDB, RavenDB, Microsoft Azure DocumentDB, MarkLogic, EnterpriseDB, Oracle SQL, MySQL, PostgreSQL, Microsoft SQL, Oracle for NoSQL , ElasticSearch, Snowflake, FinchDB, MariaDB, IBM Cloudant, Google Cloud Datastore, Cassandra, BerkeleyDB, RethinkDB, Mapr. This has the advantage that it provides a superior user experience because of the integration of the present invention into the work environments to which it applies, which results in better user comfort for the user.
In a fourth aspect, the present invention provides a computer system for building a database schema model upon user input, said computer system having a processor, a non-volatile memory, program code on said memory for execution on said processor, a graphical interface wherein said computer is configured to perform a computer-implemented method according to the present invention. The advantages of such a computer are similar to those of the related computer implemented method.
DESCRIPTION OF THE FIGURES
The present invention can be easily described with reference to the accompanying figures, wherein:
Figure 1 illustrates trends in the rise of JSON.
Figure 2 illustrates the domain model.
Figure 3 illustrates the graphical user interface.
Figure 4 illustrates the menu bar.
Figure 5 shows the file menu (item: Export).
Figure 6 shows the file menu (item: Reverse Engineer).
Figure 7 illustrates the Object Browser.
Figure 8 shows a first view of a middle box, showing an example of an entity relationship diagram.
Figure 9 shows a second view of a middle box, showing an example of a tree-like structured diagram.
Figure 10 shows a third view of a middle section, showing an example of a schedule view.
Figure 11 shows a fourth view of a middle box, showing an example of a JSON preview.
Figure 12 shows a fifth view of a middle box, showing an example of a database creation script.
Figure 13 shows a sixth view of a middle section, showing an example of a database documentation.
Figure 14 shows examples of upper tabs of the middle compartment. Figure 15 shows the first set of lower tabs of the middle compartment (corresponding to the upper database tab).
Figure 16 shows the second set of lower tabs of the middle compartment (corresponding to the upper set tabs).
Figure 17 shows the contextual menus (item: Align).
Figure 18 shows the contextual menus (item: Add child level). Figure 19 shows the contextual menus (item: Reference).
Figure 20 shows a first view of the properties at the database level.
Figure 21 shows a second view of the properties at the database level. Figure 22 shows a third view of the properties at the database level.
Figure 23 shows a first view of the properties at the collection level. Figure 24 shows a second view of the properties at the collection level. Figure 25 shows a third view of the properties at the collection level. Figure 26 shows a first view of the properties at field level.
Figure 27 shows a second view of the properties at field level.
Figure 28 shows a third view of the properties at field level.
Figure 29 illustrates a first step in creating a model.
Figure 30 illustrates a second step in creating a model.
Figure 31 shows the 'Details' tab of the properties of a database. Figure 32 shows the 'Relationships' tab of the properties of a database. Figure 33 shows the 'Users' tab of the properties of a database. Figure 34 shows a first step in creating and editing a model. Figure 35 shows a second step in creating and editing a model. Figure 36 shows a third step in creating and editing a model.
Figure 37 shows the 'Details' tab of the properties of a collection. Figure 38 shows the 'Users' tab of the properties of a collection.
Figure 39 shows the 'Indices' tab of the properties of a collection. Figure 40 illustrates the creation of fields for a collection.
Figure 41 illustrates the non-root elements that can be created. Figure 42 shows the properties of a field.
Figure 43 shows a first step in adding entries with a 'plus' icon in the field properties.
Figure 44 shows a second step in adding entries with a 'plus' icon in the field properties.
Figure 45 shows the case of multiple types in the field properties.
Figure 46 shows the reference option in the contextual menu.
Figure 47 shows a graphic view of the definitions at the collection level.
Figure 48 shows a graphical view of the definitions at the database level.
Figure 49 shows an example of a schedule view.
Figure 50 shows an example of a JSON preview.
Figure 51 shows an example of documentation.
Figure 52 shows an example of a database script.
Figure 53 illustrates the export function.
Figure 54 shows the Print dialog box.
Figure 55 shows the Print Setup dialog box.
Figure 56 shows the options interface.
Figure 57 shows the JSON schema view configuration.
Figure 58 shows two 'flat tables' in a source model.
Figure 59 illustrates a foreign key relationship.
Figure 60 shows a denormalization result.
Figure 61 shows a denormalization result.
Figure 62 illustrates a foreign key relationship.
Figure 63 shows a denormalization result.
Figure 64 shows three 'flat tables' in a source model.
Figure 65 shows a normalization result.
Figure 66 shows a denormalization result.
Figure 67 shows a 'flat table' in a source model.
Figure 68 shows two denormalization results.
Figure 69 shows three 'flat tables' in a source model.
Figure 70 illustrates the menu access.
Figure 71 shows a warning regarding saving the current model.
Fig. 72 shows a parameter interface with respect to the denormalization process.
Figure 73 shows an exemplary problem with a solution according to the present invention.
Figure 74 shows the "JSON document to model" of the example problem. Figure 75 shows the "Modeling with traditional ER software" of the example problem.
Figure 76 shows the "Modeling according to the present invention" (part 1 of 2) of the exemplary problem.
Figure 77 shows the "Modeling according to the present invention" (part 2 of 2) of the exemplary problem.
Fig. 78 shows an exemplary interface with respect to reference and denormalization.
Figure 79 shows an example with reference to denormalization. Figure 80 shows an example ERD.
Fig. 81 shows an example of a tree diagram.
DETAILED DESCRIPTION OF THE INVENTION
In the present document, the term "document" refers to a collection; in particular, data belonging to a collection is usually stored in one or more documents, preferably JSON documents. In this document, both "foreign-master relationships" and "foreign-key relationships" fall under the "relationships" class, where foreign-master relationships relate to relationships that are not foreign-key relationships. Furthermore, the acronym "GUI" stands for graphical user interface (Graphical User Interface).
An important objective of the present invention is to provide programmers, application developers, and database administrators with a powerful new method for building database models. To this end, the present invention combines the graphical interactive representation of collections in a tree structure with the graphical interactive representation of relationships between collections in an entity relationship diagram. Together these graphical representations offer both the schematic model and the documentation of that model. The present invention facilitates the work of, and dialogue between, various users such as analysts, architects, designers, developers, testers, and operators of systems based on such technologies. In a preferred embodiment of the present invention, schemes and documentation may be generated in various machine-based formats such as JSON Schema, Mongoose, collection definition scripts, RAML, Swagger, YAML, Apache AVRO or Parquet, or human-readable formats. In a further preferred embodiment, the user is assisted in importing a relational database schema model, and its translation into a database schema model according to the present invention.
According to another aspect of the present invention, which is not intended to limit the scope in any way, the present invention provides a method and system for creating a graphical model of a database schema model for NoSQL document databases and REST APIs. In particular, it provides a method and system for combining 1) the graphical representation of collections (in NoSQL Document databases) or objects (in REST APIs) by an Entity Relationship Diagram, with 2) the graphical representation of the JSON Schema definition of the said collections or objects by a tree-like structured diagram. Together, these graphical representations provide the schema model of a NoSQL document database or REST API, and the documentation of that model. The invention is further described by the following non-limiting example, which further illustrates the invention, and is not intended, or should not be construed, as limiting the scope of the invention.
According to another aspect of the present invention, which is not intended to limit the scope in any way, the present invention provides a method for creating a graphical model of schemas for NoSQL document databases and REST APIs, including providing of a graphical representation of one or more collections or one or more objects, and simultaneously providing a graphical representation of a JSON Schema definition of said collections or objects by a tree-like structured diagram.
According to another aspect of the present invention, which is not intended to limit the scope in any way, the present invention provides a system for creating a graphical model of schemas for NoSQL document databases and REST APIs, said being system is configured to provide a graphical representation of one or more sets or one or more objects, and to simultaneously provide a graphical representation of a JSON-Schema definition of said sets or objects through a tree-like structured diagram.
The present invention is directed to the construction of database scheme models, wherein the database scheme model can be a simple known representation such as a single JSON document, but also a "multi-model" database scheme model, as long as the database scheme model exhibits said nesting according to a two-level tree structure or more. The term "multi-model" herein refers to a database schema model that includes features of both relational databases and NoSQL databases, and therefore cannot be assigned to only one of these categories, as is currently emerging in the database community.
The present invention relates to database schema models such as those considered by MongoDB and other NoSQL systems such as DynamoDB. Various specific terms have been derived from the context of existing systems, in particular MongoDB. However, it is important to note that the terms used in the present invention have been chosen in accordance with MongoDB purely for the sake of clarity, and that they should not be interpreted as limiting the invention in any way. The term "collection", "document", "field", "subdocument" and "array" are therefore similar, but not necessarily identical to the equivalent terms with the same name in MongoDB and much more common NoSQL systems and / or RDBMSes. In addition, some NoSQL systems use different terms; DynamoDB in particular uses a different agreement on names. Couchbase and Cassandro also use a slightly different terminology. The following list of equivalent terms is included for clarity, without wishing to limit the present invention in any way.
Below you will find four examples of embodiments of the present invention. Although Example 1 focuses on a multitude of aspects of the invention, Example 2 considers the simple case of a single JSON document with nesting, to which the invention also applies. Example 3 relates to reference in denormalization. Example 4 considers an example of ERD with associated tree structure according to an embodiment of the present invention. EXAMPLE 1 1. Project drivers
The project relates to the activities of IntergrIT (doing business like Hackolade). 1.1 References
The following description may be useful for informing the reader of the context: JSON and BSON specifications: http://json.org/ and http://bsonspec.org/ JSON Scheme specification: http: // json- schema.org/
Trends in the rise of JSON, cfr Fig 1 1.2 The user problem and context of the project 1.2.1 NoSQL databases without schema JSON-based document repository NoSQL databases (MongoDB, Couchbase, CouchDB, Amazon DynomoDB, MarkLogic, OrientDB, RavenDB, Microsoft Azure
DocumentDB, EnterpriseDB, etc., cf. http: // db- engines.com/en/ranking/document+store) promote the concept of a "no-schema" approach with many benefits, such as the flexibility for the schema to evolve easily over time, or the ability to start saving and opening data without first defining a schedule.
The data transformations and data characteristics in NoSQL systems are often categorized. The tools for such modeling did not exist. The data is often recorded in models; it is implied in the code. To understand how the data is stored and read, the code must be examined. The model is essentially "implicit" for the programmers and developers working on the system: it is in their brains and is then manifested in the code of the system.
The advantage of a DBMS without schedules is that you can start saving and opening data without first defining a formal schedule. While this sounds fantastic, and certainly facilitates a quick start to a data project, experience shows that a lack of prior thinking is usually followed by much later thinking. As data volumes increase and opening times become significant, consideration must be given to the reorganization of data to speed up opening and updating, and sometimes to change the interaction between the speed, consistency and atomicity of different access styles. It is also common for patterns to be created in the structure of data, and there is a growing realization that, although the DBMS does not require a specific data scheme, much of the data that is stored has a certain significant scheme in common. 1.2.2 Use of schema definitions
A data model of a certain type is vital: definitions, elements, transformations and relationships still need to be understood and documented. There is no way an organization can get a 360 ° view of their entire business at all data, system, application, decision process transaction and customer levels without any kind of models or cards to explain them.
A database model offers the opportunity to play with database design options on paper, on a board, or in a drawing instrument, before one has to worry about the syntax of data definitions, data already stored or application logic. It is a fantastic way to think about the implications of data organization and / or to recognize important patterns in the data before choosing a specific design. It is much easier to redraw part of the model than to recreate a database schema, move significant amounts of data, and change application code.
A scheme is a carrier for describing content for determining compliance. It is often used under the guise of part of research content for entry into a program through a validation process.
It can also be used effectively as a tool for creating content and speeding up the correction process. This is especially true for content created by people or generated by interactive systems, where dynamic processes are part of the content definition.
Sometimes the use of a scheme or other data definition capacity is seen as "locking" a system, or rendering it inflexible. With the expressiveness of JSON-Schema, however, the purpose of the scheme is not to limit flexibility, but rather to correctly express only what is required from the data content, to create useful communications of corrections that are required, and to leave the rest of the content to be interpreted by the programs. 1.2.3 Benefits of using a schedule
Validation can be done with program logic. A significant part of many programs is used to validate entries that must be processed by the program. This includes processing startup commands, reading configuration files, reading data files, receiving messages and accepting user input.
In any case, the purpose of the validation process is to determine whether the content entered is correct and complete before starting further processing. The use of a scheme, and a corresponding validation function to apply the scheme to its content, allows the validation function to be processed by a purpose-built function rather than custom-written code. The schema definition, which is in the domain of data definition, is more expressive and easier to understand than program logic.
With simplified definition and maintenance, the chance is greater than will be validation: • Fuller since the time to produce schema content can be shorter than the time to write program logic to perform this task, • Easier to read. Reading program logic for validation often requires reading a mix of stream processing, control of the type and control logic of the expressions. Reading a diagram focuses on the data representation, without the other processing logic being involved. • Properly used by others who produce content. Often program logic is a black box, or documentation is limited, making it difficult to ascertain whether all valid entries for certain elements, whether elements are mandatory or optional, without providing excellent documentation. Providing a schedule for other parties makes the task of creating correct content to deliver to the program much easier. Not all content benefits equally when using a schedule. A program that has only one or two configuration options does not have a significant amount of validation code, and a program that expects content that is free of form may not have a sufficient definition to be very useful. Selecting the programs and places in the program where scheme definitions can be useful is a design choice. 1.2.4 JSON schedule
In the space for exchanging messages, JSON is often used to implement REST web services. Its broad use reflects the flexibility of JSON-formatted content to support iterative definition for message exchanges between systems, the co-existence of a range of technologies and the ease of understanding.
In addition to configuration files, many programs also have data management requirements. While some programs have requirements suitable for using databases, others have more moderate requirements and are better suited to the flexibility of using JSON files. As the content expands (both in the structure of the data model and the amount of data being stored), there is a higher risk of data consistency for JSON-capable programs. As is the case with many configuration files, data updates can also be done without text editing tools and are often subject to errors - often minor ones. Finding these errors often requires more work than correcting them - a missing comma, incorrectly placed closing} or], or a typo in a keyword. Fortunately, there are two tools to tackle this properly: • JSON syntax checkers, which find errors in the syntax. • JSON Schema, and the associated validation tools, which find errors in the content.
Although JSON is increasingly accepted in a wide range of programming languages / runtime platforms, the use of JSON-formatted files for configuring files and similar uses is now an available option for many projects. JSON and JSON schema are strong fundamental technologies, with many opportunities for building interesting and useful capabilities around them. 1.3 Objectives
The present invention allows analysts, solution designers, architects, developers, and database administrators to visually design documentation, model around it, define and create for the schemas of JSON document-based NoSQL databases and REST APIs. The related application allows users to do with JSON documents what has been possible for relational databases and XML documents respectively.
The graphical interface allows the user to visualize a diagram of the model collections and their relationships, as well as the contents of the collections. The outputs of the model are: JSON documents and JSON-Schema v4-compliant JSON (http://json-schema.org/documentation.html), scripts (MongoDB, Mongoose, etc.), and a detailed documentation of the model in HTML (as well as Markdown, PDF and RTF formats) as well as RAML and Swagger.
The application will store in a local or central repository the properties of one or more documents defined by the user, as well as their relationships, if applicable. 2. Requirements 2.1 Business requirements 2.1.1 Domain
Cf Fig 2. The application allows a user to create a Model. The model is made from one Diagram for one Database (for MongoDB and other NoSQL systems) or Application (for REST APIs). A Database is made from one or more Collections (or Documents) and possibly from logical Relationships between them. A Collection consists of Fields and Objects. It is possible for Relations to consist of certain Fields from different Collections. The Database and each of its Collection, and each of the Fields and Objects of the Collection, as well as Relationships, all have Properties. Collections and Relationships are presented in a Diagram. Diagram and properties lead to outputs in the form of Schemas, Examples, Scripts and Documentation files. 2.1.2 Cases of use for companies • Save / retrieve models o User creates a new model from the beginning o User retrieves a previously saved model o User saves change since it was last saved or cancels it o User saves a model under a new name as a newly saved version of the previously saved model • Modeling o User defines a database and its properties
Graphically adjusts the layout of entities o User creates Collections, their contents and related properties
Graphic
Without grid
Through the text-based editing of the JSON scheme o User creates Relationships and their properties o User can visualize and print different outputs:
Model entity relationship (ER) diagram
Collection document type definition (DTD) diagram JSON document example JSON scheme HTML / Markdown / PDF / RTF documentation
MongoDB script
Mongoose schedule
Scripts for other database vendors • Utilities o Where-used capacity o JSON validator o JSON schema validator o Model comparison o Reverse engineering of a database schema o Reverse engineering of JSON documents and JSON schema files o Storage location o License key management o Checking for and installing of application updates
With regard to the aforementioned Model Entity Relationship (Er) diagram, we note that, strictly speaking, the concept of ER is generally related to relational databases, and not JSON or NoSQL databases. For ease of communication, however, the concept is used in this document, but the vocabulary will not be used in the application.
With regard to the aforementioned Collection Document Type Definition (DTD) diagram, we note that the concept of DTD is strictly related to XML and not JSON. For ease of communication, however, the concept is used in this document, but the vocabulary will not be used in the application. 2.2 Functional requirements 2.2.1 User interface 2.2.1.1 Application sections
The application screen is divided into different parts, cf. Fig. 3.
The different parts are described below. 2.2.1.1.1 Menu bar
Cf Fig 4. These menus are configured in JSON documents to allow easy changes to relevant properties without impact on programming. Options are grayed out if they are not available to the user in the context of the application.
Each menu option is described in more detail below. 2.2.1.1.1.1 File menu Cf. Fig 5 & 6.
2.2.1.1.2 Object browser
Cf Fig 7. This course is dynamically built based on entities created in the Central and Properties courses. It provides a structured overview of the model, and quick access to elements of the model. This box also includes a search bar as well as an 'where used' image of the elements.
If the data does not fit in the box, a vertical and horizontal scroll bar appears.
When elements in the course are clicked, the courses show centrally and properties show more information.
Thanks to a toggle in the View menu, the user can choose to hide the box (to provide more room for the central box), or make it appear again. 2.2.1.1.3 Central course
The Central course has different purposes:
Graphic view of the database entity relationship diagram, cf. Fig. 8 Graphic view of the field hierarchy diagram for a set, cf. Fig. 9 Graphic view of database and set definitions, cf. Fig. 8 Grid view of the fields of a set, cf. fig. 10 JSON example of an example document for a collection and corresponding scheme, cf. fig. 11
Example of a script for creating a database for a collection, see fig. 12
Documentation of database and collection, see Fig. 13
The box has 2 sets of dependent tabs: upper and lower.
The upper set has one fixed tab, the tab for the model database, and one tab for each collection that has been created or consulted, cf. Fig. 14. The collection can be closed, reopened and moved, while the database tab remains fixed. If too many tabs are open to fit the width of the box, depending on the library used, the tabs will retain their size (full length of Collection Name) with left / right arrows so that the user can scroll (Firefox style) ), or the tabs will become smaller with a tooltip (Chrome style).
There are 2 lower sets of fixed tabs, one set of the lower tab for the upper database tab, cf. 15, and one set of lower tabs for the upper set tabs, cf. Fig. 16.
The lower tabs cannot be closed or moved.
The central box has both a vertical and a horizontal scroll bar if the data or diagram does not fit in the box, especially in responsive mode.
The central box can display graphical data (database and collection), a timetable, JSON (example / scheme), documentation or database script. These tabs are functionally linked, and changes in one tab can affect the other, and vice versa. They must be dynamically updated.
The box is permanent (cannot appear or disappear like the Object Browser or the Property Box.) However, its size can vary according to the appearance or disappearance of the Object Browser and / or the Property Box. The user can manually change the width distribution of the 3 compartments.
The Central Section supports contextual menus (right-click), cf. fig. 17-19.
These contextual menus are configured in JSON documents to allow easy changes to relevant properties without impact on programming. Options are grayed out if they are not available to the user in the context of the application.
Arrow keys control the following behavior:
In the ER diagram when a collection is selected: move In the DTD diagram: move from one element to another: o Left and right arrow keys: move more superficially or deeper into the tree structure o Arrow keys up and down: the elements of the move current level in the tree structure 2.2.1.1.4 Property box
This box is where most data is entered. It can have different configurations depending on the element being edited / displayed.
There are 3 levels of property boxes: database (including relationships), collection, and field. The appropriate property box is displayed depending on the element selected in the Central Box or Object Browser. The property boxes for database and for collections have fixed bottom tabs.
Properties at the database level, cf. fig. 20 to 22 Properties at the collection level, cf. fig. 23 to 25 Properties at the field level, cf. fig. 26 to 28
Properties are checked by a JSON scheme and allow easy changes to relevant properties without impact on programming, and property entries are stored in each definition of the Collection in a JSON document.
Fields can be edited directly (no buttons for editing / canceling / saving).
If the data does not fit in the box, a vertical and horizontal scroll bar appears. 2.2.2 Core functions 2.2.2.1 Creating a model
A model is made of a diagram and relationships that represent a database of collections that consist of fields.
When a new model is created from the beginning, usually for a database that forms the basis of an application or API, the user is shown a blank screen, cf. Fig. 29.
The top tab without the name of the Central box is active, as well as the bottom Chart tab.
At this point the user starts by entering the properties of the database, cf. fig. 30. Creating the name in the Property field works the name in the fixed tab of the Central plane, in the Object browser, and in the Title bar the application (at the very top left) automatically updated 2.2.2.2 Properties of a database 2.2.2.2.1 Details
Most of the information that is stored here is for documentation purposes only, or metadata for internal use by the application, cf. Fig. 31. Some can be used when creating a script. 2.2.2.2.2 Relationships
Relationships are links between different Collections of a Database. As a reminder, unlike an RDBMS, relationships are not explicit in a NoSQL database without schemas, but in the application code. However, they can (and should) be documented here, and serve as a basis for design and discussion of problem solving, cf. Fig. 32.
This tab is not enabled until 2 or more Collections have been created and documented in the database. There are 2 types of relationships: foreign key and foreign master.
Just as in an RDBMS, a foreign key relationship refers to the unique identifier of a document in another Collection in the database. In particular with the release of MongoDB v3.2 and the introduction of the $ lookup functionality and the BI-Connector, the documentation of foreign keys becomes much more relevant.
A foreign-master relationship documents a denormalization, or the repetition of the contents of a field in another collection. The source is considered the master (or higher level) and the lower level is where the data is duplicated. Particular attention must be given to ensuring that the underlying levels are updated when the data in the master changes.
Important note: there can be more than one relationship between 2 tabs. There may even be different relationships that refer to one field in a parent table, coming from different fields in a child table. 2.2.2.2.3 Users
The only information that is useful for a database (besides the name in the details section) is user certificates.
Extra certificates can be saved at Collection level, see fig. 33.
Note that the security rights can change from version to version, and therefore must be linked to a version number. 2.2.2.3 Creating a Collection
The next step is to create a Collection. This can be done in 2 ways:
By clicking on the Add collection button in the toolbar,
By right-clicking anywhere in the Central box to see a contextual menu, and choosing the "Add Collection" option, cf. fig. 34.
An empty collection now appears in the Central box, cf. fig. 35. A tab appears at the top of the Central vka, but it is not activated. An entry is created in the Object Browser hierarchy.
In order for the user of the new collection to be able to start editing, 3 options are now offered: - double-click on the Collection window in the central box diagram, - click once on the Collection line in the Object Browser, - or click once on the Collection tab at the top of the central section.
The upper tab Collection is now active, while the lower tab Schedule is active, and the Central box shows a single fixed element: the root of the Collection schedule, cf. fig. 36.
As the user enters the Collection name in the properties, the name is updated in the Central box and in the Object Browser. The user enters the extra properties. 2.2.2.4 Properties of a collection 2.2.2.4.1 Details
The user can enter here certain data that is useful for the script for creating a database, as well as certain information that is useful for the application, cf. fig. 37. 2.2.2.4.2 Users
Security certificates at collection level, see fig. 38. Note that the security rights can change from version to version, and therefore must be linked to a version number. 2.2.2.4.3 Indices
The information is useful for documentation purposes as well as for the script for creating databases, cf. fig. 39. 2.2.2.4.4 Sharding (to be determined) 2.2.2.5 Creating fields in graphical view
The user is then ready to create fields for the Collection in different ways:
Clicking on the Add child button in the toolbar, which brings up the contextual menu,
Or right-click the Schemaoot to display a contextual menu, then click Add Child, then Field, and then click the appropriate Field Type.
By clicking on the 'plus sign' on the schedule box (also for complex objects such as documents and arrays.)
The options available in the contextual menu, cf. Fig. 40, depend on the type of element selected prior to calling the contextual menu. 2.2.2.5.1
Only one root is possible per document. With a view to a NoSQL database scheme, the root element can only be of the following type: document. Properties can be entered for the root element. One or more underlying levels can be created. The options Insert, Attach and Reference are in gray as the functions are not possible for a root element. 2.2.2.5.2 Non-root elements
For any elements other than the route element, 4 options are possible (Add child, Insert, Add and Reference), see fig. 41. The options of the submenu depend on the nature of the selected element. First, let's define the different elements that can be created. 2.2.2.5.2.1 Field
This is the most common element in a schedule. It defines a name-value pair that will be found in the JSON data. The properties define the details and limitations of both the name and value in the pair. As defined by the JSON schema specification, the types available for a field value are: string, numeric, boolean, object (document), array, and zero. With a view to MonoDB, additional BSON types are available: objectID, binary, date, time stamp, regex, JavaScript, JavaScript with scope, symbol, minKey and maxKey.
The nominal case is for an element with only one type. However, the JSON specification allows a field to have multiple types. The UE application can define and save multiple types for a field, cf. Figs. 42 to 45.
The name of a standard field is a fixed string.
The nature of underlying levels is different for document, array or other elements. Documents may have one or more Fields, Pattern Fields and / or Choices as underlying levels. Arrays may have one or more Array items and / or Choices as underlying levels. For all other types of fields, the only underlying level is 'Choice'.
For all elements a 'plus sign' ('+') therefore appears in the right corner of the element box with the 'Schema' label and the option 'Add child' in the contextual menu is enabled. However, if the 'plus sign' is clicked and:
The element is a document, array or choice, then the contextual submenu opens under the Add child level;
The element is of any other type, then a Choice element is created.
For all elements the options appear in the 'Add child' submenu as follows:
If the element is a document, then the Field, Pattern Field, and Choices options are enabled;
If the element is an array, then the Array Item and Choices options are enabled;
If the element is a Choice, then the Options and Sub Schedule options are enabled;
The element is of any other type, then only the Choices option is enabled.
Since it is possible that an element has multiple types, the options available are the sum of the options of each type of an element.
The Insert and Attach functions in the contextual menu work in a similar way, except that while an insert is selected, the insert insert behavior is linked to the type of the parent element.
The properties of each field type are configured in a JSON-Schema file. 2.2.2.5.2.2 Pattern field
Pattern fields work in exactly the same way as standard fields, with the only exception: the name (in the name-worth pair) is not a fixed string, but a regex pattern. This is particularly useful in MongoDB in combination with 'dot notation'. 2.2.2.5.2.3 Choice
In JSON-Schema there are 4 possible choices: "allOf", "one of", "anyOf" and "not". Each of these elements contains an array, with each element of the array representing content that will be compared. The choice of "allOf", "one of", "anyOf" or "not" determines how the validation processor will handle the results of the matches: allOf requires that all elements in the arry be successfully matched. oneOf requires that one, and only one, of the elements in the array be successfully reconciled. anyOf requires that one or more of the elements in the array be successfully reconciled. not requires that no element in the array be successfully reconciled.
Schema definitions can use "allOf", "oneOf", "anyOf" and "not" individually or in combination, which provides significant flexibility for defining elements that have complex definitions or contextual relationships. These choices apply to both fields and field properties.
In both cases, the only possible underlying level of Choice is a Sub-scheme or another Choice. When the 'plus sign' of a Choice element is pressed, a Sub-scheme element is created. Similarly, in the contextual submenu when a Choice element is selected, only the Subplan and Choices options are enabled. And when the underlying level of Choice is selected, the Sub-scheme and Choices options are the only enabled options in the Insert and Attach submenu. 2.2.2.5.2.4 Array item
This is the only possible underlying type of an Array field. Different types of array items are possible for the same parent element. Each can have 0 to n occurrences, but can be limited by the Array's Minitems and Maxltems properties. It has a behavior that is quite similar to that of a standard field, except that the field properties are comparatively limited. This is checked in the Field Prop JSON Schedule in Appendix. 2.2.2.5.2.5 Sub-scheme
If the choice applies to a field, then the sub-schema is a document with all the schema options of a JSON object.
The choice may also apply to an individual field property, in which case it is a simplified scheme with the appropriate property of a field type. For example, a string field could have a format of ipv4 or ipv6. 2.2.2.6 Properties of a field 2.2.2.6.1 Details
The field properties are on the right-hand side in a special compartment. Depending on the field type selected in the previous step, the appropriate property scheme appears in the property box. The field properties are checked by a JSON Schema Field Properties found in attachment.
Vertical and horizontal scroll bars appear when content is larger than the size of the displayed box (variable according to the responsive nature of the application.) Different types of entry are possible in the property box: - Text box - Check box - Selection list: a list of values checked either by the JSON Schema Field Properties (e.g., possible string formats) or by entries in the DB or Collection (e.g., foreign keys, references, or field dependencies.)
In addition, the property field allows you to add other entries with a 'plus sign' icon. With enumerations, for example, the user can enter one entry, then click on the 'plus sign', which means that more entries are allowed underneath.
Multiple entries are also possible with selection lists, such as dependencies.
A bit more complicated is the case of multiple types. In such a case, some fields are common and others are specific. 2.2.2.6.2 References JSON-Schema makes it possible to create reusable definitions, so that they no longer have to be copied in multiple places. The definitions can be about a single field or a more complex structure such as a document. A definition can exist on 3 levels:
Internal of a Collection (and therefore not available for reference anywhere other than in its own collection),
Database model workspace, in which case it is available for any collection of the database,
Externally, i.e. in a public location on the internet,
Definitions are not possible for Choices and Array items. Only for Fields, Pattern Fields and Subschemas. The Reference option in the contextual menu is therefore only enabled for these last 3 elements.
The Reference option in the contextual menu leads to the following actions, cf. fig. 46. 2.2.2.6.2.1 Convert to internal definition
When this option is selected, the system takes the element properties, copies them to the Definitions section of the same JSON Schema Collection, and replaces the element properties with a Reference with the newly created definition. 2.2.2.6.2.2 Convert to DB definition
When this option is selected, the system takes the element properties, copies them to the Definitions section of the JSON Schema database model workspace, and replaces the element properties with a Reference with the newly created definition. 2.2.2.6.2.3 Convert to external definition
The system takes the element properties, copies them to the Definitions section of the JSON Schema that is stored in a central repository and replaces the element properties with a Reference with the newly created definition. 2.2.2.6.2.4 Convert definition to local properties
Once an element refers to a definition, it is possible that the user may want to convert the reusable definition to a local instance, usually for diverging the reusable definition. This action removes the Reference from the element properties and replaces it with a full copy of the previously referenced Definition. The definition remains, since it can be used elsewhere in the DB or collection. 2.2.2.6.2.5 Go to definition
When an element refers to a Definition, the user will be able to view the properties and details of the Definition by choosing this option. 2.2.2.7 Definitions
External definitions cannot be saved in the application. Definitions appear in a separate section in the Object Browser. 2.2.2.7.1 At collection level
Definitions at the collection level are stored in a graphic image similar to that of a complete collection scheme, cf. fig. 47. 2.2.2.7.2 At the database level
Definitions at the database level are stored in a graphic image similar to that of a complete collection scheme, cf. Fig. 48. 2.2.2.8 Field where used and search
In the Object Browser there will be a dynamic tree structure from where a field is referenced: collection and sub-objects, if applicable, and definitions. 2.2.2.9 Roster view
This image offers exactly the same functions as the graphic image, but with a different layout and visualization, cf. Fig. 49. 2.2.2.10 JSON example
A problem with JSON Schema is easy to visualize a JSON document that meets the defined scheme. To that end, the Properties box provides a Sample input box for each field so that the user can register a sample to be used in a JSON document. This image shows, side by side, the JSON data document, and the corresponding JSON Schema as defined in the images Diagram of Rooster, cf. fig; 50.
Changes are allowed in the JSON Schedule box, and validation of these changes is done quickly to provide immediate feedback to the user. Changes made to this print preview are immediately visible in the Diagram and Schedule images, and vice versa. This must be guaranteed by the way in which entries are personified. The JSON data can also be edited. 2.2.2.11 Documentation
The purpose with this tab is to generate a human-readable documentation of the created JSON Schema, as an equivalent implementation for JSON for something that is fairly common for XSD, cf. fig; 51.
The documentation can be exported to RTF, PDF, Markdown and (X) HTML formats, as well as RAML and Swagger. It is available for an individual Collection, or for the full DB. 2.2.2.12 Database script
A document validator was introduced with MongoDB 3.2. The system will provide mapping between JSON Schema Syntax and MongoDB Validator Syntax, cf. Fig. 52. Similar functionality is available to other sellers when appropriate. 2.2.3 Additional functions 2.2.3.1 Open, save, save as, close
These functions will be influenced by the type of storage location and the way in which persistence is performed. Other than that, the characteristics must meet the functions that one would normally expect. 2.2.3.2 Reverse engineering
This advanced attribute takes as input either: a simple JSON document, and generates the schema for that document the current model; a JSON Schema, and fills the properties in the current model; or connects to a NoSQL database and takes representative documents from the collections that are present in the database. It generates the diagrams for the different collections in a workspace.
The user can then edit the model and save it independently of the source, cf. Fig. 6. 2.2.3.3 Export
This advanced feature allows the user to export parts of the model or the entire model, in various formats, cf. fig. 53. 2.2.3.4 Printing
Printing is possible for diagrams, as well as for documentation, JSON documents and schemes. The checks provided for users must be as follows: print setting (the printer and its properties, as well as the paper size, can be selected), print preview (the user can see what it will look like before it is actually printed) , print selection, zoom level and whether objects can be printed over split pages, cf. fig. 54 & 55. 2.2.3.5 Compare models
The purpose of this advanced feature is to provide a side-by-side view of 2 models (presumably 2 different versions of the same model), with different ones being indicated between them. 2.2.3.6 Options
Parameters to be retained by the user usually appear in the course of the design and development of an application, and they increase with the application, cf. Fig. 56.
For the schedule display checks of the DTD diagram, various checks could be included in a JSON Schedule display configuration, cf. Fig. 57. 3. Application to relational databases and denormalisation proposal requirements
In this part the concept of "denormalization" that was mentioned earlier in part 2.2.2.2.2 is further explained. In this document, the term "denormalization" refers to any conversion from a source database model to a target database model where nesting occurs.
In particular, this part describes an embodiment of the present invention wherein a relational source database model is converted into a target database model according to the present invention. The strange keys that are present in the relational source database model are used for nesting. The target database model is obtained in particular by nesting linked foreign tables.
The concept of denormalization illustrates the wide range of applicability of the present invention. While the applicability above was mainly illustrated in the case of NoSQL databases and REST APIs, relational databases and RDMS and all related aspects are the same in the scope of the present invention. In a preferred embodiment, this is done by "reverse engineering", similar to the functionality explained above.
In an alternative embodiment this is done by "forward engineering". In yet another alternative embodiment, this is done by working directly on a relational database and RDMS.
Although denormalization is presented in this document as an embodiment of the present invention, it can also be considered as a stand-alone concept and related invention. The graphical user interface used in this section specifically depicts only one embodiment of the concept of denormalization. In an alternative embodiment, denormalization as described in this document permits the generation of a target database model from a source database model without requiring a graphical user interface in any step of the generation method. 3.1 Introduction
Assuming that the function 'Reverse engineering of DDL file' is available, a requested additional function is to propose a denormalized model based on the original relational model.
This document describes the process of taking a source database model (created from the beginning or by reverse engineering of a DDL file or RDBMS) stored with a method according to the present invention, and generating a new model while connecting foreign tables are nested. After that process the user can further edit and refine the denormalized model. 3.2 Concepts
Starting with 2 'flat tables' in the source model, see fig. 58, with a foreign key relationship illustrated in fig. 59, there are at least two ways and even 3 ways in which denormalization can be seen, given in section 3.2 respectively .1-3. 3.2.1 Subdocument of underlying table containing strange parent table structure
As a flat table in a normalized source model is copied to a collection in a denormalized target model, if a foreign key relationship is encountered, the structure of the parent table is included as a nested subdocument of the collection.
The expected result of the new function is illustrated in Fig. 60.
The parent table structure is embedded here as a sub-document structure in the underlying table. Foreign keys and foreign-master relationships are automatically created in the embedding process. 3.2.2 Array in parent table containing child table structure
Another way to perform denormalization is to create, in the upper table, an array with the structure of the underlying table, with the expected result illustrated by Fig. 61.
Notes: 1) The underlying field with the original relation is not repeated in the parent table array, so as not to create cross-references. 2) The parent-child relationship is reversed. Now the parent of the fields in the newly created array are the fields in the former child table. 3) Consequently, the ... cardinality is in the parent, see Fig. 62. Note how the field name would benefit from dot.notation to represent nesting. 3.2.3 Combination
When the 2 logics are combined, the following result is generated, illustrated by Fig. 63. Depending on how data is read, all 3 of the above methods for denormalization are possible. 3.2.4 Cascading
When an upper table itself is the lower table of another, the result illustrated in FIG. 64 is obtained. It may be desirable to represent the relationships in stages in multiple levels of nesting, thereby achieving the result obtained in FIG. 65. (It will be good to be able to rearrange relationships to reduce elbow lines ...)
And of course the mirrored cascading array is possible, with the result illustrated in Fig. 66.
Since this could continue for a long time in complex models, it is probably a good idea to limit the number of cascades, and to let the user decide on the number of cascades between 0 and a maximum of, for example, 3. It is of course easier to have an external cascade then delete one manually. 3.2.5 Recursion
If there is a relationship between fields from the same table, one can again embed an underlying array in the parent, or embed an parent subdocument in the child. However, it is not proposed here to do this.
Finally, there is no indication of how many times cascade would be necessary without looking at the actual data.
The relational table shown in Fig. 67 can be cast into a model in any of the 2 ways illustrated in Fig. 68. 3.2.6 Multiple paths
In the case shown in Fig. 69, the order in which tables are normalized and converted can be important, since taking the path airports> areas> countries will provide richer but more hierarchical information. 3.2.7 Cross-references
Cross-references must be detected and avoided. 3.3. Process 3.3.1 Menu access
Menu access is illustrated on fig. 70. 3.3.2 Save current model
The display shown in Fig. 71 is shown when changes have been made, so the denormalization proposal can be stored in an empty model. 3.3.3 Selection and parameters
The user can select all tables or specific tables to be denormalized, as well as set certain parameters to be used during the model's denormalization process. This is illustrated by fig. 72. 3.3.4 Execution
The process will include at least the following steps: - Copy tables from source (normalized) to target set (denormalized) - Select first relation in source o Embed in destination model according to selected parameters
If array in parent collection: • Attach array items with type = document • Copy child tree in document array item, except relationship field
If subdocument in underlying collection: • Replace relationship field with • Collection made from copied parent structure o Create foreign key and foreign master relationships o Iter according to Cascade and Recursion parameters o Go to next relationship and loop EXAMPLE 2
Example 2 is illustrated by Figures 73 to 77. Figure 73 shows an exemplary problem with a result according to the present invention. Figures 74 to 77 show enlarged parts of Figure 73. Figure 74 first shows the "JSON document to model". Furthermore, Figure 75 shows the "Modeling with traditional Er software", while Figures 76 and 77 together show the "Modeling according to the present invention" of the exemplary problem. The problem investigated in this example is how to build and / or edit the part of the JSON document code given here in Figure 74, supported by a GUI. As will be appreciated by those skilled in the art, said portion of the code defines a simple JSON document with respect to an individual, including both top-level information regarding, ie, the fields with field names "_id", "username". , "contact", "access" and "status", as nested information, ie two nested objects, respectively with respect to the field names "contact" and "access". A result according to the state of the art is given on figure 75, with an Entity Relationship Diagram (ERD) comprising three sets. This is in contrast to the ERD obtained with an embodiment according to the present invention and shown in Figure 76, supplemented with a tree structure shown in Figure 77.
In general, the present invention allows the original JSON document structure to be represented by structure (fewer collections, only "real" relationships) (especially in the ERD), sequence retention (both in the ERD and the tree structure), interaction ( as in the ERD as the tree structure), and indentation (representing nesting) (in particular in the ERD), while methods according to the prior art do not. The result of the present method is therefore not only true to the corresponding JSON document definition, but is also more compact in terms of ERD, allowing a clearer and better overall overview. In addition, the interactive editing of the database schema model is also more flexible, with richer interaction thanks to the choice between the ERD and the tree structure. This is discussed in more detail below.
First, note how the database schema model built with a prior art method includes a first set for said upper-level information as well as a second and third set for each of said two objects. This is a first problem with a method according to the prior art, since the separation of information between three separate objects makes the unity of the underlying information fade. It makes no sense to remove the aforementioned two objects from their original context. This may mislead a user of the method by thinking that the information regarding "contact" and "access" is somewhat "less closely related" to the said individual, as opposed to, e.g., "username", that is "more related" "appears with the said individual. This difference is a completely wrong conception, since "contact", "access", etc. "username" actually belong to the same top level in the original JSON document. This incorrect conception can lead to serious errors in the construction and processing of databases. In contrast, the present invention retains the original context, with "contact", "access" and "username" all at the same level, both in the ERD illustrated in Fig. 76 and the tree structure illustrated in Fig. 77.
A second related point relates to the preservation of hierarchy and the related interaction that is triggered by a result according to the present invention. While the fields belonging to the two "contact" and "access" objects are inevitably visible in Fig. 75 (prior art result), more flexibility is allowed in Fig. 76 (according to the present invention), with collapsible items, indicated by an icon related to collapse, ie A square with a "-" sign in when the object content is visible (as shown), and a square with a "+" sign in when the object is collapsed (not shown). This has the advantage that it allows a very compact display, with all objects of the upper level being shown, but not the tails, which promotes usability. In addition, it creates an additional way for the user to interact with the database schema model, allowing him / her to become familiar with a database schema model and / or explaining it to others by clicking on the icon related to collapse / touching. Moreover, even in non-interactive displays (such as printed documentation), end-user insight is enhanced as the ERD of Fig. 76 is supplemented by the tree structure of Fig. 77. This allows a complementary approach, e.g., by collapsing a collapsed image. show for the ERD (ie without displaying the content of objects), supplemented with the tree structure that shows all the details.
Another advantage of the present invention is the retention of sequence. Both the ERD and the tree structure display the original sequence under field names at each level, as defined in the JSON document. For the result according to the prior art, this sequence is not retained: as will be apparent from Fig. 75, it can no longer be determined whether, for example, "contact" was in first, second, third, fourth or fifth position in the list on upper level. This sequence blurring leads to incomplete interaction with the database schema model, another problem solved by the present invention.
Another advantage of the present invention is the targeted use of indentation in the ERD. As can be seen in Fig. 76, fields belonging to an object are indented with respect to the object name, which serves at least two purposes. On the one hand, it serves to easily embed the contents of the object in a visible manner. On the other hand, it mimics the typical layout of JSON document code as illustrated in Fig. 74. A user with previous experience with JSON will therefore have fewer problems understanding the ERD of the present invention, and is therefore less inclined to make mistakes in his / its interaction with the database schema model.
In addition, note the presence of two "relationships" in the database schema model created with a prior art method, both of which look like RDBMSes, but in fact relate to "nesting relationships," which are the unity of the underlying information. By sticking to a "fat-table-only" representation, the user who creates database schema models with the method according to the state of the art is confronted with a huge entity relationship diagram, in which no overview is offered. This is especially true if it relates to a database schema model comprising "real" relationships such as foreign-key relationships, such as the foreign-key relationship that connects two sets in Figure 58. In a prior art method, the denormalization of a relational database schema model as described in this document lead to a dramatically complicated ERD, with a confusing mix of nesting relationships and "real" relationships. In contrast, the method according to the present invention creates relationships only when necessary, ie none in the case of the present example, and only a limited number in the case of Fig. 58, illustrated e.g. on Figs. 60 and 61. The user interacting with a database scheme model according to the present invention therefore retains a better overview of the relevant relationships, so that fewer errors are made on average. EXAMPLE 3
FIG. 78 and 79 show an example regarding reference and denormalization. FIG. 78, in particular, shows an exemplary interface that complements the interface on FIG. 72 introduced in section 3.3.3 of Example 1, allowing the user to select all sets (tables) or specific sets (tables) to be denormalized, now with choose the additional two-way reference option. The difference between referral and embedding is as follows. With embedding, when an array is normalized to an upper level (and also for that side when "both" is selected), an array is created that consists of a subdocument with all fields of the underlying level. Reference, on the other hand, meets the need to bring only the keys to the higher level, and not all associated fields. This means that on the upper side there would only be a simple array of foreign keys that refer to the primary key of the underlying. In the corresponding ERD according to the present invention, only an array of simple fields is shown. This is illustrated by Fig. 79. EXAMPLE 4
FIG. 80 and 81 respectively show an example of ERD with associated tree structure according to an embodiment of the present invention. Together, these figures illustrate the ability of the present invention to cope with an important and useful feature of JSON as applicable to NoSQL and Big Data, namely polymorphism. Polymorphism refers to the ability to cope with evolving and flexible schemes, both at the level of the general document structure and at the level of the type of a single field. This is known as 'schema combination', and can be represented in JSON Schema with the use of sub-schemes and keywords; anyOf, allOf, oneOf, not. FIG. 80 and 81 illustrate an ERD and associated tree structure comprising a field that evolves from just one string type, to a subdocument, or to the simultaneous existence of both field types. Methods according to the prior art have difficulty dealing graphically with such sub-schemes, while the representation on Figs. 80 and 81 is clear and self-evident.

权利要求:
Claims (15)
[1]
CONCLUSIONS
A method for building a database schema model by a user by means of a computer, comprising the steps of: (a) providing a set of sets and optionally one or more relationships connecting at least two of said set of sets; (b) editing one or more of said set of collections, each of which is associated with a scheme definition represented by a single table on an entity relationship diagram on a graphical interface on said computer and comprising at least one object and / or one field, said schematic definition being editable via a tree structure on said graphical user interface on said computer; (c) automatically generating through said computer the said database schema model for a database system or a REST API; characterized in that said set of sets comprises at least one set comprising a nested object that can be edited via said two-level or more tree structure.
[2]
Method according to the preceding claim 1, characterized in that said series of sets comprises at least two sets; that step (b) comprises generating and / or editing one or more relationships connecting at least two sets that belong to said set of sets via said entity relationship diagram on said user interface on said computer; and that at least one of said one or more relationships connects a first field that belongs to a set of sets that belongs to said set of sets to a second field that belongs to a second set that belongs to said set of sets.
[3]
Method according to one of claims 1 to 2, characterized in that said schematic definition is a "JavaScript Object Notation" (JSON) schema definition.
[4]
Method according to one of claims 1 to 3, characterized in that said one or more objects comprise an array and / or a subdocument.
[5]
A method according to any one of claims 1 to 4, characterized in that said database schema model is formatted according to any or any combination of the following formats: JSON Schema, YAML, Mongoose, RAML, Swagger, Apache AVRO, Parquet.
[6]
The method according to any of claims 2 to 5, characterized in that said at least one of said one or more relationships is a foreign key relationship; and / or that said first set and said second set are connected by more than one relationship generated and / or edited in step (b); and / or that said first field belonging to said first set is connected to both said second field and a third field belonging to said second set by a first and second relationship, respectively.
[7]
The method according to any of claims 2 to 6, characterized in that said automatic generation in step (c) automatically generates a human-readable manual for said database schema model which is based at least in part on a logical link defined by the one or more relationships mentioned.
[8]
A method according to any one of claims 1 to 7, characterized in that said provision in step (a) comprises generating one or more of said set of collections on said graphical user interface on said computer.
[9]
A method according to any one of claims 1 to 8, characterized in that said providing in step (a) comprises denormalizing a relational database schema model provided by the user, said denormalizing comprising extracting said set of collections from a series of tables belonging to said relational database scheme model, as well as any foreign-key relationships that belong to said relational database scheme model, for inclusion in said database scheme model.
[10]
A method according to the preceding claim 9, characterized in that said denormalization in step (a) further comprises processing a foreign key relationship between an upper set comprising a primary key and an underlying set comprising a foreign key, wherein the said processing comprises any or any combination of the following: including a nested subdocument in said child collection, said nested subdocument comprising one or more relationships in said parent collection; including a nested array in said parent set, said nested array comprising one or more relationships in said child set.
[11]
A method according to the preceding claim 10, characterized in that said processing of a foreign key relationship comprises selecting one or more denormalization preferences on said graphical user interface on said computer, said denormalization preferences having any or any combination of the following: a table selection; an embedding preference with respect to said incorporation of said nested subdocument into said child collection and / or said nested array in said parent collection; a maximum number of stacked levels, a maximum number of recursion levels.
[12]
A computer-implemented method for building a database schema model upon user input according to a method according to any of the preceding claims 1 to 11, comprising the following steps: (i) receiving a series of collections and optionally one or more relations linking at least two of said set of sets, optionally upon entry of said user via a graphical user interface; (ii) receiving processing instructions for each of said set of sets regarding a scheme definition upon input of said user via a tree structure on said graphical interface; (iii) generating, upon request of said user, said database schema model for a database system or a REST API; characterized in that said set of sets comprises at least one set comprising a nested object that can be edited via said two-level or more tree structure.
[13]
A computer-implemented method according to the preceding claim 12, characterized in that said set of sets comprises at least two sets; that step (ii) comprises receiving instructions for generating and / or editing one or more relationships connecting at least two sets that belong to said set of sets upon entry of said user via said entity relationship diagram on said user interface; and that at least one of said one or more relationships connects a first field that belongs to a set of sets that belongs to said set of sets to a second field that belongs to a second set that belongs to said set of sets.
[14]
Use of a database schema model built with a method according to any of the preceding claims 1 to 11 in a database system or a REST API, wherein said database system preferably relates to any or any combination of the following: MongoDB, Couchbase, CouchDB, Amazon DynamoDB, RavenDB, Microsoft Azure DocumentDB, MarkLogic, EnterpriseDB, Oracle SQL, MySQL, PostgreSQL, Microsoft SQL, Oracle for NoSQL, ElasticSearch, Snowflake, FinchDB, MariaDB, IBM Cloudant, Google Cloud Datastore, Cassandra, BerkeleyDB, RethinkDB, Mapr.
[15]
A computer system for building a database schema model upon user input, wherein said computer system comprises a processor, a non-volatile memory, program code on said memory for execution on said processor, a graphical interface, said computer configured to perform a method according to any of the preceding claims 12 and 13.

类似技术:

公开号 | 公开日 | 专利标题

BE1024144B1|2017-11-22|IMPROVED CONSTRUCTION OF DATABASE SCHEME MODELS FOR DATABASE SYSTEMS AND REST APIs

US9542622B2|2017-01-10|Framework for data extraction by examples

US7966356B2|2011-06-21|Apparatus and methods for displaying and determining dependency relationships among subsystems in a computer software system

US8479093B2|2013-07-02|Metamodel-based automatic report generation

US8438534B2|2013-05-07|Transformation of data between hierarchical data formats

US20050203869A1|2005-09-15|Hierarchical database apparatus, components selection method in hierarchical database, and components selection program

US8341191B2|2012-12-25|Methods and structures for utilizing reusable custom-defined nestable compound data types to permit product variations within an existing taxonomy

JP5147952B2|2013-02-20|System model conversion method, computer program, and system model conversion apparatus

US8510341B2|2013-08-13|System, method and structures for a reusable custom-defined nestable compound data type for construction of database objects

US20110161918A1|2011-06-30|Applying rules to data

US9495475B2|2016-11-15|Method of representing an XML schema definition and data within a relational database management system using a reusable custom-defined nestable compound data type

JP2007087216A|2007-04-05|Hierarchical dictionary preparing device, program and hierarchical dictionary preparing method

EP2107457B1|2011-10-12|Automatic software configuring system

US20100131565A1|2010-05-27|Method for creating a self-configuring database system using a reusable custom-defined nestable compound data type

US8732596B2|2014-05-20|Transformation of hierarchical data formats using graphical rules

US20050256695A1|2005-11-17|Creating visual data models by combining multiple inter-related model segments

US20110161917A1|2011-06-30|Processing collections of data items

US20010014899A1|2001-08-16|Structural documentation system

US20110161934A1|2011-06-30|Generating and monitoring data items

de la Vega et al.2020|Mortadelo: Automatic generation of NoSQL stores from platform-independent data models

JP5090481B2|2012-12-05|Data modeling method, apparatus and program

Wojszczyk et al.2017|The process of verifying the implementation of design patterns—used data models

US20050273721A1|2005-12-08|Data transformation system

US20060287977A1|2006-12-21|Method of processing data for a system model

JP2010039751A|2010-02-18|Software development system

同族专利:

公开号 | 公开日

BE1024144A1|2017-11-21|

EP3423957A1|2019-01-09|

US11100059B2|2021-08-24|

WO2017093576A1|2017-06-08|

US20190073388A1|2019-03-07|

US20210334250A1|2021-10-28|

引用文献:

公开号 | 申请日 | 公开日 | 申请人 | 专利标题

US20100131565A1|2008-11-21|2010-05-27|Sap Ag|Method for creating a self-configuring database system using a reusable custom-defined nestable compound data type|

US20120005241A1|2010-06-30|2012-01-05|Ortel Jeffrey R|Automatically generating database schemas for multiple types of databases|

EP2827262A1|2013-07-18|2015-01-21|Ims Health Incorporated|System and method for modelling data|

US5596746A|1991-10-21|1997-01-21|General Electric Company|Method for transforming relational data base schemas into object models using ideal table meta models|

US5937410A|1997-10-16|1999-08-10|Johnson Controls Technology Company|Method of transforming graphical object diagrams to product data manager schema|

US6665677B1|1999-10-01|2003-12-16|Infoglide Corporation|System and method for transforming a relational database to a hierarchical database|

US20060173873A1|2000-03-03|2006-08-03|Michel Prompt|System and method for providing access to databases via directories and other hierarchical structures and interfaces|

US10311110B2|2015-12-28|2019-06-04|Sap Se|Semantics for document-oriented databases|US7752266B2|2001-10-11|2010-07-06|Ebay Inc.|System and method to facilitate translation of communications between entities over a network|

US8078505B2|2002-06-10|2011-12-13|Ebay Inc.|Method and system for automatically updating a seller application utilized in a network-based transaction facility|

US10783153B2|2017-06-30|2020-09-22|Cisco Technology, Inc.|Efficient internet protocol prefix match support on No-SQL and/or non-relational databases|

US11194798B2|2019-04-19|2021-12-07|International Business Machines Corporation|Automatic transformation of complex tables in documents into computer understandable structured format with mapped dependencies and providing schema-less query support for searching table data|

US10719490B1|2019-12-19|2020-07-21|Capital One Services, Llc|Forensic analysis using synthetic datasets|

US11086829B2|2020-01-02|2021-08-10|International Business Machines Corporation|Comparing schema definitions using sampling|

US10908593B1|2020-06-08|2021-02-02|Factory Os, Inc.|Systems and methods for fabrication of a volumetric module|

CN112114794B|2020-09-27|2021-11-09|深圳天玑数据有限公司|Automatic generation method and device of website application program and computer storage medium|

CN112100247B|2020-11-12|2021-02-02|耀方信息技术有限公司|Method and system for querying data by using ElasticSearch|

法律状态:
2018-02-12| FG| Patent granted|Effective date: 20171122 |

优先权:

申请号 | 申请日 | 专利标题

EP16158241|2016-03-02|

EP16191695.2|2016-09-30|

EP16191695|2016-09-30|

[返回顶部]