Database Review Guide
Database
Deletion is only allowed when there are no tables in the database, which can greatly avoid accidental deletion. Suggested error level: Error
System
The character set determines which characters can be stored in the table. Using the wrong character set may result in certain characters in the application being unable to be stored and displayed correctly, such as CJK and Emoji. Suggested error level: Error
Allow list: utf8mb4The collation determines the rules for character comparison and sorting. For example, when using a case-insensitive collation, "ABC" and "abc" will be treated as the same string in queries. Suggested error level: Error
Allow list: utf8mb4_0900_ai_ci- Maximum comment length: 64
Column
Some common columns are helpful for better application maintenance. For example, adding a business-independent "ID" column as the primary key avoids primary key conflicts caused by business changes (such as business mergers), and in some scenarios can also bring better data insertion performance. Suggested error level: Warning
Required column names: id, created_ts, updated_ts, creator_id, updater_idSome common columns are helpful for better application maintenance. For example, adding a business-independent "ID" column as the primary key avoids primary key conflicts caused by business changes (such as business mergers), and in some scenarios can also bring better data insertion performance. Suggested error level: Warning
Required column names: IDSome common columns are helpful for better application maintenance. For example, adding a business-independent "ID" column as the primary key avoids primary key conflicts caused by business changes (such as business mergers), and in some scenarios can also bring better data insertion performance. Suggested error level: Warning
Required column names: IDAbusing column types can have serious negative effects on system maintainability and performance. For example, using "LOB" column to store large amounts of audio and video data may cause database performance to decrease, backup and recovery times to lengthen, and data synchronization tools incompatible. Suggested error level: Error
Disallow list: JSONColumns cannot have NULL value.
Modifying column types may affect system performance, maintainability, and even lead to data loss. Suggested error level: Warning
For a 'NOT NULL' column, if a value is not assigned to the column when inserting a new row and the column does not have a default value, the database will reject the insertion of that row. Setting a default value for a new column can also ensure compatibility with legacy application. Suggested error level: Error
"CHANGE COLUMN" is unique to MySQL syntax and can be used to modify column names and other properties at the same time. However, it may cause the column name to be mistakenly changed when modifying properties. It is recommended to still use standard "RENAME" and "MODIFY" statements to distinguish between the two types of changes. Suggested error level: Error
Modifying the order of columns may cause some applications or views that depend on the default order of the original table to produce unexpected results, such as "select *". Suggested error level: Warning
MySQL's auto-increment column are generally used as business-independent primary key. Using integer types occupies less storage space and makes the primary key index structure more compact, bringing better query and DML performance. Suggested error level: Error
It is recommended to set the charset at the database level or table level. Setting the charset at finer granularity can bring unnecessary complexities. Suggested error level: Error.
Unsigned types do not store negative numbers, and the range of values that can be stored by the same type is doubled, which can avoid auto-increment columns overflow. Suggested error level: Warning
Adding comments to columns is a good development practice, but excessively long comments can decrease the readability of the schema.Suggested error level: Warning
Require comment: trueMax length: 64"CHAR" is a fixed-length type. For example, the CHAR(20) column will occupy 20 character spaces even if only one character is stored, causing waste. When the string is too long and the length is not fixed, consider using VARCHAR for MySQL and using TEXT for PostgreSQL. Suggestion error level: Error
Maximum length: 20- Maximum length: 2560
based on management requirements to limit the initial value of the auto-increment column. Suggestion error level: Warning
Initial value: 1Only columns recording the creation time of the record with "DEFAULT NOW()" and recording the update time of the record with "DEFAULT NOW() ON UPDATE" need to call function to get system time. It is meaningless and will increase resource overhead to record system time in other columns. Suggestion error level: Error
Setting default values that satisfy business logic can effectively improve the data quality of downstream analytical pipeline. This rule does not check "PRIMARY KEY", "JSON", "BLOB", "TEXT", "GEOMETRY", "AUTO_INCREMENT", "GENERATED" types. Suggestion error level: Warning
Schema
Some changes may affect running applications, such as modifying the name of database object, adding new constraints, etc. This rule can avoid careless changes that lead to the failure of existing application. Suggestion error level: Warning
Table
In addition to carry business meaning, primary key are also beneficial for high-concurrency queries in MySQL. Various data synchronization, comparison, and rollback tools often require tables to have primary key. Suggestion error level: Error
The advantages and disadvantages of foreign key are highly controversial. Using foreign key may significantly increase the difficulty of database changes, scalability (such as sharding), etc. And may even prevent the use of some tools. Therefore, another option is to implement foreign key constraints at the application layer. Suggestion error level: Warning
For example, by requring the "_del" suffix, it can effectively prevent accidental deletions. Suggestion error level: Error
Table name format (regex): _del$In some database engines, partitioned tables are not mature, and the use and maintenance are inconvenient. Therefore, it is more inclined to use manual data partitioning methods such as database and table sharding. Suggestion error level: Warning
Configure whether the table requires comments and the maximum comment length.
Require comment: trueMax length: 64
Statement
SELECT * to fetch entire row data may cause unnecessary resource overhead and may also cause unexpected results in applications once the table adds or removes columns. Suggestion error level: Error
Queries without WHERE clause may cause huge uncessary resource overhead, and DMLs may cause massive accidental data loss. Suggestion error level: Error
When using leading wildcard, such as "LIKE '%ABC'", the database optimizer cannot use fast index scan, and fallback to full table scan or full index scan, which may cause serious performance impact. Suggestion error level: Error
In some cases, multiple statements are required to be included in a transaction committed by the system, in order to quickly rerun in case of partial failure. Therefore, explicit "COMMIT" is not allowed. Suggestion error level: Warning
If LIMIT is used in DML statements without an ORDER BY clause, the affected rows order are not fixed, which may cause data inconsistency between the primary and replica databases in some replication modes. Suggestion error level: Error
Sorting operations are extremely resource-intensive, so for update and delete operations, it is recommended to use a deterministic filtering condition as much as possible instead of using ORDER BY and LIMIT. Suggestion error level: Error
Every change to a table may cause a table-level lock and consume a large amount of resources. If there are multiple changes to the same table, they should be merged into a single change statement. Suggestion error level: Error
The "INSERT INTO table VALUES (...)" statement does not explicit list column names. Once the column order changes or columns are added or dropped, the statement may faile or generate unexpected data. Suggestion error level: Error
Randomly sorting the data to be inserted is meaningless and will only consume uncessary resources. Suggestion error level: Error
Reveal the number of rows to be inserted can help determine whether the statement meets business expectations. Suggestion error level: Warning
Maximum insert amount: 1000Reveal the number of rows to be updated or deleted can help determine whether the statement meets business expectations. Suggestion error level: Warning
Maximum affected rows: 1000When the syntax is correct, but the table name is incorrect or the permission is insufficient, it can be discovered by dry run before the actual execution. Suggestion error level: Warning
Before PostgreSQL 11, adding a column with a default value cause table locking and unable to read and write, which may cause business interruption. In PostgreSQL 11 and above, this issue has been optimized and there is no need to pay attention to this rule. Suggestion error level: Warning
Before PostgreSQL 11, adding a CHECK constraint need to verify the existing data and lock the table. This blocks read and write, which may cause business interruption. It is recommended to add the "NOT VALID" option to validate new data and manually validate existing data after the change is completed. In PostgreSQL 11 and above, this issue has been optimized and there is no need to pay attention to this rule. Suggestion error level: Warning
Before PostgreSQL 11, adding a NOT NULL constraint need to verify the existing data. This blocks read and write, which may cause business interruption. In PostgreSQL 11 and above, this issue has been optimized and there is no need to pay attention to this specification. Suggestion error level: Warning
Naming
The default format is all lowercase letters, separated by underscores between words, and no more than 63 characters long, such as "abc" and "abc_def". Suggestion error level: Warning
Table name format (regex): ^[a-z]+(_[a-z]+)*$Length limit: 63The default format is all lowercase letters, separated by underscores between words, which is no more than 63 characters long, such as "abc" and "abc_def". Suggestion error level: Warning
Column name format (regex): ^[a-z]+(_[a-z]+)*$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "uk_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "uk_my_table_id_name". Suggestion error level: Warning
Unique key name format: ^$|^uk_{{table}}_{{column_list}}$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "pk_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "pk_my_table_id_name". Suggestion error level: Warning
Primary key name format: ^$|^pk_{{table}}_{{column_list}}$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "idx_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "idx_my_table_id_name". Suggestion error level: Warning
Index name format: ^$|^idx_{{table}}_{{column_list}}$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "fk_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "fk_my_table_id_name". Suggestion error level: Warning
Foreign key name format: ^$|^fk_{{referencing_table}}_{{referencing_column}}_{{referenced_table}}_{{referenced_column}}$Length limit: 63The default column name is "ID", and is no more than 63 characters long.
Auto-increment column name format (regex): ^id$Length limit: 63- Upper case: true
Index
Creating an index with duplicate columns will result in failure. Suggestion error level: Error
The "BLOB" type is usually used to store binary data and should not be used as a query condition. If an index is created on this column type by mistake, it will consume a lot of resources and cause serious performance impact. Suggestion error level: Error
Enforce the primary key type to be INT or BIGINT.
A composite index with over 5 columns does not significantly improve query performance, but it occupies a lot of space and reduces DML performance. Suggestion error level: Warning
Maximum column count: 5Although indexes can improve query performance, they also occupy a lot of space and reduce DML performance. Therefore, it is not recommended to create more than 5 indexes in a table. Suggestion error level: Warning
Maximum index count: 5The appropriate primary key type can optimize storage structure, reduce space usage, and beneficial for insert and query performance. Suggestion error level: Warning
Allow list: SERIAL, BIGSERIAL, INT, BIGINTIn PostgreSQL 11 and above, using the standard statement to create an index will cause table locking and unable to write. Using the "CONCURRENTLY" mode can avoid this problem. Suggestion error level: Warning
Engine
InnoDB is the default storage engine for MySQL that provides transaction support. It also provides better performance for high-concurrency and low-latency scenarios, and supports online data backup and recovery. It is the preferred choice for OLTP businesses. Suggestion error level: Error