An Object-Relational Mapper (ORM) is a library that handles the conversion between domain models and database entities, and vice versa. While ORMs are designed to simplify this process, they aren't always efficient in every scenario. This can sometimes lead to performance bottlenecks and a subpar user experience. In this blog post, we’ll explore common challenges associated with ORMs and discuss strategies to overcome them.
ORMs and What They Are About
ORM libraries are responsible for handling a variety of tasks behind the scenes. They take care of translating the domain model and the entity model in the application, transactions, generating queries, or managing the connections. Due to their complexity, they can fail in numerous ways. Let’s see how.
Application and Domain Models
Mapping domain objects to ORM logic is usually done via annotations or attributes. Each field in the object is annotated with metadata that describes how it should be represented in the database, including column types, keys, and table relationships. These annotations can be placed directly on domain objects or on a separate entity model. Using a distinct entity model is recommended, as it decouples the domain model from the persistence infrastructure.
However, mapping domain models to entity models is not straightforward, a challenge known as the Object-Relational Impedance Mismatch. Common issues include:
- Incompatible columns: Some database types don't map neatly to object-oriented programming (OOP) types, and vice versa. For example, text data may have differences in encoding, collation (alphabetical order), or comparison rules (case sensitivity).
- 1-to-1 and 1-to-many relationships: These can cause complications with lazy and eager loading, often leading to inefficient queries or n+1 query issues.
- Inheritance: Representing inheritance hierarchies in a database can be done in several ways (e.g., table per class, table per concrete class, or table per hierarchy). Performance depends on the approach, as it affects how joins and filtering are handled.
Tables and Joins
Parsing query results in an ORM library is often more complex than it appears, especially when working with nested collections and related entities. Consider a scenario where data for an entity is distributed across multiple tables. To construct a complete business aggregate, a single query may need to join all these tables. This can lead to an exponentially growing number of rows in the result set, significantly complicating processing and potentially impacting performance.
Fast Queries
Generating an efficient query may be challenging. One example is the use of Common Table Expressions (CTEs). While they improve the readability of the code, they may result in longer execution times. Similarly, ORMs can deal with complex joins and subqueries, but they may result in barely readable code.
Transactions and Isolation Levels
ORMs are responsible for managing transactions and initiating and completing them as necessary. However, transactions operate under isolation levels, which can impact query results and overall performance.
Different databases use varying default isolation levels. MySQL uses REPEATABLE READ, PostgreSQL goes with READ COMMITTED, and DB2 defaults to Cursor Stability. Most ORMs adhere to the database's default isolation level, meaning application behavior can vary when switching between databases.
However, ORMs can override default isolation levels through configuration settings, which might be buried in documentation or vary between ORM versions. Some libraries may not use transactions at all, leading to potential inconsistencies. For example, with READ COMMITTED isolation, duplicate records can occasionally occur. These complexities highlight the need to understand both the database and ORM configuration to ensure consistent behavior across environments.
How to Deal with ORMs
To tame the complexity, we need to have proper observability in place. It’s not enough to use unit tests to check if the code does what we expect it to do. We need to check if it’s fast enough and if it will scale well when deployed to production.
To capture all of that, we need to have an observability solution that can capture queries, assess their performance, and predict how they will behave when facing increased load in production. Metis does all of that and is the only solution that can protect developers from deploying faulty code to production.
Summary
ORM libraries streamline many aspects of application development but also introduce additional layers of complexity. This can lead to performance issues and new challenges if not properly managed. Developers must understand the implications of using ORMs in their codebase. Tools like Metis can significantly enhance observability and aid in troubleshooting these issues.