MHR API: Boosting Account Registrations Performance

by Alex Johnson 52 views

Unpacking the Challenge: Why MHR Account Registrations are Slow

The MHR API plays a crucial role for BC Government staff, especially when managing account registrations. However, we've hit a bit of a snag: the performance of the /mhr/api/v1/registrations endpoint has been experiencing significant performance degradation, particularly for staff members who need to view all registrations without applying any specific filters. Imagine trying to get a complete overview, and your system takes an eternity! This isn't just an inconvenience; it directly impacts BC Gov staff efficiency and, ultimately, the service provided to citizens.

At the heart of the issue lies the growing complexity of the MHR account registrations table. Over time, as new properties and data points were added to meet evolving requirements, the underlying database queries became more cumbersome. What was once a swift operation has now slowed down considerably. We're talking about queries that take anywhere from 7 to 10 seconds in production environments. To put that in perspective, that's like waiting for a web page to load for almost a quarter of a minute just to see a list of registrations. When you break it down, even retrieving basic properties alone can take around 3.1 seconds, and adding most other details (excluding owner names) pushes that time to a staggering 5-6 seconds. This kind of delay can really add up throughout a workday, leading to frustration and reduced productivity for our dedicated staff.

The problem isn't with the initial subquery that identifies which MHR numbers to include; that part is still acceptably fast, clocking in at around 0.1 seconds. The real bottleneck emerges when the system tries to pull together all the derived information from various sources and complex joins to present a complete picture of each registration. This reliance on dynamically calculated values from a database view, especially without specific filters to narrow down the data, means the system has to do a lot of heavy lifting every single time. It's a classic case where what initially seemed like a flexible database design has, over time, transformed into a performance bottleneck as data volume and query complexity increased. Understanding this root cause is the first critical step toward implementing a robust and lasting solution that will improve the MHR API performance for everyone involved.

Diving Deep into the Performance Bottleneck: The mhr_account_reg_vw View

To truly understand why our MHR API endpoint for account registrations is slowing down, we need to take a closer look at the mhr_account_reg_vw – that's our database view. Think of a database view as a virtual table that presents data from one or more actual tables, often simplifying complex queries. While views are incredibly useful for abstracting complexity and enhancing security, they can also become a performance bottleneck if not carefully managed, especially when they involve extensive joins, calculations, or aggregations on large datasets. In our case, this view is where much of the derived registration summary data is computed on the fly every time a request comes in.

The core issue stems from the fact that the view dynamically calculates and aggregates several key properties for each registration whenever the /mhr/api/v1/registrations endpoint is accessed without filters. These properties, while essential for staff, are computationally intensive to gather. We're talking about values like submitting_name, which identifies who submitted the registration; owner_names, which lists all the owners associated with an MHR number; and registering_name, indicating the registering party. Each of these often requires navigating through multiple related tables to stitch together the correct information. Imagine the database performing several mini-searches and calculations for every single registration record in the system – that’s a lot of work!

Beyond just names, other properties like last_doc_type (the type of the most recent document), note_status and note_expiry (details about any associated notes), cancel_doc_type (if the registration was cancelled), frozen_doc_type (if it’s in a frozen state), ppr_lien_type (related lien information), location_type (where the manufactured home is located), and civic_address (its physical address) all contribute to this overhead. Even the staff_account_id can add to the load if its retrieval involves complex lookups. When a staff member accesses the endpoint with no filters, the view has to process all of these derived properties for potentially thousands of MHR numbers, leading to the observed 7-10 second delay. The acceptable 0.1-second subquery to get the MHR numbers is quickly overshadowed by the extensive work the view then has to do to enrich each of those numbers with all the necessary details. This constant, on-demand computation of summary data is precisely what makes the current data retrieval process inefficient and ripe for MHR API optimization.

The Strategic Solution: Denormalization and Caching for MHR Data

Now that we've pinpointed the core of the performance bottleneck within the MHR API's account registrations endpoint, it's time to talk about a strategic solution. The key to dramatically improving data retrieval speeds is to implement a combination of denormalization and strategic caching. Instead of recalculating all the essential registration summary data every single time a request is made, we can proactively store this derived information in a more accessible format, ready for quick retrieval.

Our proposed solution involves creating either new database columns or an entirely new table specifically designed to hold this pre-computed summary data. Denormalization, in simple terms, means adding redundant data or grouping data in a way that speeds up read operations, even if it means using a bit more storage space or introducing a slight overhead during writes. For read-heavy operations like viewing a list of registrations, this trade-off is often well worth it. By storing these derived properties directly, we eliminate the need for the complex joins and calculations that currently plague the mhr_account_reg_vw view. This means when staff query the /mhr/api/v1/registrations endpoint, the database can simply fetch the already-prepared summary information rather than embarking on a time-consuming computation journey.

The specific properties identified as critical for this pre-computation include submitting_name, owner_names, registering_name, last_doc_type, note_status, note_expiry, cancel_doc_type, frozen_doc_type, ppr_lien_type, location_type, civic_address, and staff_account_id. These are precisely the pieces of information that, when derived from multiple tables, contribute heavily to the current slowdown. By making them readily available, we can significantly boost the MHR API optimization efforts. For instance, owner_names, which often involves multiple joins and string concatenation, can be pre-calculated and stored as a single text field. Similarly, the various status indicators and date fields can be maintained directly in this new structure, ensuring consistent and rapid access. This approach leverages the power of optimized data storage for faster read performance, fundamentally changing how the MHR API interacts with the underlying database to present MHR account registrations data.

Implementing the Solution: Real-time Updates and API Integration

Implementing this strategic solution involves more than just creating new database columns or a new table; it requires a robust mechanism for maintaining data integrity and seamlessly integrating with the existing MHR API. The core of the implementation will revolve around ensuring that the new registration summary data is always up-to-date for every MHR number. This means adopting an event-driven approach: whenever a registration is created, updated, or modified in any way, the corresponding summary data in our new structure must be refreshed immediately.

This update process should occur at the end of every relevant transaction. For example, after a new registration is successfully recorded, or an existing one has its ownership, location, or status updated, a trigger or an event handler would fire. This process would then gather all the necessary derived properties (submitting_name, owner_names, registering_name, etc.) and write them to the designated new columns or the new summary table. This ensures that the pre-computed data always reflects the latest state of the MHR registration, maintaining crucial data consistency without manual intervention. It's like having a dedicated assistant constantly compiling the latest headlines, so when you ask for them, they're instantly available.

An intriguing possibility, and one that offers significant advantages, is to store all this summary information as a JSON data type within a single column. This approach has several benefits. Firstly, it allows us to store the data in a format that directly matches the schema returned by the /mhr/api/v1/registrations endpoint. This