Parquet Bundle: File-Level Legend Persistence

by Alex Johnson

This article covers a feature request for file-level persistence of legend settings in .parquetbundle files. Today, legend customizations are lost when the browser is closed or when the data file is shared. Embedding the settings directly in the .parquetbundle file lets users share configured visualizations and keep their customizations across sessions and between collaborators.

The Problem: Ephemeral Legend Customizations

Currently, users can customize legends within a browser session, thanks to the previously implemented browser-level persistence. However, these customizations do not outlast the session and do not travel with the data file. This feature addresses the loss of customizations in two scenarios:

  • When the browser is closed, all legend settings are reset to default.
  • When a .parquetbundle file is shared with a colleague, they do not see the same legend configurations.

This lack of persistence hinders collaboration and requires users to repeatedly reconfigure legends, leading to a less efficient workflow. Therefore, a solution is needed to save these visualization customizations permanently and make them easily shareable.

Proposed Solution: Embedding Settings in .parquetbundle

The proposed solution extends the existing legend settings persistence so that customizations can be saved to and loaded from .parquetbundle files. Bundling the visualization settings with the data lets users share configured visualizations and restore their work in later sessions. This requires extending the file structure to hold the settings.

To achieve this, the following settings need to be persisted within the file, per feature:

  • legendOrder: An array defining the display order of category names.
  • shapeSize: The point size used for visualizing the feature.
  • maxLegendItems: The maximum number of items displayed before categories are grouped into an "Other" category.
  • extractedItems: A list of categories extracted from the "Other" category.
  • toggledLabels: A map indicating the visibility state of each category.
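
As a rough illustration, the per-feature settings listed above could be modeled as follows. The value types (a number for shapeSize, booleans in toggledLabels), the keying of settings by feature name, and the example feature `species` are assumptions made for this sketch; the feature request names the fields but does not fix their types.

```typescript
// Hypothetical shape of the legend settings stored for one feature.
interface LegendSettings {
  legendOrder: string[];                  // display order of category names
  shapeSize: number;                      // point size (numeric type assumed)
  maxLegendItems: number;                 // items shown before grouping into "Other"
  extractedItems: string[];               // categories pulled out of "Other"
  toggledLabels: Record<string, boolean>; // visibility per category (boolean assumed)
}

// Assumption: settings are stored per feature, keyed by feature name.
type BundleSettings = Record<string, LegendSettings>;

const example: BundleSettings = {
  species: {
    legendOrder: ["human", "mouse", "yeast"],
    shapeSize: 30,
    maxLegendItems: 10,
    extractedItems: ["yeast"],
    toggledLabels: { human: true, mouse: false, yeast: true },
  },
};

// The settings serialize to plain JSON, which is what would be embedded in the file.
const serialized: string = JSON.stringify(example, null, 2);
```

A structure like this stays small: the serialized example above is a few hundred characters.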

Functional Requirements

Several functional requirements must be met to ensure the successful implementation of this feature:

  • FR-1: The system must be able to add the legend settings to the .parquetbundle file structure during export or download.
  • FR-2: The system must be able to read and apply the settings from a .parquetbundle file during import or upload.
  • FR-3: The system must gracefully handle .parquetbundle files that do not contain any settings, using default values in such cases.
  • FR-4: Users should have the option to export .parquetbundle files with or without the custom legend settings.

These requirements ensure that the feature is both functional and user-friendly.
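
The four requirements can be sketched in a few lines. The `Bundle` interface below is a hypothetical in-memory stand-in, not the actual .parquetbundle layout, and the function names `exportBundle` and `importSettings` are invented for illustration.

```typescript
// Hypothetical in-memory view of a bundle; the real on-disk layout is not modeled here.
interface Bundle {
  data: Uint8Array;      // the Parquet payload, passed through unchanged
  settingsJson?: string; // serialized legend settings, absent in older files
}

type FeatureSettings = Record<string, unknown>;

// FR-1 and FR-4: attach settings on export only when the user asks for them.
function exportBundle(
  data: Uint8Array,
  settings: Record<string, FeatureSettings>,
  includeSettings: boolean,
): Bundle {
  return includeSettings
    ? { data, settingsJson: JSON.stringify(settings) }
    : { data };
}

// FR-2 and FR-3: read settings when present; otherwise signal that defaults apply.
// Note: malformed JSON would throw here; this sketch covers only the
// "settings present" and "settings absent" cases.
function importSettings(bundle: Bundle): {
  settings: Record<string, FeatureSettings>;
  hasCustomSettings: boolean;
} {
  if (bundle.settingsJson === undefined) {
    return { settings: {}, hasCustomSettings: false };
  }
  return {
    settings: JSON.parse(bundle.settingsJson) as Record<string, FeatureSettings>,
    hasCustomSettings: true,
  };
}
```

Returning a `hasCustomSettings` flag alongside the settings is one way to let the interface show whether a loaded file was pre-configured.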

User Interface Changes

To support this new feature, the user interface will require a few key changes:

  • UI-1: An "Export Parquet" option should be available for exporting the data without legend settings.
  • UI-2: An "Export Parquet with Settings" option should be added, providing users with control over whether to include the customizations.
  • UI-3: The interface should clearly indicate when a loaded file contains custom settings, so users are aware that their visualization has been pre-configured.

These UI enhancements will make the feature intuitive and easy to use.

Acceptance Criteria

To ensure that the implemented solution meets the desired standards, the following acceptance criteria must be satisfied:

  • An exported .parquetbundle file should correctly contain the legend settings.
  • Loading a .parquetbundle file with settings should automatically apply those settings to the visualization.
  • Settings should persist through a full round-trip: configure settings, export the file, close and reopen the browser, open the file, and verify that the settings are restored.
  • The system should gracefully handle invalid or corrupted settings files, falling back to default settings and logging a warning message.
  • The increase in file size due to the added settings should be minimal (typically less than 5 KB for the settings JSON).
  • The export dialog or option should clearly indicate whether settings will be included in the exported file.
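
The criterion on invalid or corrupted settings could be met with a defensive loader along these lines. This is a minimal sketch: the function names, the warning texts, and the assumed value types are illustrative, and an empty result means "use defaults".

```typescript
interface LegendSettings {
  legendOrder: string[];
  shapeSize: number;
  maxLegendItems: number;
  extractedItems: string[];
  toggledLabels: Record<string, boolean>;
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Structural check for one feature's settings (field types are assumptions).
function isLegendSettings(v: unknown): v is LegendSettings {
  if (typeof v !== "object" || v === null || Array.isArray(v)) return false;
  const o = v as Record<string, unknown>;
  const labels = o["toggledLabels"];
  return (
    isStringArray(o["legendOrder"]) &&
    typeof o["shapeSize"] === "number" &&
    typeof o["maxLegendItems"] === "number" &&
    isStringArray(o["extractedItems"]) &&
    typeof labels === "object" && labels !== null && !Array.isArray(labels) &&
    Object.values(labels as Record<string, unknown>).every((x) => typeof x === "boolean")
  );
}

// Parse settings without ever throwing; invalid input yields {} plus a warning.
function loadSettings(
  raw: string,
  warn: (msg: string) => void = (msg) => console.warn(msg),
): Record<string, LegendSettings> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    warn("Legend settings are not valid JSON; using defaults.");
    return {};
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    warn("Legend settings have an unexpected shape; using defaults.");
    return {};
  }
  const result: Record<string, LegendSettings> = {};
  for (const [feature, value] of Object.entries(parsed as Record<string, unknown>)) {
    if (isLegendSettings(value)) {
      result[feature] = value;
    } else {
      warn(`Ignoring invalid legend settings for feature "${feature}".`);
    }
  }
  return result;
}
```

Validating each feature separately means one damaged entry does not discard the settings of the others.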

Testing Scenarios

To thoroughly test the new feature, the following testing scenarios should be considered:

  1. Round-trip persistence: Configure the legend settings, export the .parquetbundle file, close the browser, reopen the browser, open the exported file, and verify that the settings are restored correctly.
  2. Importing: Load .parquetbundle files both with and without legend settings to ensure that the settings are either loaded correctly or default settings are applied as expected.
  3. Corrupted settings: Load a .parquetbundle file containing invalid JSON in the settings to verify that the system warns the user and uses default settings.
  4. Partial settings: Load a .parquetbundle file with settings for some features only to ensure that a mix of default and custom settings is correctly applied.
  5. Export options: Export the same data with and without settings to confirm that the resulting files differ appropriately.
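
Scenario 4 (partial settings) amounts to merging stored values over defaults, feature by feature. The sketch below shows one way to do that; the default values and the names `makeDefaults` and `resolveSettings` are hypothetical.

```typescript
interface LegendSettings {
  legendOrder: string[];
  shapeSize: number;
  maxLegendItems: number;
  extractedItems: string[];
  toggledLabels: Record<string, boolean>;
}

// Hypothetical defaults; a fresh object per call avoids sharing arrays between features.
function makeDefaults(): LegendSettings {
  return {
    legendOrder: [],
    shapeSize: 30,
    maxLegendItems: 10,
    extractedItems: [],
    toggledLabels: {},
  };
}

// For every feature in the data, start from defaults and overlay whatever the file stored.
function resolveSettings(
  features: string[],
  stored: Record<string, Partial<LegendSettings>>,
): Record<string, LegendSettings> {
  const resolved: Record<string, LegendSettings> = {};
  for (const feature of features) {
    resolved[feature] = { ...makeDefaults(), ...(stored[feature] ?? {}) };
  }
  return resolved;
}

// Usage: only "species" has stored settings, and only for two fields.
const resolved = resolveSettings(["species", "family"], {
  species: { shapeSize: 50, legendOrder: ["b", "a"] },
});
```

Here `species` gets the stored shapeSize and legendOrder with default values for the remaining fields, while `family` is entirely default.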

Migration Considerations

When implementing this feature, it's important to consider the following migration aspects:

  • Existing .parquetbundle files should remain fully compatible with the new version.
  • Initially, no changes are required to the Python backend, as this is primarily a web-based feature.
  • In the future, the Python backend can be updated to also read and write the visualization_settings.json file that holds the legend settings.

Alternatives Considered

Currently, no alternative solutions have been considered for this feature request.

Additional Context

No additional context has been provided for this feature request.

Conclusion

File-level persistence for legend settings in .parquetbundle files lets users share configured visualizations and keep their customizations across sessions, so legends no longer need to be reconfigured after every restart or handoff. The requirements, acceptance criteria, and testing scenarios above define what a complete implementation must cover.

Learn more about the Parquet file format on the Apache website.