Thank you for putting all of this information together. Is there any sort of consensus on a v1.0 release target, future release cadence, etc? This is really exciting stuff.
Thanks for summing up the features. It's great to give users visibility on the planned topics. Most of these features haven't been discussed yet. We have to discuss first, then reach a consensus, and then implement and finish it. I'd rather see only the list of features and some ordering within that list. Especially when we have mostly automated releases, the release cadence can (and should IMHO) be quite high - definitely much higher than "every three months". This will render "target version numbers" or related milestones invalid. This also raises the question of how many major/minor versions shall be maintained or whether there's just one version being maintained. As for when a feature is finally merged into "main" and released: I'm a big fan of "it's ready when it's ready" and of not tying anything to any particular date. Deadline-Driven-Development leads to compromises, to tech-debt, to unnecessary issues and eventually more work than necessary.
I added S3 Request Signing to the roadmap table, with a best-effort target of the 1.0.0 release.
I updated the document to use
Over the past months, we've collaborated with a wide range of stakeholders—companies, developers, and users—who are invested in the evolution of Apache Polaris. This roadmap consolidates those insights into a shared vision, ensuring that our efforts address the most impactful and widely supported improvements. We appreciate the valuable feedback and collaboration that have shaped this direction.
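The roadmap items can be broadly classified into several categories: Core Polaris Functions; Catalog Federation and Integrations; Data Security, Data Governance and Compliance; Observability, Telemetry and Reliability; and AI/ML.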
Feature Proposal List
*This is a tentative proposed version with these features.
Core Polaris Functions
Foreign Tables and Delta Format Support
Polaris enables support for non-Iceberg table formats through the concept of Foreign Tables. These tables behave similarly to regular tables but include an additional format attribute that defines the table format type. This approach not only opens up flexibility to support additional formats but also sets the stage for enhancing Polaris’ capabilities.
For example, while Polaris currently supports the Iceberg REST catalog for managing and querying large datasets, incorporating support for additional formats—such as Delta—would further extend its capabilities. Delta format support would allow for enhanced governance, compliance, data management, disaster recovery, and migrations within the same Catalog. This enhancement involves generating Iceberg metadata to read Delta tables and enabling both Delta read and write operations from engines like Apache Spark.
Milestone: 1.1
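To make the "additional format attribute" idea concrete, here is a minimal sketch of how a catalog entry might distinguish regular and foreign tables. The `TableEntity` class, its fields, and the `describe` helper are hypothetical illustrations, not part of any Polaris API; the actual design may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableEntity:
    """Hypothetical catalog entry; not an actual Polaris class."""
    namespace: str
    name: str
    location: str
    # Assumption: None means a regular Iceberg table; any other
    # non-Iceberg value marks the entry as a foreign table.
    format: Optional[str] = None

    @property
    def is_foreign(self) -> bool:
        return self.format is not None and self.format.lower() != "iceberg"


def describe(table: TableEntity) -> str:
    """Return a one-line summary that a catalog listing might show."""
    kind = f"foreign ({table.format})" if table.is_foreign else "iceberg"
    return f"{table.namespace}.{table.name}: {kind}"


orders = TableEntity("sales", "orders", "s3://bucket/sales/orders")
events = TableEntity("sales", "events", "s3://bucket/sales/events", format="delta")
print(describe(orders))
print(describe(events))
```

Keeping the format as a plain attribute on an otherwise ordinary table entry is what lets foreign tables "behave similarly to regular tables" in this sketch: listing, naming, and location handling need no special cases.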
Policy Store
Apache Polaris' support for a Policy Store allows it to serve as a centralized repository for all policies related to data assets, ensuring consistent governance and compliance across the organization. This includes policies for table maintenance, access control, data security, and overall data governance, enabling administrators to easily enforce, track, and audit these policies. By consolidating policy management in Polaris, organizations can streamline their data management processes while maintaining compliance and security standards.
More details: Policy Management in Apache Polaris
Milestone: 1.0
Table Maintenance Framework
The Table Maintenance Framework adds the ability to store the table maintenance policies, properties, statistics, and events needed to perform table maintenance and optimization. It does not include the actual table maintenance operations, which need to run on compute infrastructure. More details: Table Maintenance in Polaris
Milestone: 1.1
SQL and NoSQL Persistence
Enable SQL (e.g. Postgres) and NoSQL (e.g. DynamoDB, Cassandra, etc.) persistence storage backends for Polaris. More details: Apache Polaris (incubating) - SQL/NoSQL persistence backend support
Milestone: 1.0
S3-compatible storage support
Support S3-compatible storage such as MinIO, Ceph, and Dell ECS. More details are in #389.
Milestone: 1.0
Catalog Browser experience (UI)
User experience and interface for Apache Polaris. Enable users to browse catalogs, databases, and tables, and provide basic operations for policy management and other governance functions. #572 is the corresponding issue.
Milestone: 1.5+
Catalog Federation and Integrations
Catalog Federation
Enable federation of reads and writes to any remote catalog, making Apache Polaris a Catalog of Catalogs. This primarily includes catalogs that support the IRC (Iceberg REST Catalog) and Hive protocols. Some details: Polaris Roadmap and Catalog Federation Diagrams
Milestone: 1.1
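The "Catalog of Catalogs" idea can be pictured as a thin routing layer that forwards each request to the remote catalog that owns it. The sketch below is only an illustration under that assumption: `RemoteCatalog`, `InMemoryCatalog`, and `FederatedCatalog` are hypothetical names, and the in-memory class stands in for a real remote IRC or Hive endpoint.

```python
from typing import Dict, List, Protocol


class RemoteCatalog(Protocol):
    """Minimal interface a federated backend is assumed to expose."""

    def list_tables(self, namespace: str) -> List[str]: ...


class InMemoryCatalog:
    """Stand-in for a remote IRC or Hive catalog (hypothetical)."""

    def __init__(self, tables: Dict[str, List[str]]) -> None:
        self._tables = tables

    def list_tables(self, namespace: str) -> List[str]:
        return sorted(self._tables.get(namespace, []))


class FederatedCatalog:
    """Routes each request to the registered remote catalog by name."""

    def __init__(self) -> None:
        self._remotes: Dict[str, RemoteCatalog] = {}

    def register(self, name: str, catalog: RemoteCatalog) -> None:
        self._remotes[name] = catalog

    def list_tables(self, catalog_name: str, namespace: str) -> List[str]:
        if catalog_name not in self._remotes:
            raise KeyError(f"unknown catalog: {catalog_name}")
        return self._remotes[catalog_name].list_tables(namespace)


federated = FederatedCatalog()
federated.register("hive_prod", InMemoryCatalog({"sales": ["orders", "customers"]}))
federated.register("irc_lab", InMemoryCatalog({"ml": ["features"]}))
print(federated.list_tables("hive_prod", "sales"))
```

Only reads are shown here; federated writes would additionally need commit semantics that this sketch does not attempt to model.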
Catalog Migrator
Users may want to move Iceberg tables from other catalog solutions into Apache Polaris. The Catalog Migrator enables migration of tables registered in catalogs such as Glue, Hive, or other Iceberg REST catalogs into Apache Polaris.
Milestone: 1.0
Data Security, Data Governance and Compliance
Governance Policies for Tables
Polaris will provide the ability to define access policies and other governance policies (such as retention) per table. More details: Policy Management in Apache Polaris
Milestone: 1.2
Column level and Row level Policies
Provide the capability to define and enforce column-level and row-level access policies and other governance policies. More details: Policy Management in Apache Polaris
Milestone: 1.2
Identity federation, SCIM, SSO and OAuth support
Supporting SCIM and SAML is essential for efficient user provisioning, seamless access management, and enhanced security, ensuring that users can securely access and manage data resources while complying with organizational policies. This also enables easy identity federation and OAuth federation with third-party identity providers. More details: Adding Federated User and Role Support in Polaris
Milestone: 1.0
Audit and Events Interface
Enable audit logs and history for Catalog, Database, Table, Property, and Policy changes through an events interface. Initial spec details: Polaris Event Listeners
Milestone: 1.2
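As a rough illustration of how an events interface could feed an audit log, the sketch below publishes change events to subscribed listeners. `ChangeEvent`, `EventBus`, and all field names are hypothetical and are not taken from the Polaris Event Listeners spec.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical record of one change to a catalog entity."""
    entity_type: str   # e.g. "catalog", "table", "policy"
    entity_name: str
    action: str        # e.g. "create", "update_properties", "drop"
    principal: str     # who made the change
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class EventBus:
    """Delivers every published event to all subscribed listeners."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[ChangeEvent], None]] = []

    def subscribe(self, listener: Callable[[ChangeEvent], None]) -> None:
        self._listeners.append(listener)

    def publish(self, event: ChangeEvent) -> None:
        for listener in self._listeners:
            listener(event)


# An audit log here is just a listener that records each event in order.
audit_log: List[ChangeEvent] = []
bus = EventBus()
bus.subscribe(audit_log.append)
bus.publish(ChangeEvent("table", "sales.orders", "update_properties", "alice"))
bus.publish(ChangeEvent("policy", "retention_90d", "create", "bob"))
```

Treating the audit log as one listener among many keeps auditing decoupled from the code paths that make the changes, so other consumers can subscribe to the same events.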
Data Lineage
Data Lineage functionality allows users to trace the flow of data across different systems and tables, providing visibility into its origin and usage. This feature enhances data governance, auditability, and troubleshooting by visually representing the data's lifecycle from source to destination. This includes Table and Column lineages.
Milestone: 1.5+
Data Tagging and Classification
Enable categorizing and labeling data assets based on predefined criteria, such as data type, sensitivity, or usage. This helps organizations efficiently organize, search, and secure their data by assigning meaningful tags and classifications, enabling better governance and compliance management. Through this feature, users can quickly locate relevant data and ensure appropriate access controls are in place.
Milestone: 1.5+
Encryption Support
Enable support for encrypted Iceberg tables by managing Key Management Service (KMS) integrations. Facilitate the vending of encryption keys, ensuring seamless key retrieval and rotation for standard KMS solutions.
Milestone: 1.3
Observability, Telemetry and Reliability
Data Lake Operational Metrics
Enable operational metrics on Catalog, Databases and Tables to enable operational manageability of the Data Lake. This includes
Milestone: 1.5+
Data Health Monitoring and Alerts
Enable capabilities to monitor and alert on health of the data including
Milestone: 1.5+
AI/ML
Volumes/Directory Tables
A table-like entity such as a volume can be used for organizing and managing unstructured data. Volumes provide a way to group related data files logically, similar to directories or containers. More details: Unstructured Data Support in Polaris
Milestone: 1.5+