A collection of examples demonstrating the Apache Iceberg Java API for table format operations, schema evolution, and data management.
Apache Iceberg is an open table format for huge analytic datasets. This project provides practical examples of using Iceberg's Java API to:
- Create and manage table catalogs
- Define and evolve table schemas
- Perform data operations
- Understand Iceberg's core concepts
## Prerequisites

- Java 11 or higher
- Gradle 8.5 or higher (included via wrapper)
## Building and Running

Build the project:

```bash
./gradlew build
```

Run the main examples class:

```bash
./gradlew run
```

Run specific example classes:

```bash
# Data operations example
./gradlew runDataOperations

# Schema evolution example
./gradlew runSchemaEvolution
```

Run the tests:

```bash
./gradlew test
```

## Examples

### IcebergExamples

- Schema definition and creation (see the sketch after this list)
- Iceberg data type examples
- Schema field inspection
- Understanding schema structure and properties
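A minimal sketch of defining a schema and inspecting its fields; the field names, IDs, and types here are illustrative rather than taken from `IcebergExamples.java`:

```java
import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

public class SchemaSketch {
    public static void main(String[] args) {
        // Field IDs must be unique within the schema.
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.required(2, "name", Types.StringType.get()),
            Types.NestedField.optional(3, "created_at", Types.TimestampType.withZone()));

        // Walk the top-level fields and print name, type, and nullability.
        for (Types.NestedField field : schema.columns()) {
            System.out.printf("%s: %s (required=%b)%n",
                field.name(), field.type(), field.isRequired());
        }
    }
}
```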
### DataOperationsExample

- Creating sample records (see the sketch after this list)
- Working with GenericRecord API
- Understanding data writing concepts
- Table information display
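A sketch of building records with the `GenericRecord` API; the schema and values are illustrative, and `GenericRecord` lives in the iceberg-data module:

```java
import org.apache.iceberg.Schema;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.types.Types;

public class RecordSketch {
    public static void main(String[] args) {
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.required(2, "name", Types.StringType.get()));

        // Create a record bound to the schema and set fields by name.
        GenericRecord record = GenericRecord.create(schema);
        record.setField("id", 1L);
        record.setField("name", "alice");

        // copy() clones the record, which helps when building batches.
        Record next = record.copy();
        next.setField("id", 2L);
    }
}
```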
### SchemaEvolutionExample

- Adding new columns (see the sketch after this list)
- Renaming existing columns
- Updating column types (safe operations)
- Deleting columns
- Schema compatibility rules
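A sketch of chaining several evolution operations into one atomic commit on an already-loaded `Table`; the column names are hypothetical:

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.types.Types;

public class EvolutionSketch {
    static void evolve(Table table) {
        // All pending changes are applied atomically by the final commit().
        table.updateSchema()
            .addColumn("email", Types.StringType.get())
            .renameColumn("name", "full_name")
            .updateColumn("id", Types.LongType.get()) // safe widening, e.g. int -> long
            .deleteColumn("legacy_flag")
            .commit();
    }
}
```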
## Project Structure

```
src/
├── main/
│   └── java/
│       └── com/
│           └── example/
│               └── iceberg/
│                   ├── IcebergExamples.java         # Main examples class
│                   ├── DataOperationsExample.java   # Data operations
│                   └── SchemaEvolutionExample.java  # Schema evolution
└── test/
    └── java/
        └── com/
            └── example/
                └── iceberg/
                    └── IcebergExamplesTest.java     # Unit tests
```
## Dependencies

The project uses the following libraries (a sample dependency block follows the list):

- Apache Iceberg Core: Table format and core API functionality
- Apache Iceberg API: Public API interfaces and types
- SLF4J: Logging framework
- JUnit 5: Testing framework
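If you are assembling a similar build yourself, a dependency block might look like the following sketch, assuming the Groovy DSL; the version numbers are illustrative placeholders, not the ones this project pins:

```groovy
dependencies {
    // Versions below are illustrative; align them with the versions
    // this project's build.gradle actually declares.
    implementation "org.apache.iceberg:iceberg-core:1.5.0"
    implementation "org.apache.iceberg:iceberg-api:1.5.0"
    implementation "org.slf4j:slf4j-api:2.0.9"
    testImplementation "org.junit.jupiter:junit-jupiter:5.10.0"
}
```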
**Note:** These examples focus on demonstrating Iceberg's schema and data type APIs. For production table operations, you would typically add catalog implementations (Hadoop, Hive, REST, etc.) and file format dependencies (Parquet, ORC, etc.).

## Extending the Examples

These examples demonstrate Iceberg's core schema and data APIs. For complete table operations, consider the options below (a minimal catalog sketch follows the catalog list).

### Catalog Options
- Hadoop Catalog: For HDFS or local filesystem
- Hive Metastore: For integration with existing Hive setups
- REST Catalog: For modern cloud-native deployments
- JDBC Catalog: For SQL-based metadata storage
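A minimal sketch of creating a table through a Hadoop catalog on the local filesystem; the warehouse path and table identifier are placeholders, and this needs iceberg-core plus a Hadoop client on the classpath:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.hadoop.HadoopCatalog;

public class CatalogSketch {
    static Table createLocalTable(Schema schema) {
        // "/tmp/warehouse" and "db.events" are placeholder values, not
        // paths used by this project's examples.
        HadoopCatalog catalog = new HadoopCatalog(new Configuration(), "/tmp/warehouse");
        return catalog.createTable(TableIdentifier.of("db", "events"), schema);
    }
}
```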
### File Formats

- Parquet: Most common format for analytics workloads
- ORC: Alternative columnar format
- Avro: For schema evolution scenarios
### Storage Options

- HDFS: For on-premises Hadoop clusters
- Amazon S3: For AWS environments (see the sketch after this list)
- Azure Data Lake Storage: For Azure environments
- Google Cloud Storage: For GCP environments
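As one illustration, a REST catalog writing to S3 might be configured as follows; the URI, bucket, and catalog name are placeholders, and `S3FileIO` comes from the separate iceberg-aws dependency:

```java
import java.util.Map;

import org.apache.iceberg.CatalogProperties;
import org.apache.iceberg.rest.RESTCatalog;

public class S3CatalogSketch {
    static RESTCatalog connect() {
        RESTCatalog catalog = new RESTCatalog();
        // All values below are placeholders for a real deployment.
        catalog.initialize("examples", Map.of(
            CatalogProperties.URI, "https://catalog.example.com",
            CatalogProperties.WAREHOUSE_LOCATION, "s3://my-bucket/warehouse",
            CatalogProperties.FILE_IO_IMPL, "org.apache.iceberg.aws.s3.S3FileIO"));
        return catalog;
    }
}
```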
## Important Notes

**Scope of Examples:** These examples focus on demonstrating Iceberg's schema and data type APIs without requiring external infrastructure. They are educational examples showing (a type-system sketch follows this list):
- How to define and work with schemas
- Iceberg's rich type system
- Schema evolution concepts and rules
- Record creation and manipulation
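For instance, Iceberg's list, map, and struct types compose freely; this sketch uses illustrative field names and IDs:

```java
import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

public class TypeSystemSketch {
    // Every nested element needs its own unique field ID.
    static final Schema NESTED = new Schema(
        Types.NestedField.required(1, "id", Types.LongType.get()),
        Types.NestedField.optional(2, "tags",
            Types.ListType.ofOptional(3, Types.StringType.get())),
        Types.NestedField.optional(4, "attributes",
            Types.MapType.ofOptional(5, 6,
                Types.StringType.get(), Types.StringType.get())),
        Types.NestedField.optional(7, "location", Types.StructType.of(
            Types.NestedField.required(8, "lat", Types.DoubleType.get()),
            Types.NestedField.required(9, "lon", Types.DoubleType.get()))));
}
```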
**For Production Use:** To build complete Iceberg applications, you'll need to add:
- Catalog implementations for metadata storage
- File format dependencies (Parquet, ORC, etc.)
- Storage system integration (HDFS, S3, etc.)
- Compute engine integration (Spark, Flink, Trino)
**Data Operations:** The data examples show record creation and structure but don't perform actual table I/O. For production data operations, use (see the sketch after this list):

- Iceberg's `AppendFiles` API for direct writes
- Compute engines like Apache Spark, Apache Flink, or Trino
- Proper transaction and commit handling
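A minimal sketch of the `AppendFiles` flow, assuming a `DataFile` has already been written through other means (e.g. an Iceberg file appender):

```java
import org.apache.iceberg.DataFile;
import org.apache.iceberg.Table;

public class AppendSketch {
    // The append is staged on newAppend() and becomes visible to readers
    // atomically when commit() succeeds.
    static void append(Table table, DataFile dataFile) {
        table.newAppend()
            .appendFile(dataFile)
            .commit();
    }
}
```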
## License

This project is licensed under the same terms as the Apache Iceberg project.