Merged
29 commits
77590ed
Add index directives for database and DataJoint terms throughout the …
claude Dec 12, 2025
7a9c568
Add .gitignore for build artifacts and dependencies
claude Dec 12, 2025
ba0b4a1
Merge pull request #18 from dimitri-yatsenko/claude/fix-glossary-inde…
dimitri-yatsenko Dec 12, 2025
72db81e
clean up
dimitri-yatsenko Dec 12, 2025
5c2da11
Rename index.md to genindex.md to avoid slug conflict
claude Dec 12, 2025
db7c703
Remove index functionality
claude Dec 12, 2025
a990e0d
Merge pull request #19 from dimitri-yatsenko/claude/fix-glossary-inde…
dimitri-yatsenko Dec 12, 2025
8d4d9ab
Merge branch 'main' of github.com:datajoint/datajoint-book
dimitri-yatsenko Dec 12, 2025
695426b
Move Data Integrity chapter to Concepts section
claude Dec 12, 2025
e301359
Merge pull request #20 from dimitri-yatsenko/claude/reorganize-data-i…
dimitri-yatsenko Dec 12, 2025
184f608
Expand Master-Part Relationship chapter with ACID transactions and ex…
claude Dec 12, 2025
02a012e
Harmonize Computation chapter with Master-Part chapter
claude Dec 12, 2025
a520791
Rename Blob Detection chapter and update references
claude Dec 12, 2025
3013eaf
Rename Master-Part Relationships chapter to Master-Part
claude Dec 12, 2025
0d7946a
Move Caching chapter to Special Topics section
claude Dec 12, 2025
fd28e5b
Merge pull request #21 from dimitri-yatsenko/claude/master-part-relat…
dimitri-yatsenko Dec 12, 2025
3b832c5
Reorganize documentation structure
claude Dec 12, 2025
04e6200
Merge pull request #22 from dimitri-yatsenko/claude/reorganize-docs-s…
dimitri-yatsenko Dec 12, 2025
69c5703
Merge Computations section into Operations
claude Dec 12, 2025
ccdac43
Merge pull request #23 from dimitri-yatsenko/claude/merge-computation…
dimitri-yatsenko Dec 12, 2025
0ce5301
Document the anatomy of a make function
claude Dec 12, 2025
f6d93a0
Add dedicated chapter for make method anatomy
claude Dec 12, 2025
bcb19d5
Merge pull request #24 from dimitri-yatsenko/claude/document-make-fun…
dimitri-yatsenko Dec 12, 2025
5abba57
Complete Queries section with all query operators
claude Dec 12, 2025
2739109
Merge pull request #25 from dimitri-yatsenko/claude/complete-queries-…
dimitri-yatsenko Dec 12, 2025
6d3bffd
Combine and simplify query chapters
claude Dec 12, 2025
0dce78e
Merge pull request #26 from dimitri-yatsenko/claude/simplify-query-ch…
dimitri-yatsenko Dec 12, 2025
24e6262
Merge subqueries chapter into Restriction and rename Query Operators
claude Dec 12, 2025
cc83926
Merge pull request #27 from dimitri-yatsenko/claude/merge-subquery-re…
dimitri-yatsenko Dec 12, 2025
23 changes: 23 additions & 0 deletions .gitignore
@@ -0,0 +1,23 @@
# Build output
_build/

# Node.js dependencies
node_modules/

# Python
__pycache__/
*.py[cod]
.ipynb_checkpoints/

# Environment
.env
.venv/
venv/

# IDE
.vscode/
.idea/

# OS files
.DS_Store
Thumbs.db
184 changes: 0 additions & 184 deletions SIMPLIFICATION_RECOMMENDATIONS.md

This file was deleted.

2 changes: 1 addition & 1 deletion book/00-introduction/05-executive-summary.md
@@ -24,7 +24,7 @@ Unlike Entity-Relationship modeling that requires translation to SQL, DataJoint
Foreign keys in DataJoint do more than enforce referential integrity—they encode computational dependencies. A computed result that references raw data will be automatically deleted if that raw data is removed, preventing stale or orphaned results. This maintains *computational validity*, not just *referential integrity*.

**Declarative Computation**
Computations are defined declaratively through `make()` methods attached to table definitions. The `populate()` operation identifies all missing results and executes computations in dependency order. Parallelization, error handling, and job distribution are handled automatically.
Computations are defined declaratively through make() methods attached to table definitions. The populate() operation identifies all missing results and executes computations in dependency order. Parallelization, error handling, and job distribution are handled automatically.

**Immutability by Design**
Computed results are immutable. Correcting upstream data requires deleting dependent results and recomputing—ensuring the database always represents a consistent computational state. This naturally provides complete provenance: every result can be traced to its source data and the exact code that produced it.
8 changes: 4 additions & 4 deletions book/20-concepts/00-databases.md
@@ -25,7 +25,7 @@ Databases are crucial for the smooth and organized operation of various entities
## Database Management Systems (DBMS)

```{card} Database Management System
A Database Management System is a software system that serves as the computational engine powering a database.
A Database Management System (DBMS) is a software system that serves as the computational engine powering a database.
It defines and enforces the structure of the data, ensuring that the organization's rules are consistently applied.
A DBMS manages data storage and efficiently executes data updates and queries while safeguarding the data's structure and integrity, particularly in environments with multiple concurrent users.

@@ -50,7 +50,7 @@ One of the most critical features distinguishing databases from simple file stor

### Authentication and Authorization

Before you can work with a database, you must **authenticate**—prove your identity with a username and password. Once authenticated, the database enforces **authorization** rules that determine what you can do:
Before you can work with a database, you must **authentication**—prove your identity with a username and password. Once authenticated, the database enforces **authorization** rules that determine what you can do:

- **Read**: View specific tables or columns
- **Write**: Add new data to certain tables
@@ -80,10 +80,10 @@ Modern databases typically separate data management from data use through distin

### Common Architectures

**Server-Client Architecture** (most common): A database server program manages all data operations, while client programs (your scripts, applications, notebooks) connect to request data or submit changes. The server enforces all rules and access permissions consistently for every client. This is like a library where the librarian (server) manages the books and enforces checkout policies, while patrons (clients) request materials.
**Server-client architecture** (most common): A database server program manages all data operations, while client programs (your scripts, applications, notebooks) connect to request data or submit changes. The server enforces all rules and access permissions consistently for every client. This is like a library where the librarian (server) manages the books and enforces checkout policies, while patrons (clients) request materials.
The two most popular open-source relational database systems: MySQL and PostgreSQL implement a server-client architecture.

**Embedded Databases**: The database engine runs within your application itself—no separate server. This works for single-user applications like mobile apps or desktop software, but doesn't support multiple users accessing shared data simultaneously.
**Embedded databases**: The database engine runs within your application itself—no separate server. This works for single-user applications like mobile apps or desktop software, but doesn't support multiple users accessing shared data simultaneously.
SQLite is a common embedded database @10.14778/3554821.3554842.

**Distributed Databases**: Data and processing are spread across multiple servers working together. This provides high availability and can handle massive scale, but adds significant complexity. Systems like Google Spanner, Amazon DynamoDB, and CockroachDB use this approach.
2 changes: 1 addition & 1 deletion book/20-concepts/01-models.md
@@ -244,7 +244,7 @@ Most importantly, spreadsheets provide no referential integrity. If cell B2 cont

The **relational data model**, introduced by Edgar F. Codd in 1970, revolutionized data management by organizing data into tables (relations) with well-defined relationships. This model emphasizes data integrity, consistency, and powerful query capabilities through a formal mathematical foundation.

The relational model organizes all data into tables representing mathematical relations, where each table consists of rows (representing mathematical *tuples*) and columns (often called *attributes*). Key principles include data type constraints, uniqueness enforcement through primary keys, referential integrity through foreign keys, and declarative queries. The next chapter explores these principles in depth.
The relational model organizes all data into tables representing mathematical relations, where each table consists of rows (representing mathematical *tuples*) and columns (often called *attributes*). Key principles include data type constraints, uniqueness enforcement through primary keys, referential integrity through foreign keys, and declarative query. The next chapter explores these principles in depth.

The most common way to interact with relational databases is through the Structured Query Language (SQL), a language specifically designed to define, manipulate, and query data within relational databases.

10 changes: 1 addition & 9 deletions book/20-concepts/03-relational-practice.ipynb
@@ -1568,15 +1568,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Path Forward: Databases as Workflows\n",
"\n",
"**DataJoint extends relational theory by viewing the schema as a workflow specification.** It preserves all the benefits of relational databases—mathematical rigor, declarative queries, data integrity—while adding workflow semantics that make the database **workflow-aware**.\n",
"\n",
"**Key Insight**: The database schema structure can be identical whether using SQL or DataJoint, although DataJoint imposes some conventions. What's different is the **conceptual view**: SQL sees static entities and relationships; DataJoint sees an executable workflow, where some steps are manual and others are automatic. This workflow view enables automatic execution, provenance tracking, and computational validity—features essential for scientific computing.\n",
"\n",
"The next chapter introduces DataJoint's Relational Workflow Model in detail, showing how Computed tables turn your schema into an executable pipeline specification.\n"
]
"source": "## The Path Forward: Databases as Workflows\n\n**DataJoint extends relational theory by viewing the schema as a workflow specification.** It preserves all the benefits of relational databases—mathematical rigor, declarative queries, data integrity—while adding workflow semantics that make the database **workflow-aware**.\n\n**Key Insight**: The database schema structure can be identical whether using SQL or DataJoint, although DataJoint imposes some conventions. What's different is the **conceptual view**: SQL sees static entities and relationships; DataJoint sees an executable workflow, where some steps are manual and others are automatic. This workflow view enables automatic execution, provenance tracking, and computational validity—features essential for scientific computing.\n\nThe next chapter explores **Data Integrity**—the fundamental constraints that databases enforce to ensure data remains accurate, consistent, and reliable. Understanding these integrity concepts provides the foundation for DataJoint's Relational Workflow Model, which extends integrity guarantees to include workflow validity and computational consistency."
},
{
"cell_type": "code",