
Conversation

blackk-foxx commented Nov 1, 2025

An alternative to #218, which stores the results of each SELECT statement in a separate table.

.read ./intro-select.sql

-- Compare expected vs actual
.shell diff expected_output.txt user_output.txt
blackk-foxx (Author):

The diff output is certainly not user-friendly. Instead of diff, we could run a custom Python script here that would generate a friendlier diff.

blackk-foxx (Author):

Either way, note that this would work with the existing test runner.

IsaacG commented Nov 1, 2025

I'm not a fan of the multi-file approach.

If we introduce UPDATE, even if we're initially very light on details, we can use the existing framework which works quite well.

blackk-foxx commented Nov 5, 2025

The latest version has the user enter all SELECT statements in a single file and evaluates the results from a single output file. Running `sqlite3 < intro-select_test.sql` generates the results.json shown below. The message and output values are currently JSON, but we could easily convert them to a friendlier textual representation, such as the `.mode columns` format SQLite generates.

{
  "version": 3,
  "status": "fail",
  "tests": [
    {
      "description": "ALL records => SELECT * FROM weather_readings",
      "status": "pass"
    },
    {
      "description": "Just location and temperature columns => SELECT location, temperature FROM weather_readings",
      "status": "pass"
    },
    {
      "description": "Without \"FROM\" => SELECT 'Hello, world.'",
      "status": "fail",
      "message": "Expected: [{\"'Hello, world.'\": \"Hello, world.\"}]",
      "output": "Actual: [{\"say_hi\": \"Hello, world.\"}]"
    },
    {
      "description": "All records from Seatle location => SELECT * FROM weather_readings WHERE location = 'Seattle'",
      "status": "pass"
    },
    {
      "description": "All records where humidity in range => SELECT * FROM weather_readings WHERE humidity BETWEEN 60 AND 70",
      "status": "pass"
    },
    {
      "description": "Just location column => SELECT location FROM weather_readings",
      "status": "pass"
    },
    {
      "description": "Only unique locations => SELECT DISTINCT location FROM weather_readings",
      "status": "pass"
    }
  ]
}

{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},
{"date":"2025-10-22","location":"Boise","temperature":60.4,"humidity":55},
{"date":"2025-10-23","location":"Portland","temperature":54.6,"humidity":70},
{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68},
blackk-foxx (Author), Nov 5, 2025:

For some reason, this comes out as 57.79999999999999 instead of the original 57.8 in the input data (see data.csv). Any ideas from the SQLite gurus on how to make it come out as 57.8?

IsaacG (Member):

Floating point numbers are hard for computers.

SQLite promises to preserve the 15 most significant digits of a floating point value.
ref

Numbers like 57.8 (or 0.8) cannot be represented exactly in the common binary floating-point formats. The original value only looks like 57.8 when it is input or rounded for display. Float comparisons generally go better when you check that the two values are close enough rather than exactly equal.
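As a generic illustration (not taken from this PR), exact equality on a computed float can fail in SQLite where a tolerance check succeeds:

```shell
# 0.1 + 0.7 is not exactly 0.8 in binary floating point, so exact
# comparison fails; an abs-difference tolerance check passes.
sqlite3 :memory: 'SELECT 0.1 + 0.7 = 0.8;'                # prints 0
sqlite3 :memory: 'SELECT abs((0.1 + 0.7) - 0.8) < 1e-9;'  # prints 1
```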

IsaacG (Member):

You can bypass this problem by picking nicer values, e.g. fractions that can be represented as a / 2^b.

sqlite> .mode json
sqlite> SELECT 57.8;
[{"57.8":57.79999999999999716}]
sqlite> SELECT 57.5;
[{"57.5":57.5}]

blackk-foxx changed the title from "Applying tdd on select, using textual output" to "Applying tdd on select-only exercise" on Nov 5, 2025
IsaacG commented Nov 5, 2025

Adding a Python dependency to the test runner makes the image much, much larger. Without Python, apk stats says it's 12MB installed. With python3 that number goes up to 55MB.

I believe much of what you're doing in Python can be done pretty easily using `jq`. You could also use, say, a bash script plus `jq` to simplify the tests, e.g. extract the n-th item from the output JSON and the expected JSON for comparison. `--slurp` can be helpful here. Possibly also `--slurpfile`.

Using the test_data.json above and a .mode json style JSON out file,

# out
[{"date":"2025-10-22","location":"Portland","temperature":53.1,"humidity":72},{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},{"date":"2025-10-22","location":"Boise","temperature":60.4,"humidity":55},{"date":"2025-10-23","location":"Portland","temperature":54.6,"humidity":70},{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68},{"date":"2025-10-23","location":"Boise","temperature":62.0,"humidity":58}]
[{"location":"Portland","temperature":53.1},{"location":"Seattle","temperature":56.2},{"location":"Boise","temperature":60.4},{"location":"Portland","temperature":54.6},{"location":"Seattle","temperature":57.79999999999999},{"location":"Boise","temperature":62.0}]
[{"'Hello world.'":"Hello, world."}]
[{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68}]
[{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},{"date":"2025-10-23","location":"Portland","temperature":54.6,"humidity":70},{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68}]
[{"location":"Portland"},{"location":"Seattle"},{"location":"Boise"},{"location":"Portland"},{"location":"Seattle"},{"location":"Boise"}]
[{"location":"Portland"},{"location":"Seattle"},{"location":"Boise"}]
# test.jq
def range(upto):
  def _range:
    if . < upto then ., ((.+1)|_range) else empty end;
  0 | _range;

[
  range($want[0]|length) as $number
  | $want[0] | .[$number | tonumber].expected as $expected
  | $want[0] | .[$number | tonumber].description as $description
  | $out|.[$number | tonumber] as $got
  | if $got != $expected then {$description, $expected, $got, "status": "fail"} else {$description, "status": "pass"} end
]
» jq -c --slurpfile out out --slurpfile want test_data.json -n -f test.jq 2>&1
[
  {"description": "ALL records => SELECT * FROM weather_readings", "status": "pass"},
  {"description": "Just location and temperature columns => SELECT location, temperature FROM weather_readings", "status": "pass"},
  {
    "description": "Without \"FROM\" => SELECT 'Hello, world.'",
    "expected": [{"'Hello, world.'": "Hello, world."}],
    "got": [{"'Hello world.'": "Hello, world."}],
    "status": "fail"
  },
  {"description": "All records from Seatle location => SELECT * FROM weather_readings WHERE location = 'Seattle'", "status": "pass"},
  {"description": "All records where humidity in range => SELECT * FROM weather_readings WHERE humidity BETWEEN 60 AND 70", "status": "pass"},
  {"description": "Just location column => SELECT location FROM weather_readings", "status": "pass"},
  {"description": "Only unique locations => SELECT DISTINCT location FROM weather_readings", "status": "pass"}
]

My main worry here is with lining up the various outputs vs tests in a list. It's not obvious that the SELECT statements must be in a specific order. If a test fails, mapping the test to the SELECT is tricky.

I think this might be viable, but we'd also need some additional guardrails.

  • A check added to validate that the number of entries in the two files matches. If the solution has more items than expected, do not run. If the solution has n items, fewer than the number of tests, either do not run or run only the first n tests.
  • A test description must be written with the student as the primary audience (in contrast to the descriptions found in many of the problem specs). The test descriptions should also show up as a comment in the stub file. I'm tempted to suggest these should be numbers. That way it's simple to look at the test output and match it to a part of the solution.
-- Do not remove the comments! The comments line up with the test outputs.
-- Write your solution to each task beneath the comment.

-- 1: Select all records.

-- 2: Select the location and temperature.

-- 3: Say Hello!

...
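A minimal sketch of the first guardrail (matching entry counts), assuming `jq` is available; the file names and sample contents here are invented for illustration:

```shell
# Sketch of the count guardrail. "out" holds one JSON array per result
# set, one per line; test_data.json is an array of test definitions.
# Sample data stands in for real solution output and test data.
printf '%s\n' '[{"n":1}]' '[{"n":2}]' > out
echo '[{"description":"one"},{"description":"two"},{"description":"three"}]' > test_data.json

got=$(jq --slurp 'length' out)       # number of result sets produced
want=$(jq 'length' test_data.json)   # number of tests defined
if [ "$got" -ne "$want" ]; then
  echo "expected $want result sets, found $got; not running tests"
fi
```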

blackk-foxx (Author):
@IsaacG, thanks for the jq prototype. I think it looks promising! I will use your code as a starting point, and work on adding the guardrails you mentioned.

> The test descriptions should also show up as a comment in the stub file. I'm tempted to suggest these should be numbers. That way it's simple to look at the test output and match it to a part of the solution.

I agree, and I think the numbers should correspond to individual tasks within the exercise, like the tasks in exercises such as the Role Playing Game in the Elm track (among many others).

- Split exercise into separate tasks
- Capture actual output of each task in a separate table
- Store expected output for each task in a separate table
blackk-foxx (Author):
Here's my latest iteration, which attempts to correspond with the latest ideas discussed in the forum thread. The major TODO at this point is to generate test results in a way that will work effectively. Any ideas from the reviewers on how to do this?

IsaacG commented Nov 18, 2025

  1. What happened to using jq for the testing? If we don't need Python, we shouldn't add it.
  2. Expected test data should be hard coded.
  3. For test failures, you can keep things simple and display both (1) rows which are present but should not be in the output and (2) rows which ought to be present in the results but are missing.
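Point 3 maps naturally onto jq's array subtraction; a sketch with inlined sample rows (not this PR's actual filter):

```shell
# Report rows that are present but unexpected ("extra") and rows that are
# expected but missing, via jq array subtraction on sample data.
jq -c -n '
  [{"location":"Seattle"},{"location":"Boise"}]    as $got
  | [{"location":"Seattle"},{"location":"Portland"}] as $expected
  | {extra: ($got - $expected), missing: ($expected - $got)}'
```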

blackk-foxx (Author):

> 1. What happened to using `jq` for the testing? If we don't need Python, we shouldn't add it.

I agree; the Python file is no longer used -- I just forgot to remove it.

> 2. Expected test data should be hard coded.
> 3. For test failures, you can keep things simple and display both (1) rows which are present but should not be in the output and (2) rows which ought to be present in the results but are missing.

OK, I'll give it a try...

- Use only INTEGER data, for simplicity
- Add shell script to generate report
- Add jq filters to aid in report generation
- Store test data in json file, not in db
blackk-foxx (Author):

@IsaacG: here is my latest iteration. The new version relies on a shell script to generate the report, which invokes two different jq filters to render results.json. Test data is now in test_data.json, not in the db.

It's a little convoluted: intro-select_test.sql stores intermediate results in a db file, then launches a shell script, which runs sqlite on the db file. I'm not too worried about that; what do you think?

One TODO item: improve the readability of the message on failures.

The new version generates the following results.json pasted below.

How does this grab you?

{
  "version": 3,
  "status": "fail",
  "tests": [
    {
      "task_id": 1,
      "description": "All data",
      "status": "pass"
    },
    {
      "description": "Hello, world.",
      "status": "fail",
      "message": "Expected [{\"'Hello, world.'\":\"Hello, world.\"}], but got [{\"'Goodbye,Mars.'\":\"Goodbye,Mars.\"}]"
    },
    {
      "task_id": 4,
      "description": "Humidity within range",
      "status": "pass"
    },
    {
      "task_id": 5,
      "description": "Just locations",
      "status": "pass"
    },
    {
      "task_id": 2,
      "description": "Just location and temperature",
      "status": "fail",
      "message": "Expected [{\"location\":\"Portland\",\"temperature\":53},{\"location\":\"Seattle\",\"temperature\":56},{\"location\":\"Boise\",\"temperature\":60},{\"location\":\"Portland\",\"temperature\":54},{\"location\":\"Seattle\",\"temperature\":57},{\"location\":\"Boise\",\"temperature\":62}], but got [{\"temperature\":53},{\"temperature\":56},{\"temperature\":60},{\"temperature\":54},{\"temperature\":57},{\"temperature\":62}]"
    },
    {
      "task_id": 3,
      "description": "Seattle only",
      "status": "pass"
    },
    {
      "task_id": 5,
      "description": "Only unique locations",
      "status": "pass"
    }
  ]
}


IsaacG commented Dec 1, 2025

I think this has good potential. The message is a bit messy, but overall I think this approach can work.

Why does the Hello World expected/actual have the string repeated? Would it make sense to format the message using CSV?


IsaacG commented Dec 1, 2025

Does this approach hide the fact that the CREATE TABLE expects specific columns that are implicitly being set by the SELECT? Could that create issues? Would students realize there is something magical about the SELECT columns? What if they change the select columns, either to select more/fewer columns or to rename using AS?

blackk-foxx commented Dec 1, 2025

> I think this has good potential. The message is a bit messy, but overall I think this approach can work.

I'm glad you like it. I agree that the message formatting could use some improvement. I will plan to spend some time improving it in this draft PR. After that, I'm thinking about adding an exercise following this approach to #217. How does that sound?

> Why does the Hello World expected/actual have the string repeated? Would it make sense to format the message using CSV?

In general, the tests must ensure that both the column names and the data match expectations. Hello World is a special case; I would not include it in a real exercise since there is already a Hello World exercise.

blackk-foxx (Author):

> Does this approach hide the fact that the CREATE TABLE expects specific columns that are implicitly being set by the SELECT? Could that create issues? Would students realize there is something magical about the SELECT columns? What if they change the select columns, either to select more/fewer columns or to rename using AS?

The exercise instructions would specify the required columns and describe the expected data. Each test would compare both the column names and the data against the expected result.
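Comparing column names alongside data is cheap in jq, since object keys are directly inspectable; a sketch with invented sample rows (not this PR's actual filter):

```shell
# A result matches only when both the column names (object keys) and the
# row values agree with expectations. Here the student renamed a column,
# so both checks come back false.
jq -c -n '
  [{"location":"Seattle","temperature":56.2}] as $expected
  | [{"loc":"Seattle","temperature":56.2}]    as $got
  | {keys_match: (($got[0]|keys) == ($expected[0]|keys)),
     rows_match: ($got == $expected)}'
```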

blackk-foxx (Author):

@IsaacG, any ideas on how to correlate user output with the specific test(s) in which it was generated? One way is to place an .output directive before each exercise stub in the stub file, redirecting the output to a task-specific file. But that would be easy for the user to break. Or maybe don't correlate the user output at all -- just treat it as global and, when we generate the test report, repeat the captured output in all of the failed tests. Other ideas?
