
Conversation

blackk-foxx commented Nov 1, 2025

An alternative to #218, which stores the results of each SELECT statement in a separate table.

.read ./intro-select.sql

-- Compare expected vs actual
.shell diff expected_output.txt user_output.txt
blackk-foxx (Author):

The diff output is certainly not user-friendly. Instead of diff, we could run a custom Python script here that would generate a friendlier diff.

blackk-foxx (Author):

Either way, note that this would work with the existing test runner.

IsaacG commented Nov 1, 2025

I'm not a fan of the multi-file approach.

If we introduce UPDATE, even if we're initially very light on details, we can use the existing framework which works quite well.

blackk-foxx commented Nov 5, 2025

The latest version has the user enter all SELECT statements in a single file and evaluates the results from a single output file. Running `sqlite3 < intro-select_test.sql` generates the results.json shown below. The message and output values are currently JSON, but we could easily convert them to a friendlier textual representation, such as the `.mode columns` format SQLite generates.

{
  "version": 3,
  "status": "fail",
  "tests": [
    {
      "description": "ALL records => SELECT * FROM weather_readings",
      "status": "pass"
    },
    {
      "description": "Just location and temperature columns => SELECT location, temperature FROM weather_readings",
      "status": "pass"
    },
    {
      "description": "Without \"FROM\" => SELECT 'Hello, world.'",
      "status": "fail",
      "message": "Expected: [{\"'Hello, world.'\": \"Hello, world.\"}]",
      "output": "Actual: [{\"say_hi\": \"Hello, world.\"}]"
    },
    {
      "description": "All records from Seatle location => SELECT * FROM weather_readings WHERE location = 'Seattle'",
      "status": "pass"
    },
    {
      "description": "All records where humidity in range => SELECT * FROM weather_readings WHERE humidity BETWEEN 60 AND 70",
      "status": "pass"
    },
    {
      "description": "Just location column => SELECT location FROM weather_readings",
      "status": "pass"
    },
    {
      "description": "Only unique locations => SELECT DISTINCT location FROM weather_readings",
      "status": "pass"
    }
  ]
}

{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},
{"date":"2025-10-22","location":"Boise","temperature":60.4,"humidity":55},
{"date":"2025-10-23","location":"Portland","temperature":54.6,"humidity":70},
{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68},
blackk-foxx (Author), Nov 5, 2025:

For some reason, this comes out as 57.79999999999999 instead of the original 57.8 in the input data (see data.csv). Any ideas from the SQLite gurus on how to make it come out as 57.8?

IsaacG (Member):

Floating point numbers are hard for computers.

SQLite promises to preserve the 15 most significant digits of a floating point value.
ref

Numbers like 57.8 (or 0.8) cannot be represented exactly in the common binary floating-point formats. The original value only looks like 57.8 when it is input or rounded for display. Float comparisons generally go better when you check that the two values are close enough rather than exactly equal.
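As a generic illustration (not taken from this PR), exact equality on a computed float can fail in SQLite where a tolerance check succeeds:

```shell
# 0.1 + 0.7 is not exactly 0.8 in binary floating point, so exact
# comparison fails; an abs-difference tolerance check passes.
sqlite3 :memory: 'SELECT 0.1 + 0.7 = 0.8;'                # prints 0
sqlite3 :memory: 'SELECT abs((0.1 + 0.7) - 0.8) < 1e-9;'  # prints 1
```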

IsaacG (Member):

You can bypass this problem by picking nicer values, e.g. fractions that can be represented as a / 2^b.

sqlite> .mode json
sqlite> SELECT 57.8;
[{"57.8":57.79999999999999716}]
sqlite> SELECT 57.5;
[{"57.5":57.5}]

blackk-foxx changed the title from "Applying tdd on select, using textual output" to "Applying tdd on select-only exercise" on Nov 5, 2025
IsaacG commented Nov 5, 2025

Adding a Python dependency to the test runner makes the image much, much larger. Without Python, apk stats says it's 12MB installed. With python3 that number goes up to 55MB.

I believe much of what you're doing in Python can be done pretty easily using `jq`. You could also use, say, a bash script plus `jq` to simplify the tests, e.g. extract the n-th item from the output JSON and the expected JSON for comparison. `--slurp` can be helpful here. Possibly also `--slurpfile`.

Using the test_data.json above and a .mode json style JSON out file,

# out
[{"date":"2025-10-22","location":"Portland","temperature":53.1,"humidity":72},{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},{"date":"2025-10-22","location":"Boise","temperature":60.4,"humidity":55},{"date":"2025-10-23","location":"Portland","temperature":54.6,"humidity":70},{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68},{"date":"2025-10-23","location":"Boise","temperature":62.0,"humidity":58}]
[{"location":"Portland","temperature":53.1},{"location":"Seattle","temperature":56.2},{"location":"Boise","temperature":60.4},{"location":"Portland","temperature":54.6},{"location":"Seattle","temperature":57.79999999999999},{"location":"Boise","temperature":62.0}]
[{"'Hello world.'":"Hello, world."}]
[{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68}]
[{"date":"2025-10-22","location":"Seattle","temperature":56.2,"humidity":66},{"date":"2025-10-23","location":"Portland","temperature":54.6,"humidity":70},{"date":"2025-10-23","location":"Seattle","temperature":57.79999999999999,"humidity":68}]
[{"location":"Portland"},{"location":"Seattle"},{"location":"Boise"},{"location":"Portland"},{"location":"Seattle"},{"location":"Boise"}]
[{"location":"Portland"},{"location":"Seattle"},{"location":"Boise"}]
# test.jq
def range(upto):
  def _range:
    if . < upto then ., ((.+1)|_range) else empty end;
  0 | _range;

[
  range($want[0]|length) as $number
  | $want[0] | .[$number | tonumber].expected as $expected
  | $want[0] | .[$number | tonumber].description as $description
  | $out|.[$number | tonumber] as $got
  | if $got != $expected then {$description, $expected, $got, "status": "fail"} else {$description, "status": "pass"} end
]
» jq -c --slurpfile out out --slurpfile want test_data.json -n -f test.jq 2>&1
[
  {"description": "ALL records => SELECT * FROM weather_readings", "status": "pass"},
  {"description": "Just location and temperature columns => SELECT location, temperature FROM weather_readings", "status": "pass"},
  {
    "description": "Without \"FROM\" => SELECT 'Hello, world.'",
    "expected": [{"'Hello, world.'": "Hello, world."}],
    "got": [{"'Hello world.'": "Hello, world."}],
    "status": "fail"
  },
  {"description": "All records from Seatle location => SELECT * FROM weather_readings WHERE location = 'Seattle'", "status": "pass"},
  {"description": "All records where humidity in range => SELECT * FROM weather_readings WHERE humidity BETWEEN 60 AND 70", "status": "pass"},
  {"description": "Just location column => SELECT location FROM weather_readings", "status": "pass"},
  {"description": "Only unique locations => SELECT DISTINCT location FROM weather_readings", "status": "pass"}
]

My main worry here is with lining up the various outputs vs tests in a list. It's not obvious that the SELECT statements must be in a specific order. If a test fails, mapping the test to the SELECT is tricky.

I think this might be viable, but we'd also need some additional guardrails.

  • A check added to validate that the number of entries in the two files matches. If the solution has more items than expected, do not run. If the solution has n items, fewer than the number of tests, either do not run or run only the first n tests.
  • A test description must be written with the student as the primary audience (in contrast to the descriptions found in many of the problem specs). The test descriptions should also show up as a comment in the stub file. I'm tempted to suggest these should be numbers. That way it's simple to look at the test output and match it to a part of the solution.
-- Do not remove the comments! The comments line up with the test outputs.
-- Write your solution to each task beneath the comment.

-- 1: Select all records.

-- 2: Select the location and temperature.

-- 3: Say Hello!

...
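A minimal sketch of the first guardrail (matching entry counts), assuming `jq` is available; the file names and sample contents here are invented for illustration:

```shell
# Sketch of the count guardrail. "out" holds one JSON array per result
# set, one per line; test_data.json is an array of test definitions.
# Sample data stands in for real solution output and test data.
printf '%s\n' '[{"n":1}]' '[{"n":2}]' > out
echo '[{"description":"one"},{"description":"two"},{"description":"three"}]' > test_data.json

got=$(jq --slurp 'length' out)       # number of result sets produced
want=$(jq 'length' test_data.json)   # number of tests defined
if [ "$got" -ne "$want" ]; then
  echo "expected $want result sets, found $got; not running tests"
fi
```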

blackk-foxx (Author):
@IsaacG, thanks for the jq prototype. I think it looks promising! I will use your code as a starting point, and work on adding the guardrails you mentioned.

> The test descriptions should also show up as a comment in the stub file. I'm tempted to suggest these should be numbers. That way it's simple to look at the test output and match it to a part of the solution.

I agree, and I think the numbers should correspond to individual tasks within the exercise, like the tasks in exercises such as the Role Playing Game in the Elm track (among many others).

- Split exercise into separate tasks
- Capture actual output of each task in a separate table
- Store expected output for each task in a separate table
blackk-foxx (Author):
Here's my latest iteration, which attempts to correspond with the latest ideas discussed in the forum thread. The major TODO at this point is to generate test results in a way that will work effectively. Any ideas from the reviewers on how to do this?

IsaacG commented Nov 18, 2025

  1. What happened to using jq for the testing? If we don't need Python, we shouldn't add it.
  2. Expected test data should be hard coded.
  3. For test failures, you can keep things simple and display both (1) rows which are present but should not be in the output and (2) rows which ought to be present in the results but are missing.
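Point 3 maps naturally onto jq's array subtraction; a sketch with inlined sample rows (not this PR's actual filter):

```shell
# Report rows that are present but unexpected ("extra") and rows that are
# expected but missing, via jq array subtraction on sample data.
jq -c -n '
  [{"location":"Seattle"},{"location":"Boise"}]    as $got
  | [{"location":"Seattle"},{"location":"Portland"}] as $expected
  | {extra: ($got - $expected), missing: ($expected - $got)}'
```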

blackk-foxx (Author):

> 1. What happened to using `jq` for the testing? If we don't need Python, we shouldn't add it.

I agree; the Python file is no longer used -- I just forgot to remove it.

> 2. Expected test data should be hard coded.
> 3. For test failures, you can keep things simple and display both (1) rows which are present but should not be in the output and (2) rows which ought to be present in the results but are missing.

OK, I'll give it a try...

- Use only INTEGER data, for simplicity
- Add shell script to generate report
- Add jq filters to aid in report generation
- Store test data in json file, not in db
blackk-foxx (Author):

@IsaacG: here is my latest iteration. The new version relies on a shell script to generate the report, which invokes two different jq filters to render results.json. Test data is now in test_data.json, not in the db.

It's a little convoluted: intro-select_test.sql stores intermediate results in a db file, then launches a shell script, which runs sqlite on the db file. I'm not too worried about that; what do you think?

One TODO item: improve the readability of the message on failures.

The new version generates the following results.json pasted below.

How does this grab you?

{
  "version": 3,
  "status": "fail",
  "tests": [
    {
      "task_id": 1,
      "description": "All data",
      "status": "pass"
    },
    {
      "description": "Hello, world.",
      "status": "fail",
      "message": "Expected [{\"'Hello, world.'\":\"Hello, world.\"}], but got [{\"'Goodbye,Mars.'\":\"Goodbye,Mars.\"}]"
    },
    {
      "task_id": 4,
      "description": "Humidity within range",
      "status": "pass"
    },
    {
      "task_id": 5,
      "description": "Just locations",
      "status": "pass"
    },
    {
      "task_id": 2,
      "description": "Just location and temperature",
      "status": "fail",
      "message": "Expected [{\"location\":\"Portland\",\"temperature\":53},{\"location\":\"Seattle\",\"temperature\":56},{\"location\":\"Boise\",\"temperature\":60},{\"location\":\"Portland\",\"temperature\":54},{\"location\":\"Seattle\",\"temperature\":57},{\"location\":\"Boise\",\"temperature\":62}], but got [{\"temperature\":53},{\"temperature\":56},{\"temperature\":60},{\"temperature\":54},{\"temperature\":57},{\"temperature\":62}]"
    },
    {
      "task_id": 3,
      "description": "Seattle only",
      "status": "pass"
    },
    {
      "task_id": 5,
      "description": "Only unique locations",
      "status": "pass"
    }
  ]
}


IsaacG commented Dec 1, 2025

I think this has good potential. The message is a bit messy, but overall I think this approach can work.

Why does the Hello World expected/actual have the string repeated? Would it make sense to format the message using CSV?


IsaacG commented Dec 1, 2025

Does this approach hide the fact that the CREATE TABLE expects specific columns that are implicitly being set by the SELECT? Could that create issues? Would students realize there is something magical about the SELECT columns? What if they change the select columns, either to select more/fewer columns or to rename using AS?

blackk-foxx commented Dec 1, 2025

> I think this has good potential. The message is a bit messy, but overall I think this approach can work.

I'm glad you like it. I agree that the message formatting could use some improvement. I will plan to spend some time improving it in this draft PR. After that, I'm thinking about adding an exercise following this approach to #217. How does that sound?

> Why does the Hello World expected/actual have the string repeated? Would it make sense to format the message using CSV?

In general, the tests must ensure that both the column names and the data match expectations. Hello World is a special case; I would not include it in a real exercise since there is already a Hello World exercise.

blackk-foxx (Author):

> Does this approach hide the fact that the CREATE TABLE expects specific columns that are implicitly being set by the SELECT? Could that create issues? Would students realize there is something magical about the SELECT columns? What if they change the select columns, either to select more/fewer columns or to rename using AS?

The exercise instructions would specify the required columns and describe the expected data. Each test would compare both the column names and the data against the expected result.
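Comparing column names alongside data is cheap in jq, since object keys are directly inspectable; a sketch with invented sample rows (not this PR's actual filter):

```shell
# A result matches only when both the column names (object keys) and the
# row values agree with expectations. Here the student renamed a column,
# so both checks come back false.
jq -c -n '
  [{"location":"Seattle","temperature":56.2}] as $expected
  | [{"loc":"Seattle","temperature":56.2}]    as $got
  | {keys_match: (($got[0]|keys) == ($expected[0]|keys)),
     rows_match: ($got == $expected)}'
```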

blackk-foxx (Author):

@IsaacG, any ideas on how to correlate user output with the specific test(s) in which it was generated? One way is to place an .output directive before each exercise stub in the stub file, redirecting the output to a task-specific file. But that would be easy for the user to break. Or maybe don't correlate the user output at all -- just treat it as global and, when we generate the test report, repeat the captured output in all of the failed tests. Other ideas?
