Added test environment functionality with test suite format proposition #312
base: master
Conversation
Pull request overview
This PR adds Python-based test infrastructure to replace the existing bash-based system, introducing REPL (Read-Eval-Print Loop) testing capabilities. The changes enable interactive testing of the REPL environment alongside traditional functional tests.
Changes:
- New Python test runner framework in the `runner/` directory with modular architecture
- REPL test support with interactive session management and output validation
- Example REPL test demonstrating non-fatal error handling
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| test/suite | New Python test suite configuration that defines test execution order and flags |
| test/repl/non_fatal_error0001.fram | Example REPL test file demonstrating error recovery and subsequent execution |
| runner/runner.py | Main test runner coordinating test execution and result aggregation |
| runner/repl.py | REPL test execution logic with interactive process management |
| runner/functional.py | Functional test execution for whole-program testing |
| runner/common.py | Shared utilities for parsing test files and validating expectations |
| .gitignore | Added Python bytecode cache directory exclusion |
| print(f"[TIMEOUT] Test '{test['code']}': {e}") | ||
| return False | ||
| except Exception as e: | ||
| print(f"[ERROR] Test '{test['code']}': {e}") |
Copilot AI · Jan 28, 2026
Broad `except Exception` silently swallows all errors. The exception handler on lines 48-50 catches every exception, so unrelated failures (I/O errors, bugs in the runner itself) are reported exactly like genuine test failures, which makes debugging difficult. Consider catching specific exceptions, or at least logging the exception type along with the message.
| print(f"[ERROR] Test '{test['code']}': {e}") | |
| print(f"[ERROR] Test '{test['code']}': {type(e).__name__}: {e}") |
```python
    for test in tests:
        expectations["stdout"].extend(test["expect"]["stdout"])
        expectations["stderr"].extend(test["expect"]["stderr"])
        expectations["stdout-empty"] = expectations["stdout-empty"] or test["expect"]["stdout-empty"]
        expectations["stderr-empty"] = expectations["stderr-empty"] or test["expect"]["stderr-empty"]
```
Copilot AI · Jan 28, 2026
The OR logic for combining empty expectations is incorrect. On lines 94-95, the code uses OR to combine stdout-empty and stderr-empty flags. This means if any single expectation in the file specifies empty output, it will override all other expectations. This is likely a bug - either these flags should use AND logic (all expectations must agree), or the parsing should reject files that mix empty and non-empty expectations for the same stream.
Suggested change:

```diff
-    for test in tests:
-        expectations["stdout"].extend(test["expect"]["stdout"])
-        expectations["stderr"].extend(test["expect"]["stderr"])
-        expectations["stdout-empty"] = expectations["stdout-empty"] or test["expect"]["stdout-empty"]
-        expectations["stderr-empty"] = expectations["stderr-empty"] or test["expect"]["stderr-empty"]
+    # Track aggregate emptiness: start assuming streams can be empty,
+    # then clear the flags if any test expects non-empty output or
+    # does not mark the stream as empty.
+    stdout_empty_all = True
+    stderr_empty_all = True
+    for test in tests:
+        # Collect all expected substrings
+        expectations["stdout"].extend(test["expect"]["stdout"])
+        expectations["stderr"].extend(test["expect"]["stderr"])
+        # Combine empty flags with AND semantics
+        stdout_empty_all = stdout_empty_all and test["expect"]["stdout-empty"]
+        stderr_empty_all = stderr_empty_all and test["expect"]["stderr-empty"]
+        # If any explicit non-empty expectation exists for a stream,
+        # that stream cannot be globally empty.
+        if test["expect"]["stdout"]:
+            stdout_empty_all = False
+        if test["expect"]["stderr"]:
+            stderr_empty_all = False
+    # Finalize empty expectations only if all tests agree and there are
+    # no conflicting non-empty expectations.
+    if tests:
+        expectations["stdout-empty"] = stdout_empty_all
+        expectations["stderr-empty"] = stderr_empty_all
```
```python
def read_until_prompt(proc, timeout=TIMEOUT):
    """Reads output until prompt is found or timeout occurs."""
    output = ""
    start_time = time.time()

    while time.time() - start_time < timeout:
        try:
            c = proc.stdout.read(1)
            if not c:  # EOF
                break
            output += c
            if output.endswith(PROMPT):
                return output
```
Copilot AI · Jan 28, 2026
Reading output character-by-character is inefficient and may cause performance issues. The function reads one character at a time in a tight loop, which is very slow for larger outputs. Consider reading larger chunks (e.g., using readline() or read() with a buffer size) and then searching for the prompt in the accumulated output.
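A minimal sketch of one chunked alternative, assuming a POSIX platform, that the REPL process is spawned with binary (non-text) pipes so `os.read` sees all of the data, and that `PROMPT` and `TIMEOUT` are the constants already defined in the PR:

```python
import os
import select
import time

def read_until_prompt(proc, timeout=TIMEOUT):
    """Read available output in chunks until PROMPT appears or the timeout elapses."""
    output = ""
    fd = proc.stdout.fileno()
    deadline = time.time() + timeout
    while time.time() < deadline:
        # Block only until data is available or the remaining time runs out.
        ready, _, _ = select.select([fd], [], [], max(0.0, deadline - time.time()))
        if not ready:
            break  # timed out waiting for more output
        chunk = os.read(fd, 4096)
        if not chunk:  # EOF
            break
        output += chunk.decode(errors="replace")
        if output.endswith(PROMPT):
            return output
    return output
```

On timeout or EOF this returns whatever was collected, so the caller can still report the partial output.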
```python
    except Exception:
        proc.kill()
```
Copilot AI · Jan 28, 2026
Another broad `except Exception` that silently handles all errors. The handler on lines 58-59, run during process termination, catches every exception without logging or distinguishing error types, which could hide important cleanup failures.
Suggested change:

```diff
-    except Exception:
-        proc.kill()
+    except subprocess.TimeoutExpired:
+        # If the process does not exit in time, force kill it.
+        proc.kill()
+    except Exception as e:
+        # Log unexpected termination errors, then ensure the process is killed.
+        print(f"[WARN] Failed to terminate process cleanly: {e}")
+        proc.kill()
```
```python
            output += c
            if output.endswith(PROMPT):
                return output
        except Exception:
```
Copilot AI · Jan 28, 2026
Broad `except Exception` catches all exceptions and may hide important errors. The catch-all handler on line 20 suppresses any failure in the read loop, including genuine I/O problems, and makes them indistinguishable from a normal timeout. Consider catching more specific exceptions (e.g., `OSError`) or at least re-raising ones that should not be swallowed.
Suggested change:

```diff
-        except Exception:
+        except OSError:
```
```python
    cmd = f"{binary} {flags} {file}"
    expectations = parse_expectations(file)

    try:
        start_time = time.time()
        proc = subprocess.run(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            timeout=timeout,
```
Copilot AI · Jan 28, 2026
Using 'shell=True' in subprocess.run is a security risk. Passing shell=True allows shell injection attacks if any of the parameters (binary, flags, or file) contain user-controlled input. Consider using a list of arguments instead of shell=True, for example: subprocess.run([binary] + shlex.split(flags) + [file], ...).
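A minimal sketch of the argument-list form, assuming `binary`, `flags`, `file`, and `timeout` are the same values already in scope in the runner:

```python
import shlex
import subprocess
import time

# Build an argument list instead of a shell string; shlex.split handles
# multi-word flag strings, and since no shell is involved, nothing in the
# paths can be interpreted as shell syntax.
args = [binary] + shlex.split(flags) + [file]
start_time = time.time()
proc = subprocess.run(
    args,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
    timeout=timeout,
)
```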
```python
with_flags("", simple_examples)

simple_run_tests("test/stdlib/*.fram",flags ='')
```
Copilot AI · Jan 28, 2026
Inconsistent spacing around the '=' operator. Line 1 has 'flags = ' with spaces, while line 14 has 'flags =' with space before but not after. This should be consistent throughout the file, using spaces on both sides of the equals sign.
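For example, PEP 8 keyword-argument style (no spaces around `=`) applied consistently to the visible call sites would look like:

```python
with_flags("", simple_examples)
simple_run_tests("test/stdlib/*.fram", flags='')
```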
```python
    try:
        tests = parse_tests(file)
        read_until_prompt(proc, timeout=timeout)  # consume initial prompt

        for test in tests:
            if not _run_single_test(proc, test, timeout):
                return False

        return True
```
Copilot AI · Jan 28, 2026
Empty test files will pass silently. If 'parse_tests' returns an empty list (because the file has no code lines), the for loop on line 78 will not execute, and the function will return True, making the test pass. This could hide misconfigured or incorrectly formatted test files. Consider adding a check to ensure at least one test was found, or documenting that empty test files are valid.
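A minimal sketch of such a guard, assuming the surrounding function signals failure by returning `False` as the rest of the PR's runner does:

```python
tests = parse_tests(file)
if not tests:
    # An empty or unparseable test file is almost certainly a mistake;
    # fail loudly instead of passing vacuously.
    print(f"[ERROR] No tests found in '{file}'")
    return False
```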
test/suite (Outdated)
| print("\n--- REPL TESTS ---\n") | ||
| repl_tests("test/repl/*.fram", flags='') | ||
| print("\n--- REPL TESTS DONE ---\n") | ||
| repl_tests("test/ok/*.fram", flags='') | ||
| print("\n--- REPL TESTS DONE ---\n") |
Copilot AI · Jan 28, 2026
Debug print statements are left in the production code. These print statements outputting "--- REPL TESTS ---" and "--- REPL TESTS DONE ---" should either be removed or made optional through a verbose flag, as they clutter the test output and are not consistent with the existing test infrastructure.
| print("\n--- REPL TESTS ---\n") | |
| repl_tests("test/repl/*.fram", flags='') | |
| print("\n--- REPL TESTS DONE ---\n") | |
| repl_tests("test/ok/*.fram", flags='') | |
| print("\n--- REPL TESTS DONE ---\n") | |
| repl_tests("test/repl/*.fram", flags='') | |
| repl_tests("test/ok/*.fram", flags='') |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Adds ability to test REPL interactions and refactors test runner infrastructure.