| 1 | +# File Location Debug Guide |
| 2 | + |
| 3 | +## Issue Description |
| 4 | + |
| 5 | +When running the SWE Factory tool, files are being created at the repository root instead of in the specified output directories, despite providing the correct `--output-dir`, `--setup-dir`, and `--results-path` parameters. |
| 6 | + |
| 7 | +## Root Cause Analysis |
| 8 | + |
| 9 | +After analyzing the codebase, I found that the file-writing operations build their paths with `pjoin()` or `os.path.join()` on top of the specified output directories, so the design itself is sound. That construction is only as reliable as its base path, though: if the base directory is relative, the joined path is still resolved against the current working directory at the moment of the write. |
| 10 | + |
| 11 | +That leaves several potential causes for files appearing in the wrong location: |
| 12 | + |
| 13 | +### 1. Working Directory Changes |
| 14 | +The code uses `cd` context managers in several places (for example, in the `dump_cost` function) to change the working directory temporarily. Any file operation that uses a relative path while such a block is active resolves against the temporary directory, so it can write to the wrong location. |
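
A minimal sketch of this failure mode. The `cd` helper below is an illustrative stand-in, not the project's own implementation, and the temporary directories are placeholders for the repository root and a task directory:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def cd(path):
    """Temporarily change the working directory (illustrative stand-in)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def resolve(path):
    """Return the location a path refers to at this moment."""
    return os.path.realpath(path)


with tempfile.TemporaryDirectory() as repo_root, tempfile.TemporaryDirectory() as other:
    with cd(repo_root):
        relative_out = os.path.join("output", "status.json")
        before = resolve(relative_out)      # resolved against repo_root
        with cd(other):
            during = resolve(relative_out)  # same string, different directory
        after = resolve(relative_out)       # back to repo_root

# The same relative path names two different files depending on the cwd.
print(before == after, before == during)
```

Resolving the path to an absolute one before entering any `cd` block removes the dependency on the working directory.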
| 15 | + |
| 16 | +### 2. Race Conditions |
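| 17 | +The system uses multiprocessing. Each process has its own working directory, so a `cd` in one process cannot redirect writes in another. However, a child process inherits whatever working directory its parent has at the moment the child starts, and threads within a single process share one working directory, so a `cd` that is active at the wrong time can still affect where relative paths resolve. |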
| 18 | + |
| 19 | +### 3. Missing Absolute Paths |
| 20 | +Some file operations might not be using absolute paths, causing them to write relative to the current working directory. |
| 21 | + |
| 22 | +## Changes Made |
| 23 | + |
| 24 | +I've made the following improvements to ensure files are written to the correct locations: |
| 25 | + |
| 26 | +### 1. Added Absolute Path Safety Checks |
| 27 | +- Modified `run_raw_task()` to ensure `task_output_dir` is absolute |
| 28 | +- Modified `do_inference()` to ensure `task_output_dir` is absolute |
| 29 | +- Modified `dump_cost()` to ensure `task_output_dir` is absolute |
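
The modified functions are not reproduced in this guide. As a rough sketch of what such a check can look like, assuming a hypothetical helper named `ensure_absolute` (the name and signature are illustrative, not taken from the codebase):

```python
import os


def ensure_absolute(path, base=None):
    """Return `path` as an absolute, normalized path.

    Relative paths are anchored to `base` (default: the working directory
    at call time), so this should run once at startup, before any `cd`.
    """
    if os.path.isabs(path):
        return os.path.normpath(path)
    return os.path.normpath(os.path.join(base or os.getcwd(), path))


# Hypothetical usage with the output directory from the example command.
task_output_dir = ensure_absolute("output/swe-factory-runs/kareldb-test")
print(task_output_dir)
```

Anchoring the path once, early, means later working-directory changes no longer influence where the files land.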
| 30 | + |
| 31 | +### 2. Added Debug Logging |
| 32 | +- Added logging in `AgentsManager` to track where Dockerfile, eval.sh, and status.json are written |
| 33 | +- Added logging in `TestAnalysisAgent` to track where Dockerfile and eval.sh are written |
| 34 | + |
| 35 | +### 3. Created Debug Script |
| 36 | +- Created `debug_file_locations.py` to monitor file creation during execution |
| 37 | + |
| 38 | +## How to Debug the Issue |
| 39 | + |
| 40 | +### Step 1: Run with Debug Logging |
| 41 | +The enhanced logging will now show exactly where files are being written. Look for log messages like: |
| 42 | +``` |
| 43 | +Writing Dockerfile to: /path/to/output/dir/Dockerfile |
| 44 | +Writing eval.sh to: /path/to/output/dir/eval.sh |
| 45 | +Writing status.json to: /path/to/output/dir/status.json |
| 46 | +``` |
| 47 | + |
| 48 | +### Step 2: Use the Debug Script |
| 49 | +Run the debug script in a separate terminal to monitor file creation: |
| 50 | + |
| 51 | +```bash |
| 52 | +# In one terminal, start the debug script |
| 53 | +python debug_file_locations.py output/swe-factory-runs/kareldb-test 600 10 |
| 54 | + |
| 55 | +# In another terminal, run your command |
| 56 | +LITELLM_API_BASE="https://api.dev.halo.engineer/v1/ai" \ |
| 57 | +OPENAI_API_KEY="${OPENAI_API_KEY?->Need a key}" \ |
| 58 | +PYTHONPATH=. python app/main.py local-issue \ |
| 59 | + --task-id "kareldb-connection-1" \ |
| 60 | + --local-repo "/Users/[email protected]/xynova/kareldb-cp" \ |
| 61 | + --issue-file "input/kareldb_test_issue.txt" \ |
| 62 | + --model google/gemini-2.5-flash \ |
| 63 | + --output-dir "output/swe-factory-runs/kareldb-test" \ |
| 64 | + --setup-dir "output/swe-factory-runs/testbed" \ |
| 65 | + --results-path "output/swe-factory-runs/results" \ |
| 66 | + --conv-round-limit 3 \ |
| 67 | + --num-processes 1 \ |
| 68 | + --model-temperature 0.2 |
| 69 | +``` |
| 70 | + |
| 71 | +The debug script will: |
| 72 | +- Monitor file creation every 10 seconds for 10 minutes |
| 73 | +- Log all new files created |
| 74 | +- Warn about files created outside the expected output directory |
| 75 | +- Show the current working directory at each check |
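
The contents of `debug_file_locations.py` are not shown in this guide. The polling approach described above could be sketched roughly as follows; the function names and structure are assumptions for illustration and may differ from the actual script:

```python
import os
import time


def snapshot(root):
    """Return the set of all file paths currently under `root`."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.abspath(os.path.join(dirpath, name)))
    return found


def is_inside(path, directory):
    """True if `path` lies within `directory` (both made absolute first)."""
    path = os.path.abspath(path)
    directory = os.path.abspath(directory)
    return os.path.commonpath([path, directory]) == directory


def monitor(watch_root, expected_dir, duration_s, interval_s):
    """Poll `watch_root` and report new files, flagging stray ones."""
    seen = snapshot(watch_root)
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        time.sleep(interval_s)
        current = snapshot(watch_root)
        for path in sorted(current - seen):
            tag = "ok" if is_inside(path, expected_dir) else "WARNING: outside expected dir"
            print(f"[cwd={os.getcwd()}] new file: {path} ({tag})")
        seen = current


# Hypothetical invocation mirroring the arguments used above:
# monitor(".", "output/swe-factory-runs/kareldb-test", duration_s=600, interval_s=10)
```

Note that the working directory printed here is that of the monitoring process, not of the tool being observed, so it only confirms where the monitor itself was started.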
| 76 | + |
| 77 | +### Step 3: Check the Logs |
| 78 | +Look for: |
| 79 | +1. **Expected behavior**: Files being written to the specified output directory |
| 80 | +2. **Unexpected behavior**: Files being written to the current working directory or repository root |
| 81 | +3. **Working directory changes**: Any unexpected changes in the current working directory |
| 82 | + |
| 83 | +## Expected File Locations |
| 84 | + |
| 85 | +Based on your command, files should be created in: |
| 86 | + |
| 87 | +- **Task output files**: `output/swe-factory-runs/kareldb-test/kareldb-connection-1/` |
| 88 | + - `Dockerfile` |
| 89 | + - `eval.sh` |
| 90 | + - `status.json` |
| 91 | + - `cost.json` |
| 92 | + - `meta.json` |
| 93 | + - `problem_statement.txt` |
| 94 | + - `developer_patch.diff` |
| 95 | + - `info.log` |
| 96 | + - `test_analysis_agent_0/` (subdirectory with test results) |
| 97 | + |
| 98 | +- **Setup directory**: `output/swe-factory-runs/testbed/` |
| 99 | + - Repository clones and working directories |
| 100 | + |
| 101 | +- **Results**: `output/swe-factory-runs/results/results.json` |
| 102 | + - Aggregated results from all tasks |
| 103 | + |
| 104 | +## Troubleshooting |
| 105 | + |
| 106 | +If files are still being created in the wrong location: |
| 107 | + |
| 108 | +1. **Check the debug logs** to see exactly where files are being written |
| 109 | +2. **Verify the output directory exists** and is writable |
| 110 | +3. **Check for any error messages** about directory creation or file writing |
| 111 | +4. **Ensure no other thread or code path** in the same process is changing the working directory while files are written |
| 112 | +5. **Verify the command line arguments** are being parsed correctly |
| 113 | + |
| 114 | +## Additional Recommendations |
| 115 | + |
| 116 | +1. **Use absolute paths** in your command line arguments |
| 117 | +2. **Ensure the output directories exist** before running the command |
| 118 | +3. **Check file permissions** on the output directories |
| 119 | +4. **Monitor system resources** to ensure there are no disk space issues |
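
Recommendations 2 to 4 can be checked before a run with a short pre-flight script. This is a sketch under stated assumptions: `preflight` is a hypothetical helper, and the 1 GiB free-space threshold is an arbitrary placeholder:

```python
import os
import shutil


def preflight(directory, min_free_bytes=1 << 30):
    """Create `directory` if needed and return a list of problems found."""
    problems = []
    directory = os.path.abspath(directory)
    try:
        os.makedirs(directory, exist_ok=True)
    except OSError as exc:
        return [f"cannot create {directory}: {exc}"]
    # Writing into a directory needs both write and search permission.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append(f"{directory} is not writable")
    free = shutil.disk_usage(directory).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free at {directory}")
    return problems
```

Calling `preflight()` on each of the `--output-dir`, `--setup-dir`, and `--results-path` values and stopping if any list is non-empty surfaces permission or disk-space problems before the tool starts writing.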
| 120 | + |
| 121 | +## Code Changes Summary |
| 122 | + |
| 123 | +The following files were modified to improve file location handling: |
| 124 | + |
| 125 | +- `app/main.py`: Added absolute path safety checks |
| 126 | +- `app/agents/agents_manager.py`: Added debug logging for file creation |
| 127 | +- `app/agents/test_analysis_agent/test_analysis_agent.py`: Added debug logging for file creation |
| 128 | +- `debug_file_locations.py`: Created debug script for monitoring file creation |
| 129 | +- `FILE_LOCATION_DEBUG.md`: This documentation file |