This repository was archived by the owner on Jan 22, 2026. It is now read-only.
Then you can open the file `arctic3d-docs/index.html`, which contains all the necessary documentation.
## Result interpretation

After running ARCTIC-3D, results are stored in the output directory (default: `arctic3d-{uniprot_id}/`). The sections below describe each output file and how to interpret it.

### Output files

| File | Description |
|------|-------------|
| `arctic3d.log` | Log file with execution details and warnings |
| `input_data/` | Directory containing copies of input files |
| `{pdb_id}_updated.cif` | Structure file downloaded from PDBe (mmCIF format) |
| `{pdb_id}-{chain}.pdb` | Cleaned PDB structure used for analysis (renumbered to UniProt numbering) |
| `retrieved_interfaces.out` | All interfaces retrieved from PDBe, listing partner IDs and their residue lists |
| `interface_matrix.txt` | Pairwise dissimilarity values between all interfaces (used for clustering) |
| `clustered_interfaces.out` | Interfaces grouped into clusters (binding surfaces) |
| `clustered_residues.out` | Residues belonging to each cluster |
| `clustered_residues_probs.out` | Residues ranked by probability within each cluster |
| `{pdb_id}-{chain}_cl{N}.pdb` | PDB structure for cluster N with probabilities encoded in B-factor column |
| `sequence_probability.html` | Interactive bar plot of per-residue probabilities |
| `sequence_probability.json` | JSON data for the interactive plot |

### Understanding the clustering

ARCTIC-3D groups similar interfaces into **binding surfaces** (clusters). Two interfaces are considered similar when they overlap spatially on the protein surface. The dissimilarity is measured using the squared sine of the angle between interface vectors in a Hilbert space representation: values close to 0 indicate overlapping interfaces, while values close to 1 indicate completely distinct regions.

The `interface_matrix.txt` file contains the pairwise dissimilarity values in the format:

```
interface1 interface2 dissimilarity_value
```
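
As a minimal sketch of how this file could be read, the snippet below parses lines in the three-column format shown above into a symmetric lookup table. It assumes whitespace-separated columns; the function name `parse_interface_matrix`, the interface names, and the values are hypothetical and only serve as illustration.

```python
from collections import defaultdict


def parse_interface_matrix(text):
    """Parse 'interface1 interface2 dissimilarity_value' lines into a symmetric lookup."""
    dissimilarity = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            value = float(fields[2])
        except ValueError:
            continue  # skip lines whose third column is not a number
        first, second = fields[0], fields[1]
        dissimilarity[first][second] = value
        dissimilarity[second][first] = value
    return dissimilarity


# Hypothetical interface names and values, for illustration only.
example = """\
P11111-1abc-A P22222-2xyz-B 0.12
P11111-1abc-A P33333-3def-C 0.97
P22222-2xyz-B P33333-3def-C 0.95
"""

matrix = parse_interface_matrix(example)
# Interface most similar to the first one (lowest dissimilarity).
closest = min(matrix["P11111-1abc-A"].items(), key=lambda item: item[1])
print(closest)
```

With a real run, the text would come from reading `interface_matrix.txt` in the output directory instead of the inline example string.
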

### Interpreting residue probabilities

The **probability** (or "contact probability score") represents **the fraction of interfaces within a cluster where a residue is observed**. It is calculated independently for each cluster:

```
probability = (number of interfaces containing the residue) / (total interfaces in cluster)
```
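
The formula can be illustrated with a short sketch. It is not ARCTIC-3D's own code: the function name `residue_probabilities` and the two example clusters are hypothetical, and each interface is assumed to be a plain list of residue numbers. Because every cluster is processed on its own, the same residue can receive a score in more than one cluster.

```python
from collections import Counter


def residue_probabilities(cluster_interfaces):
    """Map each residue to the fraction of the cluster's interfaces that contain it."""
    total = len(cluster_interfaces)
    counts = Counter()
    for residues in cluster_interfaces:
        counts.update(set(residues))  # count a residue once per interface
    return {resid: n / total for resid, n in counts.items()}


# Hypothetical clusters: each interface is a list of residue numbers.
cluster_1 = [[42, 45, 46], [42, 45], [42, 50], [42, 45, 50]]
cluster_2 = [[45, 80], [80, 81]]

probs_1 = residue_probabilities(cluster_1)
probs_2 = residue_probabilities(cluster_2)

# Residue 42 is in all four interfaces of cluster 1; residue 45 is in three
# of them and also in one of the two interfaces of cluster 2.
print(probs_1[42], probs_1[45], probs_2[45])
```
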
126
+
127
+
For each cluster, residues are assigned a probability value between 0 and 1:
128
+
129
+
-**Probability = 1.0**: The residue appears in every interface within the cluster (a "hotspot" residue)
130
+
-**Probability = 0.5**: The residue appears in half of the cluster's interfaces
131
+
-**Probability close to 0**: The residue rarely appears at this binding surface
132
+
133
+
**Important**: Probabilities do NOT sum to 1.0 across clusters for a given residue. A residue can have high probability in multiple clusters if it participates in different binding surfaces. For example, a residue with probability 0.8 in cluster 1 and 0.6 in cluster 2 means it appears in 80% of cluster 1's interfaces and 60% of cluster 2's interfaces.
134
+
135
+
The `clustered_residues_probs.out` file lists residues ranked by probability:
136
+
```
Cluster 1 : 15 residues
rank resid resname probability
1 42 ALA 1.000
2 45 GLU 0.875
...
```
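
A file in this layout could be read back with a sketch like the one below. It assumes the layout matches the excerpt above exactly (a `Cluster N` header line followed by four-column rows); the function name `parse_clustered_probs` and the 0.9 cutoff used in the usage lines are hypothetical choices, not ARCTIC-3D settings.

```python
import re


def parse_clustered_probs(text):
    """Return {cluster_id: [(resid, resname, probability), ...]} in file order."""
    clusters = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"Cluster\s+(\d+)", line.strip())
        if header:
            current = int(header.group(1))
            clusters[current] = []
            continue
        fields = line.split()
        # Data rows have four columns and start with a numeric rank;
        # the column-title row and any other lines are skipped.
        if current is not None and len(fields) == 4 and fields[0].isdigit():
            _, resid, resname, prob = fields
            clusters[current].append((int(resid), resname, float(prob)))
    return clusters


example = """\
Cluster 1 : 15 residues
rank resid resname probability
1 42 ALA 1.000
2 45 GLU 0.875
"""

clusters = parse_clustered_probs(example)
# Residues at or above an arbitrary, illustrative cutoff of 0.9.
hotspots = [resid for resid, _, prob in clusters[1] if prob >= 0.9]
print(hotspots)
```

Only the two residues listed in the excerpt are parsed here, even though the header announces 15; a complete file would contain the remaining rows.
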

### Visualizing probabilities in PDB files

The output PDB files (`{pdb_id}-{chain}_cl{N}.pdb`) encode probabilities in the B-factor column:

- Cluster residues: `B = 50 × (1 + probability)`, ranging from 50 (probability=0) to 100 (probability=1)
- Non-cluster residues: `B = 0`

This allows visualization in molecular viewers (PyMOL, ChimeraX, etc.) using a color spectrum where high B-factors (red) indicate hotspot residues and low values (blue) indicate residues not involved in that binding surface.
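
The encoding can also be inverted to recover probabilities from a cluster PDB file. The sketch below is an illustration under stated assumptions: it relies on the standard fixed-column PDB `ATOM` layout (residue number in columns 23-26, B-factor in columns 61-66), and the helper names and the three example atoms are hypothetical.

```python
def probability_to_bfactor(probability):
    """Encode a cluster residue's probability as B = 50 * (1 + probability)."""
    return 50.0 * (1.0 + probability)


def bfactor_to_probability(bfactor):
    """Invert the encoding; values below 50 (B = 0) mark non-cluster residues."""
    if bfactor < 50.0:
        return None
    return bfactor / 50.0 - 1.0


def residue_probabilities_from_pdb(pdb_text):
    """Read per-residue probabilities from ATOM records of a cluster PDB file."""
    probs = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        resid = int(line[22:26])      # residue sequence number, columns 23-26
        bfactor = float(line[60:66])  # temperature factor, columns 61-66
        prob = bfactor_to_probability(bfactor)
        if prob is not None:
            probs[resid] = prob
    return probs


def make_atom_line(resid, resname, bfactor):
    """Build one fixed-column CA ATOM record (hypothetical helper for the example)."""
    return (
        f"ATOM  {1:>5}  CA  {resname:>3} A{resid:>4}    "
        f"{0.0:>8.3f}{0.0:>8.3f}{0.0:>8.3f}{1.0:>6.2f}{bfactor:>6.2f}"
    )


# Hypothetical residues: two in the cluster, one outside it (B = 0).
pdb_text = "\n".join([
    make_atom_line(10, "GLY", 0.0),
    make_atom_line(42, "ALA", probability_to_bfactor(1.0)),
    make_atom_line(45, "GLU", probability_to_bfactor(0.875)),
])

probs = residue_probabilities_from_pdb(pdb_text)
print(probs)
```

Since the B-factor field holds two decimal places, probabilities recovered this way are only as precise as that rounding allows.
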

### Interpreting the dendrogram

The dendrogram (`dendrogram_average.png`) shows the hierarchical relationship between all retrieved interfaces. The x-axis represents the dissimilarity between interfaces or groups. Interfaces that merge below the threshold (default: 0.866, corresponding to a 60° angle) form a single binding surface. The threshold can be adjusted with `--threshold` to obtain finer or coarser clustering.

### Using the interactive plot

Open `sequence_probability.html` in a web browser to explore per-residue binding probabilities. Each cluster is shown as a separate colored bar series, allowing you to identify which residues are involved in which binding surface and compare hotspots across different clusters.

## Citing us

If you used ARCTIC-3D in your work, please cite the following publication: