Text Reflowing/Justification #220

oldaccountdeadname · 2021-04-17T19:26:59Z

This fork introduces 'justification' functionality, where a line may be reformatted from

very long paragraph with no linebreaks, exceeding the eighty character line limit of most everywhere

to

very long paragraph with no linebreaks, exceeding the eighty character line
limit of most everywhere

The command will justify around comments, for instance

# very long paragraph with no linebreaks, exceeding the eighty character line limit of most everywhere

to

# very long paragraph with no linebreaks, exceeding the eighty character line
# limit of most everywhere

Line length is determined by the line_length_guide property in the preferences, or a hard-coded value of 80 if the guide is not present. Characters that count as a comment (referred to as a prefix in the justify_text function) are determined by a hard-coded regex, as the line_comment_prefix option is limited to only one style (and thus would break on, for instance, rustdoc comments).
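For readers unfamiliar with greedy reflowing, here is a minimal sketch of the idea described above. It is not the code from this PR: the function name `reflow`, its signature, and the assumption that a comment marker is always followed by a single space are illustrative only, and paragraph breaks are not preserved.

```rust
/// Greedily reflow `text` so that no output line exceeds `max_len` characters.
/// `prefix` is a comment marker such as "#" (or "" for plain text); it is
/// stripped from every input line and written back on every output line.
/// A single word longer than the limit is left unbroken on its own line.
fn reflow(text: &str, max_len: usize, prefix: &str) -> String {
    // Text that starts every output line: the marker plus one space, or nothing.
    let lead = if prefix.is_empty() {
        String::new()
    } else {
        format!("{} ", prefix)
    };

    // Drop indentation and the marker, then treat all whitespace alike.
    let words = text
        .lines()
        .map(|line| line.trim_start())
        .flat_map(|line| line.strip_prefix(prefix).unwrap_or(line).split_whitespace());

    let mut lines: Vec<String> = Vec::new();
    let mut current = lead.clone();
    for word in words {
        let is_empty = current.len() == lead.len();
        // The +1 accounts for the space that would separate the new word.
        if !is_empty && current.chars().count() + 1 + word.chars().count() > max_len {
            lines.push(std::mem::replace(&mut current, lead.clone()));
        }
        if current.len() > lead.len() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if current.len() > lead.len() {
        lines.push(current);
    }
    lines.join("\n")
}
```

Called with the example paragraph, a limit of 80, and either `""` or `"#"` as the prefix, this breaks the line after "character line" in both cases, matching the before/after samples above.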

This addresses #219.

christoph-heiss

I just reviewed it to help @jmacdonald a bit.
I made some small comments on things that can be improved, nothing major though.

One overall comment as well: it would be nice if you could squash and regroup the commits into cleaner units. Maybe like this:

First one adding the comments in src/view/mod.rs and src/models/application/mod.rs.
Next one splitting out range_from() from delete() in src/commands/selection.rs.
After that, one adding the justify command itself + test.
Finally, one for adding the command to the default keymap.

But I can't quite figure out how exactly this is supposed to work. More tests are a must, covering all the ways this command can be used.
For example, if I select the full line, it works as it is supposed to (other than inserting extra newlines at the end, which shouldn't happen). But if I select nothing (or just a few words), it completely messes up. My suggestion would be to either:

Only allow justify() to be called in select_line mode or
Always use the full line range in every allowed mode.

I made a small GIF so you can see what I mean.

christoph-heiss · 2021-05-09T11:05:03Z

src/view/mod.rs

-    pub fn new(preferences: Rc<RefCell<Preferences>>, event_channel: Sender<Event>) -> Result<View> {
-        let terminal = build_terminal().chain_err(|| "Failed to initialize terminal")?;
+    /// Return a new view. This will initialize the terminal, load the theme,
+    /// and begin to listen for events.
+    pub fn new(
+        preferences: Rc<RefCell<Preferences>>,
+        event_channel: Sender<Event>
+    ) -> Result<View> {
+        let terminal = build_terminal()
+                       .chain_err(|| "Failed to initialize terminal")?;


Again, completely unrelated change?
Not needed IMHO, since this line just about tickles the 100-character barrier, which is a reasonable default by today's standards.

The comment again is nice and can stay, but the reformatting is somewhat unneeded, especially since there are more occurrences of this sort in this file and elsewhere.

Yeah sorry, I was playing with the codebase a bit trying to understand it, and I think I accidentally committed this instead of starting from the original codebase when I tried to implement line wrapping.

christoph-heiss · 2021-05-09T11:10:20Z

src/models/application/mod.rs

-        let preferences = initialize_preferences();
+        let preferences =
+            Rc::new(RefCell::new(
+                Preferences::load().unwrap_or_else(|_| Preferences::new(None)),
+            ));


I think this should stay as-is. This does not provide any benefits and not having all things crammed into one function is very good for readability and maintainability.

The comment is all well and can stay of course, such things are always nice to have.

christoph-heiss · 2021-05-09T11:57:16Z

src/input/key_map/default.yml

@@ -226,6 +226,7 @@ select:
  ctrl-a: selection::select_all
  ctrl-z: application::suspend
  ctrl-c: application::exit
+  a: selection::justify


This probably should be added to the select_line mode as well, would be useful to have there too.

christoph-heiss · 2021-05-09T11:59:54Z

src/commands/selection.rs

+/// around it.
+fn justify_string(text: &String, max_len: usize, potential_prefix: Regex) -> String {
+    let mut justified = String::new();
+    for paragraph in text.split("\n\n") {


Why split only on double newlines? A comment explaining how this works in general would be nice.

christoph-heiss · 2021-05-09T13:14:43Z

src/commands/selection.rs

+            justified.push(' ');
+        }
+
+        justified += "\n\n"; // add the paragraph delim


Why add some additional newlines here? When justifying a paragraph I wouldn't expect it to suddenly add superfluous lines after the paragraph.

The whitespace collapsing interprets paragraph breaks (\n\n) the same as line breaks and word breaks, so paragraph breaks would be removed without it. However, I can see how that would be very sub-optimal on single paragraphs. I suppose the solution is to add the paragraph break only if the current paragraph isn't the final one. I'll amend this accordingly and add relevant tests.
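The fix described here can be sketched as follows, assuming paragraphs are separated by exactly one blank line. The name `collapse_paragraphs` is hypothetical and this is not the PR's code. Joining the processed paragraphs with the separator, rather than appending it after each one, avoids the trailing newlines.

```rust
/// Collapse all whitespace inside each paragraph to single spaces, but keep
/// the blank-line separator between paragraphs. Because the separator is only
/// placed *between* paragraphs, nothing is appended after the last one.
fn collapse_paragraphs(text: &str) -> String {
    text.split("\n\n")
        .map(|paragraph| paragraph.split_whitespace().collect::<Vec<_>>().join(" "))
        // Skip empty chunks produced by trailing or repeated blank lines.
        .filter(|paragraph| !paragraph.is_empty())
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

With this shape, a single selected paragraph comes back with no extra blank lines, while a multi-paragraph selection keeps its breaks.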

Thank you for the thorough review, and sorry my code is such a mess!

Ah, I understand! Yeah, that makes sense!
Don't worry, that's what review is for. Thanks for doing the work, it's something very useful to have! 👍

jmacdonald · 2021-08-08T17:04:09Z

src/commands/selection.rs

+    if app.workspace.current_buffer().is_none() {
+        bail!(BUFFER_MISSING);
+    }


Now that the buffer declaration is only used in a single branch of the match statement, let's remove this guard clause and inline the existence check; see below for the suggestion on that.

Suggested change

-    if app.workspace.current_buffer().is_none() {
-        bail!(BUFFER_MISSING);
-    }

jmacdonald · 2021-08-08T17:04:33Z

src/commands/selection.rs

+    match app.mode {
+        Mode::Select(_) | Mode::SelectLine(_) | Mode::Search(_) => {
+            let delete_range = range_from(app)?;
+            let buffer = app.workspace.current_buffer().unwrap();


Suggested change

-            let buffer = app.workspace.current_buffer().unwrap();
+            let buffer = app.workspace.current_buffer().ok_or(BUFFER_MISSING)?;
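To show why the suggestion makes the guard clause unnecessary, here is a small self-contained sketch. `Workspace` and `Buffer` are hypothetical stand-ins for Amp's types, the message in `BUFFER_MISSING` is a placeholder, and `Box<dyn Error>` replaces the project's own error type.

```rust
use std::error::Error;

// Placeholder message; the real constant lives in Amp's codebase.
const BUFFER_MISSING: &str = "No buffer available";

// Hypothetical stand-ins for Amp's types, for illustration only.
struct Buffer {
    text: String,
}

struct Workspace {
    buffer: Option<Buffer>,
}

impl Workspace {
    fn current_buffer(&mut self) -> Option<&mut Buffer> {
        self.buffer.as_mut()
    }
}

/// `ok_or` turns `None` into an `Err`, and `?` returns it to the caller,
/// so neither a separate guard clause nor `unwrap()` is needed.
fn buffer_len(workspace: &mut Workspace) -> Result<usize, Box<dyn Error>> {
    let buffer = workspace.current_buffer().ok_or(BUFFER_MISSING)?;
    Ok(buffer.text.len())
}
```

The existence check and the binding happen in one expression, so there is no window in which the buffer is assumed to exist without being checked.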

jmacdonald · 2021-08-08T17:19:57Z

src/commands/selection.rs

+        buffer.insert(
+            &justify_string(
+                &text,
+                app.preferences.borrow().line_length_guide().unwrap_or(80),


We already specify a default value in the preferences file itself. If someone has explicitly disabled the line length guide, let's raise an error here:

Suggested change

-                app.preferences.borrow().line_length_guide().unwrap_or(80),
+                app.preferences.borrow().line_length_guide().ok_or("Cannot justify without line length guide")
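The difference between the two behaviours can be sketched like this. `Preferences` is a hypothetical stand-in (only the `Option<usize>` return type of `line_length_guide()` is taken from the diff), and `Box<dyn Error>` is used instead of the project's error type. Note that in this sketch the `Result` has to be unwrapped with `?` before the value can be used as a `usize`.

```rust
use std::error::Error;

// Hypothetical stand-in for Amp's preferences type.
struct Preferences {
    line_length_guide: Option<usize>,
}

impl Preferences {
    fn line_length_guide(&self) -> Option<usize> {
        self.line_length_guide
    }
}

/// Silent fallback: a disabled guide quietly becomes 80.
fn limit_with_default(prefs: &Preferences) -> usize {
    prefs.line_length_guide().unwrap_or(80)
}

/// Explicit failure: a disabled guide is reported to the caller.
fn limit_or_error(prefs: &Preferences) -> Result<usize, Box<dyn Error>> {
    let limit = prefs
        .line_length_guide()
        .ok_or("Cannot justify without line length guide")?;
    Ok(limit)
}
```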

jmacdonald · 2021-08-08T17:22:27Z

src/commands/selection.rs

+/// Wrap a string at a given maximum length (generally 80 characters). If the
+/// line begins with a comment (matches potential_prefix), the text is wrapped
+/// around it.
+fn justify_string(text: &str, max_len: usize, potential_prefix: &str) -> String {


I think it makes more sense to move the default value handling for a prefix into this function, and the Option type makes the "potential" aspect of this argument self-describing.

Suggested change

-fn justify_string(text: &str, max_len: usize, potential_prefix: &str) -> String {
+fn justify_string(text: &str, max_len: usize, prefix: Option<&str>) -> String {
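As a rough illustration of handling the default inside the function, the sketch below uses a hypothetical helper, `prefix_lines`, that is not part of the PR. Callers pass `None` instead of a sentinel such as an empty string, and the function decides what "no prefix" means.

```rust
/// Prepend `prefix` (when present) to every line of `text`.
fn prefix_lines(text: &str, prefix: Option<&str>) -> String {
    // The default lives here, so callers never need to know about it.
    let prefix = prefix.unwrap_or("");
    text.lines()
        .map(|line| format!("{}{}", prefix, line))
        .collect::<Vec<_>>()
        .join("\n")
}
```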

jmacdonald

I'm in agreement with @christoph-heiss's feedback, and I've added some of my own as well. That said, this is a feature I didn't know I wanted until this PR; I'd love to get this merged if you have the bandwidth to revisit these suggestions! 😁

oldaccountdeadname · 2021-08-09T19:08:24Z

> I'm in agreement with @christoph-heiss's feedback, and I've added some of my own as well. That said, this is a feature I didn't know I wanted until this PR; I'd love to get this merged if you have the bandwidth to revisit these suggestions! 😁

Yes - thanks so much for the feedback! I'm looking back at the code I wrote a few months ago, and just started to rewrite it because, as I'm sure you've noticed, it's pretty bad. I've got a justify function that works properly on strings, but I'm running into some issues when I try to get the contents of a selection.

I wrote a sel_to_range function that takes an &mut Application and attempts to get the selection Range, bail!ing if something goes wrong. However, buffer.read() returns None, implying that I'm getting an improper range. It looks like this:

fn sel_to_range(app: &mut Application) -> std::result::Result<Range, Error> {
    let buf = match app.workspace.current_buffer() {
        Some(b) => b,
        None => bail!(BUFFER_MISSING),
    };

    match app.mode {
        Mode::Select(ref sel) => Ok(Range::new(buf.cursor.position, sel.anchor)),
        Mode::SelectLine(ref sel) => Ok(sel.to_range(&*buf.cursor)),
        _ => bail!("Can't access a selection outside of select mode."),
    }
}

Then, when I call that with the below, there's a panic due to read(&sel) returning None.

let sel = sel_to_range(app)?;
let buf = app.workspace.current_buffer().unwrap(); // above call guarantees existence of buffer
let txt = buf.read(&sel).unwrap();

If it's not too much trouble, do you think you'd be able to tell me what's going on here? I'd assume that sel_to_range is returning bad ranges, but I'm not quite sure how it's doing that.

In addition, I'm not quite sure how this should get at prefixes. There are two types of prefixes I'm identifying:

'stable' prefixes: these stay the same on every line, i.e., #, //, and ///.
'variable' prefixes: these are not necessarily the same throughout a paragraph, i.e., /* */ comments, where each line after the first begins with a * instead of a /* .

I believe that supporting just the first would require some modifications to the configuration API, as the line_comment_prefix only allows for one token, and would therefore disallow multiple tokens for a file, such as rustdoc comments.

Supporting the second would be tricky too, as the pattern parameter would have to have state. In a vacuum, I'd go about that by defining a trait Prefix requiring matches(impl AsRef<str>) -> bool and written_prefix() -> String functions, so that an object can be constructed to match a prefix and say what it should look like at the beginning of a line at any given time, but I'm not at all sure how to reconcile that with Amp's line_comment_prefix configuration API. Thoughts?
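One way to picture the trait described above is the sketch below. It is only an interpretation: `matches` takes `&str` rather than `impl AsRef<str>` for brevity, `written_prefix` takes `&mut self` so that it can advance state, and the `Stable` and `CBlock` types are hypothetical.

```rust
/// A comment prefix that knows how to recognise itself and how it should be
/// written at the start of each successive output line.
trait Prefix {
    /// Does `line` start with this comment style?
    fn matches(&self, line: &str) -> bool;
    /// The prefix for the next output line; may advance internal state.
    fn written_prefix(&mut self) -> String;
}

/// A 'stable' prefix such as `#` or `//`: identical on every line.
struct Stable(&'static str);

impl Prefix for Stable {
    fn matches(&self, line: &str) -> bool {
        line.trim_start().starts_with(self.0)
    }
    fn written_prefix(&mut self) -> String {
        format!("{} ", self.0)
    }
}

/// A 'variable' prefix for C-style block comments: `/*` on the first line,
/// ` *` on every line after it.
struct CBlock {
    first_written: bool,
}

impl Prefix for CBlock {
    fn matches(&self, line: &str) -> bool {
        line.trim_start().starts_with("/*")
    }
    fn written_prefix(&mut self) -> String {
        if self.first_written {
            String::from(" * ")
        } else {
            self.first_written = true;
            String::from("/* ")
        }
    }
}
```

The state lives in the prefix object, so the reflow loop itself would only need to call `written_prefix()` once per output line.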

jmacdonald · 2021-08-11T04:06:48Z

> If it's not too much trouble, do you think you'd be able to tell me what's going on here? I'd assume that sel_to_range is returning bad ranges, but I'm not quite sure how it's doing that.

That logic mirrors what's used for the commands::selection::delete method, which works fine. The range value depends on the text in the buffer, the select mode you're in, and the range you have selected. To start, I would suggest you debug the returned range so you can see if it matches your expectations. A simple solution would be to print the range to stderr, redirect stderr to a file when running amp, and then tail that file. Then you can see the range values while playing with the selection in the app; you might have stumbled onto an edge case with that logic.
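A minimal sketch of that debugging approach follows, with hypothetical `Position` and `Range` stand-ins for the editor's own types (which would need a `Debug` implementation for this to work). The helper logs to stderr because the terminal UI owns stdout.

```rust
// Hypothetical stand-ins for the editor's position and range types.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Position {
    line: usize,
    offset: usize,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Range {
    start: Position,
    end: Position,
}

/// Log the range to stderr and hand it back unchanged, so the call can be
/// wrapped around an existing expression without altering behaviour.
fn trace_range(range: Range) -> Range {
    eprintln!("selection range: {:?}", range);
    range
}
```

Running the editor as `amp 2> debug.log` and following the file with `tail -f debug.log` in a second terminal would then show each range as the selection changes.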

> I believe that supporting just the first would require some modifications to the configuration API, as the line_comment_prefix only allows for one token, and would therefore disallow multiple tokens for a file, such as rustdoc comments.

What if we opted for a simple heuristic to start? Something like "if there is a leading symbol/non-alphanumeric character, use that as the prefix". That won't cover all use cases, but if it gets us 90% without needing to use lexical scoping rules and syntax definitions to determine the comment prefix, I think that could be sufficient. That also wouldn't support variable prefixes, but FWIW, I don't use those often enough to see that as a major shortcoming, personally.
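One possible reading of this heuristic is sketched below. The function name `detect_prefix` is hypothetical, and taking the whole run of leading symbols (rather than a single character) is an assumption made so that `//` and `///` are captured intact.

```rust
/// Return the run of leading non-alphanumeric, non-whitespace characters on
/// `line` (after any indentation), or `None` if the line starts with text.
fn detect_prefix(line: &str) -> Option<&str> {
    let trimmed = line.trim_start();
    // Byte index of the first character that cannot belong to a prefix.
    let end = trimmed
        .find(|c: char| c.is_alphanumeric() || c.is_whitespace())
        .unwrap_or(trimmed.len());
    if end == 0 {
        None
    } else {
        Some(&trimmed[..end])
    }
}
```

The trade-off mentioned in the comment shows up directly: lines such as `- list item` or `(see above)` are also reported as prefixed, which is the kind of case the heuristic would not cover.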

oldaccountdeadname · 2021-08-11T18:29:39Z

> That logic mirrors what's used for the `commands::selection::delete` method, which works fine.

Not sure what I was doing incorrectly, but I extracted the code from delete and all the tests pass.

> What if we opted for a simple heuristic to start? Something like "if there is a leading symbol/non-alphanumeric character, use that as the prefix". That won't cover all use cases, but if it gets us 90% without needing to use lexical scoping rules and syntax definitions to determine the comment prefix, I think that could be sufficient.

Yep, that seems like a good solution. I can't really think of any alphanumeric prefixes that would break this, so I'll start implementing that!

> That also wouldn't support variable prefixes, but FWIW, I don't use those often enough to see that as a major shortcoming, personally.

I personally use `/* * */` style comments pretty frequently, and would want to explicitly support them. I agree that the closure solution is *probably* an over-complication, though. What would you think of a hard-coded dictionary of `first line prefix: (middle line prefix, last line prefix)` to handle these things? For example, to support some of the more common comment styles, we could have:

```txt
{
    "/*": (" *", "\n */"), // c-style multiline
    r#"""""#: ("", r#"""""#), // python-style multiline
    "#+BEGIN_COMMENT": ("", "\n#+END_COMMENT"), // org-mode multiline
    ...
}
```

I'd imagine it could be pretty small (and thus hard-coded), and, if it's ever too small, would be able to be refactored into the main config file without too much effort. How's this?

***

The code I've got going for this version is available on [gitlab](https://gitlab.com/lincolnauster/amp) just so that I don't overwrite the previous fork yet.
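For illustration, the proposed dictionary could be expressed in Rust roughly as follows. This is a sketch of the proposal only: the names `BLOCK_COMMENT_STYLES` and `block_style` are hypothetical, and a constant slice is used in place of a map because the table is small.

```rust
/// Each entry is (first-line prefix, middle-line prefix, closing text).
const BLOCK_COMMENT_STYLES: &[(&str, &str, &str)] = &[
    ("/*", " *", "\n */"),                      // C-style multiline
    ("\"\"\"", "", "\"\"\""),                   // Python-style multiline
    ("#+BEGIN_COMMENT", "", "\n#+END_COMMENT"), // Org-mode multiline
];

/// Look up the middle-line prefix and closing text for the comment style
/// that opens `first_line`, if it is one of the known styles.
fn block_style(first_line: &str) -> Option<(&'static str, &'static str)> {
    let trimmed = first_line.trim_start();
    BLOCK_COMMENT_STYLES
        .iter()
        .find(|style| trimmed.starts_with(style.0))
        .map(|&(_, middle, close)| (middle, close))
}
```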

…

-- lincoln auster [they/them]

jmacdonald · 2021-08-16T16:14:28Z

I think I'd prefer to table multi-line comment support for now. I could only find one reflow plug-in for Sublime Text, and it too doesn't support multi-line comments.

The complexity that this functionality would add to Amp wouldn't be worth it, in my opinion. We'd be forced to add another per-type configuration with a complex syntax that users are unlikely to configure (or that we are forced to maintain). Working with multi-line comments is still possible, provided the opening and closing symbols are put on dedicated lines (making the comment content appear without prefixes). In my experience, that's what most best practices encourage, anyway. It does mean that single-line comments in HTML and CSS would need to be manually converted to multi-line before being able to be reflowed, but I think it's okay if the implementation doesn't cover that case.

oldaccountdeadname · 2021-08-16T16:52:08Z

Okay, sounds good! The code on gitlab is feature complete. I'll move that over to github, close this PR, and open one for that.

oldaccountdeadname added 19 commits April 12, 2021 14:36

extra docs

1ec8501

inline extraneous function for clarity

6264898

docs for View::new and shorten lines

5d125f6

introduced (unused) range_from function, statement slide on delete

dd497c9

use range_from function

74ee5fc

move range_from

680fdfe

justify command

93b125f

initial justification code - just breaks on whitespace

4dee3fb

basic greedy justification

a1dc7e0

readd whitespace justification

d6f8172

account for whitespace in justification

bcda5be

justification doesnt trim words

4152494

remove extraneous calls

5ef56ac

keep parargaphs when justifying

34ef570

just test justify function

b4c9861

get line length from line_length_guide

296bd77

reflow around comment marks

2b87ece

parametrize comment regex

e1a68ba

forgot to update the test when extracting the comment prefix regex

060539a

christoph-heiss suggested changes May 9, 2021

oldaccountdeadname and others added 10 commits May 9, 2021 12:28

apply suggestion replace panic with bail

14057c9

Co-authored-by: Christoph Heiss <[email protected]>

Update src/commands/selection.rs - use app preferences instead of reg…

fe2ad07

…ex for recognizing comments Co-authored-by: Christoph Heiss <[email protected]>

Update src/commands/selection.rs - unwrap_or instead of match

ccd53b2

Co-authored-by: Christoph Heiss <[email protected]>

Update src/commands/selection.rs - use text length as capacity instea…

8fad821

…d of String::new() Co-authored-by: Christoph Heiss <[email protected]>

Update src/commands/selection.rs - use &str instead of &String

00c701f

Co-authored-by: Christoph Heiss <[email protected]>

use result for range_from

c5811a3

use string over regex

2b51ff9

add test for comment jutsification

fe68b08

Switch to normal mode after reflowing

2b54b67

Co-authored-by: Christoph Heiss <[email protected]>

Merge branch 'master' of https://github.com/lincolnauster/amp

8cdd072

only add extra whitespace when necessary

03dff12

jmacdonald reviewed Aug 8, 2021

use ok_or(...) over unwrap()

9dc7eb5

Co-authored-by: Jordan MacDonald <[email protected]>

oldaccountdeadname mentioned this pull request Aug 16, 2021

Text Reflowing/Justification (2) #236

Closed

oldaccountdeadname closed this Aug 16, 2021
