[1.0] turn off URL rewriting for my account #5542
Comments
Copying from @aeharding's direct message:
This would be possible, although it requires some extra effort for every frontend to implement, and we would need a separate database table to store these link mappings. It also wouldn't solve the problem with lemmynsfw in posts, as lemmy-ui running on that instance would still replace the link, so the HTML seen by search engine crawlers would be the same either way.
About this, how exactly would it work? As a setting by the post author which gets federated to all instances? |
I don't see any way around that, which kind of makes this issue impossible to fix. |
If lemmy-ui detects a crawler, it could disable the markdown rewriting. That shouldn't be too bad, unless I'm misunderstanding.
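A minimal sketch of that crawler check, assuming the front end can read the request's User-Agent header while rendering on the server. The names `isLikelyCrawler` and `renderBody` and the pattern itself are hypothetical and not taken from lemmy-ui; real crawler detection is less reliable than a single regular expression.

```typescript
// Hypothetical pattern: matches a few common crawler substrings only.
const CRAWLER_PATTERN = /bot|crawler|spider|slurp/i;

// Returns true when the User-Agent string looks like a search engine crawler.
function isLikelyCrawler(userAgent: string | undefined): boolean {
  return userAgent !== undefined && CRAWLER_PATTERN.test(userAgent);
}

// Applies the caller-supplied link rewriting unless the request looks like a
// crawler, in which case the author's original markdown is returned unchanged.
function renderBody(
  markdown: string,
  userAgent: string | undefined,
  rewrite: (md: string) => string,
): string {
  return isLikelyCrawler(userAgent) ? markdown : rewrite(markdown);
}
```

Passing the rewriting step in as a function keeps the sketch independent of how a given front end actually rewrites links.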
That's the only way I can think of. However it does have downsides: breaking linking for non-crawler users for example.
It's true that it would be more work for clients, however you do need to consider that all clients (other than lemmy-ui) need to do some logic for links anyways so they stay in the app/client instead of opening a browser to lemmy-ui. And a simple map should be quite easy for apps to add. |
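As a rough illustration of the "simple map" idea, here is a sketch that assumes the API response would carry a map from each original URL to its local equivalent. The `LinkMap` type, the `resolveLink` function, and the example URLs are all hypothetical; no such field exists in the Lemmy API as far as this thread shows.

```typescript
// Hypothetical shape: original (author-submitted) URL -> local URL on the viewer's instance.
type LinkMap = Map<string, string>;

// Returns the local URL to navigate to when one is known, otherwise the
// original URL. The rendered text keeps the author's original link either way.
function resolveLink(href: string, linkMap: LinkMap): string {
  return linkMap.get(href) ?? href;
}

// Example: a client keeps displaying the original href and only consults the
// map at the moment the user taps the link.
const exampleMap: LinkMap = new Map([
  ["https://lemmy.world/post/1", "https://lemm.ee/post/99"],
]);
const target = resolveLink("https://lemmy.world/post/1", exampleMap);
```

With this approach the body text is never modified, so the original link is what gets displayed, copied, or federated.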
Even if we moved all this logic to front-ends, how would you prevent any other federated services or apps from rewriting to their own local links? This is probably a losing battle, as you'll never be able to account for every fediverse service or web app. |
Lemmy can't control anything other software does. But I'd also like to flip that on its head. If lemmy does rewrite links, will those rewritten links get federated to other services? I could see that being a problem - for example, say I post on lemmygrad.ml/c/memes:
With string replace, would that get rewritten to:
Would that string replace content be federated to non-lemmy services? For example, would a piefed user see:
If that is the case, all of those non-lemmy instances would see my comment as if I linked to lemmygrad.ml, when I didn't, especially since no other federated services I'm aware of rewrite links in user post/comment body. That is why I think in terms of federation and external services, rewriting URLs is a much more slippery slope than preserving the original user submitted URL. |
I think this is pretty convincing, although it does put a large burden on apps to do all their own link rewriting. The places I can find where we do rewriting in the back-end, are for images, and tracking params.
Any front-end logic to not rewrite links for scrapers only would have to be opened up on those repos. |
That makes sense to me. Regarding images: Can you give an example of how you are imagining providing proxy information for thumbnail_url in GetSite would look like? I was thinking, as an alternative, even image urls could be added to the same map used for links. There would be only one REST API change:
Pseudo response example: Post response on lemmy.ml/c/memes by lemmy.world user, viewed on lemm.ee:
Some benefits:
Potential drawback:
Of course, lemmy could proxy as much or as little information as it wants... maybe we just put fedi urls in the proxy map, not images, to start, idk. |
We'd just expose the Pictrs image mode, then force clients to build their own URLs for every non-local image if proxy mode is turned on. We wouldn't provide any URL maps, clients would have to check those fields (and scan through markdown bodies), to rewrite their own links. |
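A sketch of what that client-side URL building could look like, assuming the instance exposes its image mode and offers a proxy endpoint that takes the original URL as a query parameter. The `ImageMode` values, the `displayImageUrl` name, and the `/api/v4/image/proxy` path are assumptions for illustration and should be checked against the actual API.

```typescript
// Hypothetical subset of image modes a client would need to distinguish.
type ImageMode = "none" | "proxy";

// Assumed proxy path; verify against the instance's real API before use.
const PROXY_PATH = "/api/v4/image/proxy";

// Returns the URL a client should load for an image. Local images and
// instances without proxy mode use the original URL unchanged.
function displayImageUrl(
  original: string,
  localOrigin: string,
  mode: ImageMode,
): string {
  const isLocal = original.startsWith(localOrigin + "/");
  if (mode !== "proxy" || isLocal) {
    return original;
  }
  return `${localOrigin}${PROXY_PATH}?url=${encodeURIComponent(original)}`;
}
```

Every client would need its own copy of this logic, which is the coupling concern raised in the reply that follows.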
I see. That could work, although it has the downside of forcing clients to build URLs, which could limit flexibility of Lemmy to change logic in the future. Also, how would that work if Lemmy needs to disable proxying for certain domains like imgur? Would lemmy need to send that domain blacklist in the On the other hand, the logic is much simpler with a https://stackblitz.com/edit/vitejs-vite-rukomjdn?file=src%2Fmain.ts |
Links are only rewritten by your local instance (where your account is registered), and the instance where a post is displayed. So the only way someone could see your post containing a rewritten lemmynsfw link is by viewing your post on that site. There is a link to the original post on your instance included (with Fediverse icon), so it's clear that this is changed from what you wrote. Also Lemmy includes link rel="canonical" specifically so that search engines index posts correctly. Honestly your concern about having posts shown containing links with lemmynsfw domain is only a theoretical problem. In practice a much more likely problem is that someone would create an account with your name to impersonate you, and post all kinds of malicious things. Url rewriting and image proxying were implemented like this specifically to make it easier for client devs, so it "just works". Implementing the solutions you suggest would require a lot of work from all client devs, when they already have a lot on their plate for the 1.0 API changes. It's really not worth the effort, so I would close this as wontfix. |
I think there is potential value in only ever using source images in every back end. Not just for app cases like this, but also because it simplifies our code, and federated images and thumbnails will never run into proxy chaining cases when federating out. Take for instance our rewriting markdown bodies, or the complicated thumbnail logic : all that logic could be removed. I'm interested in what others have to say about this (especially other app devs): |
The code would get much more complicated with this because we would have to add new database tables to store url mappings, and include them in the api. The markdown logic would still be needed to generate these mappings in the first place. And when a community forwards a post to followers it uses the original activity without any changes, so there is no proxy chaining in that case. |
I'd say we don't do any URL mappings; each front end might want to use their own unique URLs anyway, so it'd be pointless to do it. As long as we have reference examples in lemmy-ui and jerboa for how to rewrite links, I don't think it should be a problem for others to copy it. |
The problem is that it requires a lot of work to change something that already works perfectly fine. Lots of apps/clients won't have time for that, or won't even know about it. So they won't benefit from these Lemmy features. Not to mention all the work required to implement this in the Lemmy backend. It's really not worth it. |
It would only be deleting things in the back-end, exposing the pictrs image mode in |
No, the backend still has to parse all incoming posts for image links and store them in And for Activitypub links they still have to be dereferenced in the backend, then stored in a table |
I think the only change there, would be that we only ever store real image links, and never proxied ones. On the apub side, we'd only ever send, receive, or store real image links, never proxied ones. |
Requirements
Is your proposal related to a problem?
Example one
Often I don't want Lemmy rewriting my Lemmy URLs. The biggest reason is that I don't want to appear associated with a given instance.
For example, imagine I make a post:
Imagine Google indexes lemmyporn.com. My post will be indexed by search engines like this:
I never want this to happen. This would be a really bad look for me, as it would seem like I linked to a porn instance.
Example two
Many times I've linked to a post on a specific instance, for example when explaining how post content hasn't federated properly to X instance, but has to Y instance. In this case, it would be nice if I could turn off rewriting for a specific post.
Describe the solution you'd like.
Ability to turn off URL rewriting for object resolving in post and comment body.
Describe alternatives you've considered.
There is really no alternative. Maybe post the URL in a code block, but then it can't be linked properly.
Additional context
This is a lemmy 1.0.0 thing. This maybe should be a larger discussion, too. In general I am opposed to rewriting links of user content, because it can change the meaning of my post.
Mastodon, for example, does not rewrite URLs, for this reason. It leaves implementation up to clients. I think this is a better approach, because it doesn't change the user's meaning.