'Re: [webkit-dev] Feedback on Blink's text fragment directive proposal'


List:       webkit-dev
Subject:    Re: [webkit-dev] Feedback on Blink's text fragment directive proposal
From:       David Bokan <bokan () chromium ! org>
Date:       2020-09-24 19:28:21
Message-ID: CACV-TmKSxVh+tQskqgdD6mVtZ40ywjhpTjFL7qdewhrouhGMew () mail ! gmail ! com

On Wed, Sep 23, 2020 at 3:20 AM Ryosuke Niwa <rniwa@webkit.org> wrote:

>
> On Fri, Sep 18, 2020 at 7:35 AM David Bokan <bokan@chromium.org> wrote:
>
>> Friendly ping to get an answer here.
>>
>> Do my answers above address those points or is there anything else I can
>> clarify?
>>
>> Thanks,
>> David
>>
>> On Mon, Aug 31, 2020 at 1:42 PM David Bokan <bokan@chromium.org> wrote:
>>
>>> [sending (again, sorry) from correct e-mail]
>>>
>>> I think Nick's replies mostly still apply, some updated answers to
>>> those questions.
>>>
>>> (1) We're concerned about compatibility issues in a world where some
>>>> browsers support this but not all. Aware browsers will strip `:~:`, but
>>>> unaware browsers won't. I saw that on the blink-dev ItS thread, it was
>>>> mentioned that at least one site (webmd.com) totally breaks if any
>>>> fragment ID is exposed to the page. This makes it difficult to create a
>>>> link that uses this feature but which is safe in all browsers:
>>>> - Since there is no feature detection mechanism, it's hard for a
>>>> webpage to know whether it should issue such a link. It would have to be
>>>> based on UA string checks, which is regrettable.
>>>> - A link meant for a supporting browser can end up in a non-supporting
>>>> browser, at the very least by copy paste from the URL field, and perhaps
>>>> through other features to share a link.
>>>>
>>>
>>> We do have a feature detection mechanism for this.
>>>
>>> On the latter point, this is true but we think implementing fragment
>>> directive stripping (removing the part after and including `:~:`) is
>>> trivial even if the UA doesn't wish to implement the text-fragment feature.
>>> FWIW, we haven't seen or heard of another such example since.
>>>
>>
> We continue to be concerned about this backwards compatibility issue.
>

Is there any kind of data we could gather that might allay concerns? Or
mitigations we could consider? Applications that generate these links
dynamically can feature-detect UA support. Pages should already be
handling unexpected hashes; so far, the WebMD case seems to have been an
outlier.
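
As a rough sketch of what that feature detection could look like for an
application that generates links: the check below assumes a supporting UA
exposes a `fragmentDirective` property on `document` (early Chromium
releases exposed it on `Location.prototype` instead, as far as I know).
`supportsTextFragments` and `buildLink` are hypothetical names used only
for illustration.

```typescript
// Assumption: supporting browsers expose `fragmentDirective` on `document`;
// some early Chromium releases exposed it on Location.prototype instead.
function supportsTextFragments(doc: object, locationProto?: object): boolean {
  if ("fragmentDirective" in doc) return true;
  return locationProto !== undefined && "fragmentDirective" in locationProto;
}

// Hypothetical helper: only emit a text directive when the UA supports it,
// otherwise fall back to the plain URL so unaware pages never see `:~:`.
function buildLink(pageUrl: string, quote: string, supported: boolean): string {
  if (!supported) return pageUrl;
  // encodeURIComponent leaves '-' alone, but '-' is syntax inside a text
  // directive, so it is escaped explicitly here.
  const encoded = encodeURIComponent(quote).replace(/-/g, "%2D");
  // Append to an existing fragment if there is one, else start a new one.
  const separator = pageUrl.indexOf("#") === -1 ? "#" : "";
  return pageUrl + separator + ":~:text=" + encoded;
}
```

In a page this might be called as `buildLink(location.href, quote,
supportsTextFragments(document, Location.prototype))`. It does not help
with a link that is later copied into a non-supporting browser.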

>
> (3) Text fragment trumping a regular fragment ID seems a bit strange. The
>>>> more natural semantic would be that the text search starts at the fragment,
>>>> so if there are multiple matches it's possible to scroll to a more specific
>>>> one. It's not clear why the fragment is instead entirely ignored.
>>>>
>>>
>>> This was discussed in more detail in issue#75; I agree with Nick's
>>> point that the disambiguation syntax is already specific enough that
>>> starting from a fragment isn't necessary. This also keeps us
>>> mostly-compatible with the TextQuoteSelector specified in
>>> WebAnnotations which I think may have benefits for interaction with
>>> annotation applications.
>>>
>>
> This will limit the utility of this feature. For something as broad
> impacting as a URL format change, it seems rather short-sighted.
>

Could you elaborate on why you think this limits its utility? From my point
of view keeping them independent is conceptually simpler and more robust
since we don't have to depend on two aspects of the page being unchanged.
Given that the syntax allows precise targeting of ambiguous text snippets I
don't really see a clear downside to this but maybe I'm missing your point?
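
To illustrate the kind of disambiguation I mean, here is a deliberately
simplified sketch of context matching over a flat string. It is not the
proposal's algorithm: it is case-sensitive and ignores word boundaries,
DOM structure and start/end ranges. `TextDirective` and `findMatch` are
hypothetical names.

```typescript
// Simplified model of a text directive: `start` is the text to find,
// `prefix`/`suffix` are optional context that must surround the match.
interface TextDirective {
  prefix?: string;
  start: string;
  suffix?: string;
}

// Returns the index of the first occurrence of `start` whose surrounding
// text matches the given context (whitespace between context and match is
// skipped), or -1 when nothing matches.
function findMatch(text: string, d: TextDirective): number {
  if (d.start.length === 0) return -1;
  let from = 0;
  for (;;) {
    const i = text.indexOf(d.start, from);
    if (i === -1) return -1;
    const before = text.slice(0, i).replace(/\s+$/, "");
    const after = text.slice(i + d.start.length).replace(/^\s+/, "");
    const prefixOk = d.prefix === undefined ||
      (before.length >= d.prefix.length &&
        before.slice(before.length - d.prefix.length) === d.prefix);
    const suffixOk = d.suffix === undefined ||
      after.slice(0, d.suffix.length) === d.suffix;
    if (prefixOk && suffixOk) return i;
    from = i + 1; // try the next occurrence
  }
}
```

With the page text "The cat sat. The dog sat. The bird sat.", a bare
`{ start: "sat" }` picks the first "sat", while `{ prefix: "dog", start:
"sat" }` (roughly `#:~:text=dog-,sat` in a URL) picks the second, without
needing an element ID to start from.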

>
> Also, Web Annotations Data Model allows other kinds of annotations:
> https://www.w3.org/TR/2017/REC-annotation-model-20170223/#selectors
>
> Is there any reason this particular matching algorithm, and only this
> one, was picked, with no possibility of future extensibility?
>

You mean why of all the selectors specified there only TextQuoteSelector
was chosen? We started with text as we think it's the most useful of the
set but this doesn't preclude eventually adding others. One natural
extension that we've heard demand for is scrolling to images.

Our original exploration
<https://github.com/WICG/scroll-to-text-fragment/#css-selector-fragments>
looked at using arbitrary CSS selectors, but this got rather complicated, as
being able to target arbitrary parts of the DOM seemed potentially scary from
a security perspective
<https://github.com/WICG/scroll-to-text-fragment/#security-considerations>
(e.g. a security flaw might expose CSRF tokens rather than just text).

As to the fragment syntax provided in WebAnnotations, there are two reasons
we chose a different syntax:

  * We needed some way to hide the fragment from the page so that it works
on pages with fragment routing
  * The WebAnnotations fragment syntax is quite verbose. We believe there's
benefit to keeping these links shorter and easier to hand-craft.
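
For the first point, a minimal sketch of the stripping step, operating on
URL strings. `stripFragmentDirective` is a hypothetical name, and dropping
the bare `#` when nothing else remains is a choice made for this sketch
rather than something taken from the proposal.

```typescript
// Everything from this delimiter to the end of the fragment is the
// fragment directive and is hidden from the page.
const DIRECTIVE_DELIMITER = ":~:";

// Returns the URL as a fragment-routing page would see it, with the
// fragment directive (delimiter included) removed.
function stripFragmentDirective(url: string): string {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return url; // no fragment at all
  const fragment = url.slice(hashIndex + 1);
  const delimiterIndex = fragment.indexOf(DIRECTIVE_DELIMITER);
  if (delimiterIndex === -1) return url; // ordinary fragment, keep as-is
  const kept = fragment.slice(0, delimiterIndex);
  // Sketch-only choice: drop the '#' entirely if no fragment is left.
  return url.slice(0, hashIndex) + (kept.length > 0 ? "#" + kept : "");
}
```

So `https://example.com/app#/route/1:~:text=foo` would reach the page's
router as `https://example.com/app#/route/1`.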

However, the model is effectively the same (the exception being that
WebAnnotations doesn't support start/end ranges); a WebAnnotation
TextQuoteSelector can be mechanically converted to a text-fragment.
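
A sketch of that conversion, assuming the `text=[prefix-,]start[,-suffix]`
shape of the directive and the `exact`/`prefix`/`suffix` fields of a
TextQuoteSelector. `toTextDirective` and `encodeTerm` are hypothetical
names. Trimming whitespace from the context terms is an assumption of this
sketch (that the matcher tolerates whitespace between context and match);
differences in word-boundary handling are ignored.

```typescript
// Subset of the Web Annotation TextQuoteSelector used by this sketch.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;
  prefix?: string;
  suffix?: string;
}

// Percent-encode a term; '-' is escaped too because it is directive syntax.
function encodeTerm(term: string): string {
  return encodeURIComponent(term).replace(/-/g, "%2D");
}

// Builds the `text=...` directive; the caller prepends `#:~:`.
function toTextDirective(selector: TextQuoteSelector): string {
  const parts: string[] = [];
  const prefix = (selector.prefix || "").trim();
  const suffix = (selector.suffix || "").trim();
  if (prefix.length > 0) parts.push(encodeTerm(prefix) + "-");
  parts.push(encodeTerm(selector.exact));
  if (suffix.length > 0) parts.push("-" + encodeTerm(suffix));
  return "text=" + parts.join(",");
}
```

For example, a selector with `exact: "annotation"`, `prefix: "web "` and
`suffix: " model"` would come out as `text=web-,annotation,-model`.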

>
> - R. Niwa
>
>


_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev
