
List:       postgis-users
Subject:    Re: [postgis-users] ST_ClusterDBSCAN: is it deterministic?
From:       Daniel Baston <dbaston () gmail ! com>
Date:       2021-01-24 18:16:02
Message-ID: CA+K_q_oE+bR_H_VLNb4s=OrbbjF+BYjHmCopRGPMEr2GymhWqg () mail ! gmail ! com

Hi Giuseppe,

You can order the inputs by anything you like; OVER(ORDER BY feature_id)
would work just as well. If you have an example that is not deterministic
despite ordered inputs, I'd be curious to see it if you can share.
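
A minimal sketch of what that looks like, assuming a hypothetical table
"features" with a unique integer column feature_id and a geometry column
geom. The eps and minpoints values are placeholders only; pick ones that
suit your data and its units.

```sql
-- Hypothetical table: features(feature_id integer PRIMARY KEY, geom geometry).
-- eps and minpoints below are illustrative placeholders, not recommendations.
SELECT feature_id,
       ST_ClusterDBSCAN(geom, eps := 100, minpoints := 5)
           OVER (ORDER BY feature_id) AS cluster_id
FROM features;
```

Ordering by a unique key means the rows reach the window function in the
same order on every run, which is the part that matters here; the choice
of key itself should not.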

Thanks,
Dan

On Sun, Jan 24, 2021 at 12:33 PM Giuseppe Broccolo <g.broccolo.7@gmail.com>
wrote:

> Hi Daniel,
>
> On Fri, Jan 22, 2021 at 18:07 Daniel Baston <dbaston@gmail.com>
> wrote:
>
>> It should be deterministic for most real data if the inputs are ordered
>> consistently, using the OVER() clause as you suggest. It's possible that
>> there may be a contrived situation involving duplicates in the input where
>> a result would be different (as GEOS STRtree is using std::sort instead of
>> std::stable_sort), but I'm not sure. Also, there are sometimes multiple
>> possible clusterings that satisfy the DBSCAN algorithm, so it is expected
>> that results may differ between implementations or between different
>> orderings of the same input.
>>
>
> Thank you for the answer. I think I'll try adding an ORDER BY geom clause
> to the window definition to check whether the results become more
> deterministic. If I understood correctly, ORDER BY geom adds a further
> step that pre-sorts the geometries along a Hilbert curve. Of course,
> this would affect the overall duration of the query.
>
> Giuseppe.
> _______________________________________________
> postgis-users mailing list
> postgis-users@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>





