
List:       haskell-cafe
Subject:    Re: [Haskell-cafe] Abandoning String = [Char]?
From:       Mike Meyer <mwm () mired ! org>
Date:       2015-05-22 17:55:14
Message-ID: CAD=7U2CCWqoKf4hamxVaMwXxhy8Y8i4jPH2R+hzA6EbC9d_yJw () mail ! gmail ! com


Having just finished converting my Haskell shell-scripting tool from String
to Text/ByteString, might I suggest that such a change would create fewer
problems after a Prelude rework to something like ClassyPrelude? Using
ClassyPrelude meant that a lot of the code that worked with String worked
just fine with Text and ByteString. I had more fixes due to having used
partial functions than due to no longer having lists of Chars.
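
A minimal sketch of the kind of partial-function fix meant here, assuming
only `base` and the `text` package. It does not use ClassyPrelude, and the
names `firstCharS` and `firstCharT` are made up for illustration. The point
is that replacing `head` with the total `uncons` looks the same whichever
string type is in use:

```haskell
import qualified Data.List as L
import qualified Data.Text as T

-- Total replacement for 'head' on String: Nothing on empty input.
firstCharS :: String -> Maybe Char
firstCharS = fmap fst . L.uncons

-- The same operation on Text; T.uncons is total as well.
firstCharT :: T.Text -> Maybe Char
firstCharT = fmap fst . T.uncons
```

Both return `Nothing` on empty input instead of throwing, so the caller has
to handle that case explicitly.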

On Fri, May 22, 2015 at 12:37 PM, Andrew Gibiansky <
andrew.gibiansky@gmail.com> wrote:

> Mario,
>
> Thank you for that detailed write-up. That's exactly the sort of thing I
> was looking for.
>
> I imagine a path like the one you describe is possible, but very, very
> difficult, and likely the effort could be better spent elsewhere.
>
> I imagine an alternate route (that would have immediate gains in the near
> future, and wouldn't be a long-term transition plan) would be to have a
> `text-base` package, which exports everything `base` does, exporting `Text`
> instead of `String`. You would then build packages on that instead of
> `base`, thus ensuring you do not rely on []-manipulation for `String` (you
> should still have full compatibility with normal `base`).
>
> Anyway, hard choices all around, for no 100% clear gain, so I personally
> do not envision this happening any time soon. Oh well...
>
> -- Andrew
>
> On Fri, May 22, 2015 at 6:07 PM, Michal Antkiewicz <
> mantkiew@gsd.uwaterloo.ca> wrote:
>
>> Mario, thanks for that great writeup.
>>
>> The switch can only happen if there's a way to make the old code somehow
>> transparently work the same or better in the new setup.
>>
>> Maybe some GHC magic could bring the string operations to Prim Ops and
>> transparently switch the underlying representation to Text from [Char].
>> Basically, Text would have to become a built-in primitive, not a library.
>>
>> Michał
>>
>> On Fri, May 22, 2015 at 10:29 AM, Mario Blažević <mblazevic@stilo.com>
>> wrote:
>>
>>> On 15-05-18 06:44 PM, Andrew Gibiansky wrote:
>>>
>>>> Hey all,
>>>>
>>>> In the earlier haskell-cafe discussion of IsString, someone mentioned
>>>> that it would be nice to abandon [Char] as the blessed string type in
>>>> Haskell. I've thought about this on and off for a while now, and think
>>>> that the fact that [Char] is the default string type is a really big
>>>> issue (for example, it gives beginners the idea that Haskell is
>>>> incredibly slow, because everything that involves string processing is
>>>> using linked lists).
>>>>
>>>> I am not proposing anything, but am curious as to what already has been
>>>> discussed:
>>>>
>>>> 1. Has the possibility of migrating away from [Char] been investigated
>>>> before?
>>>>
>>>
>>>         No, not seriously as far as I'm aware. That ship sailed a long
>>> time ago. Still, as I have actually thought about that, I'll give you
>>> an outline of a possible process.
>>>
>>>
>>>  2. What gains could we see in ease of use, performance, etc, if [Char]
>>>> was deprecated?
>>>>
>>>
>>>         They could be very significant for any code that took advantage
>>> of the new type, but the existing code would not benefit that much. But
>>> then, any new Haskell code can already use Text where performance matters.
>>>
>>>
>>>  3. What could replace [Char], while retaining the same ease of use for
>>>> writing string manipulation functions (pattern matching, etc)?
>>>>
>>>
>>>         You would not have the same ease of use exactly. The options
>>> would lie between two extremes. At one end, you can have a completely
>>> opaque String type with fromChars/toChars operations and nothing else. At
>>> the other end, you'd implement all operations useful on strings so there
>>> would never be any need to convert between String and [Char].
>>>
>>>         The first extreme would be mostly useless from the performance
>>> point of view, but with some GHC magic perhaps it could be made a viable
>>> upgrade path. The compiler would have to automatically insert the implicit
>>> fromChars/toChars conversion whenever necessary, and I expect that some of
>>> the existing Haskell code would still be broken.
>>>
>>>         Once you have an opaque String type, you can think about
>>> improving the performance. A more efficient instance of Monoid String would
>>> be a good start, especially since it wouldn't break backward compatibility.
>>> Unfortunately that is the only [Char] instance in wide use that can be
>>> easily optimized. Perhaps Foldable could be made to work with even more
>>> compiler magic, but I doubt it would be worth the effort.
>>>
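
A rough sketch of the "opaque String" end of that spectrum, under stated
assumptions: the type name `OpaqueString` is hypothetical, it is backed by
Text from the `text` package, and only `fromChars`/`toChars` (the names used
above) plus a Monoid instance are exposed. It needs a `base` recent enough to
export Semigroup from the Prelude, and it says nothing about the compiler
magic that would insert the conversions implicitly:

```haskell
import qualified Data.Text as T

-- Hypothetical opaque string type; the representation is hidden behind
-- fromChars/toChars, so it could be swapped without touching callers.
newtype OpaqueString = OpaqueString T.Text
  deriving Eq

fromChars :: [Char] -> OpaqueString
fromChars = OpaqueString . T.pack

toChars :: OpaqueString -> [Char]
toChars (OpaqueString t) = T.unpack t

-- Appending delegates to Text instead of walking a linked list.
instance Semigroup OpaqueString where
  OpaqueString a <> OpaqueString b = OpaqueString (T.append a b)

instance Monoid OpaqueString where
  mempty = OpaqueString T.empty
```

One caveat with this particular representation: `T.pack` replaces surrogate
code points with U+FFFD, so `toChars . fromChars` is not the identity on
every [Char]. That is one way existing code could still break.
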
>>>         If you add more operations on String that don't require
>>>
>>>
>>>  4. Is there any sort of migration path that would make this change
>>>> feasible in mainline Haskell in the medium term (2-5 years)?
>>>>
>>>
>>>         Suppose GHC 7.12 were to bring Text into the core libraries,
>>> change Prelude to declare type String = Text, and sprinkle some magic
>>> compiler dust to make the explicit Text <-> [Char] conversions unnecessary.
>>>
>>>         The existing Haskell code would almost certainly perform worse
>>> overall. The only improved operations would be mappend on String, and
>>> possibly the string literal instantiation.
>>>
>>>         I don't think there's any chance to get this kind of change
>>> proposal accepted today. You'd have to make the pain worth the gain.
>>> The only viable path is to ensure beforehand that the change improves
>>> more than just the mappend operation.
>>>
>>>         In other words, you'd have to get today's String to instantiate
>>> more classes in common with tomorrow's String, and you'd have to get the
>>> everyday Haskell code to use those classes instead of list manipulations.
>>>
>>>         The first tentative step towards the String type change would
>>> then be either the mono-traversable or my own monoid-subclasses package.
>>> They both define new type classes that are instantiated by both [Char] and
>>> Text. The main difference is that the former builds upon the Foldable
>>> foundation, the latter upon Monoid. They are both far from being a complete
>>> replacement for list manipulations. But any new code that used their
>>> operations would see a big improvement from the String type change.
>>>
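
To illustrate the idea of a class instantiated by both [Char] and Text, here
is a small sketch. The class `StringLike` and its methods are hypothetical
and are not the API of mono-traversable or monoid-subclasses; it assumes
only `base` and `text`:

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.Char as C
import qualified Data.List as L
import qualified Data.Text as T

-- Hypothetical bridging class: just enough to split off and rebuild
-- the first character, on top of Monoid for concatenation.
class Monoid s => StringLike s where
  splitFirst :: s -> Maybe (Char, s)
  singleton  :: Char -> s

instance StringLike [Char] where
  splitFirst  = L.uncons
  singleton c = [c]

instance StringLike T.Text where
  splitFirst = T.uncons
  singleton  = T.singleton

-- Written once against the class; works unchanged for either type.
capitalize :: StringLike s => s -> s
capitalize s = case splitFirst s of
  Nothing        -> s
  Just (c, rest) -> singleton (C.toUpper c) <> rest
```

Code written like `capitalize` would not need to change if String later
switched from [Char] to Text, which is the property the plan below relies on.
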
>>>         Here, then, is the five-year plan you're asking for:
>>>
>>> Year one: Agree on the ideal set of type classes to bridge the gap
>>> between [Char] and Text.
>>>
>>> Year two: Bring the new type classes into the Prelude. Have all relevant
>>> types instantiate them. Everybody's updating their code in delight to use
>>> the new class methods.
>>>
>>> Year three: GHC issues warnings about using List-specific [], ++, null,
>>> take, drop, span, etc., on String. Everybody's furiously updating
>>> their code.
>>>
>>> Year four: Add Text to the core libraries. The GHC magic to make the
>>> Text <-> [Char] conversions implicit is implemented and ready for testing
>>> but requires a pragma.
>>>
>>> Year five: Update Haskell language report. Flip the switch.
>>>
>>> So there. How feasible does that sound?
>>>
>>>
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> Haskell-Cafe@haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>>>





