[prev in list] [next in list] [prev in thread] [next in thread] 

List:       sqlite-users
Subject:    Re: [sqlite] method for thousands separator via sed post processing
From:       Rowan Worth <rowanw () dugeo ! com>
Date:       2015-11-27 4:11:03
Message-ID: CAJtcO2SHvKS=mqwGzzDod7z1sbVRZTbBzWe5big1Tuj+sg3Mkw () mail ! gmail ! com
[Download RAW message or body]

Hi Bruce,

On 27 November 2015 at 10:59, Bruce Hohl <brucehohl@gmail.com> wrote:

> Thanks to suggestions on this list I explored the sed post processing
> avenue and found a solution.  There are quite a few sed docs at
> http://sed.sourceforge.net  For my needs I adapted an example from
> http://www-rohan.sdsu.edu/doc/sed.html (These docs are like ancient
> scrolls.)
>
> # add commas to interger strings, changing "1234567" to "1,234,567"
> gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta'
>
> # add commas to numbers with decimal points and minus signs
> gsed ':a;s/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g;ta'
>
> I adjusted the 2nd as follows to remove a space for each comma added which
> preserves column layout:
> sed ':a;s/\( \)\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\2\3,\4/g;ta'
>

This modification is the reason it doesn't match numbers in column 1. But
also it will only insert as many commas as it has spaces to eat. eg. if the
number 1234567.89012 appears four times with zero, one, two, and then three
leading spaces, we get:

1234567.89012
 1234567.89012
 1234,567.89012
 1,234,567.89012

You can improve it by moving the leading space inside the second group and
making it optional, so that it eats spaces when it can but still inserts
commas when it can't:

sed ':a;s/\(^\| \?\([^0-9.]\)\)\([0-9]\+\)\([0-9]\{3\}\)/\2\3,\4/g;ta';

The output becomes:

1,234,567.89012
 1,234,567.89012
 1,234,567.89012
 1,234,567.89012


The date problem is harder to solve, especially with sed. lex makes it
feasible - stick the following into a file called thousandify.lex and
compile it with the command:

flex thousandify.lex && gcc lex.yy.c -o thousandify

Then put the resulting 'thousandify' binary in your path and you can run
sqlite3 test.db | thousandify


/* START thousandify.lex */
%option noyywrap

DATE " "*[0-9]{4}-[0-9]{2}-[0-9]{2}
INTEGER " "*[0-9]+
FRACTION \.[0-9]*([eE][+-]?[0-9]*)?

%%
{DATE}    printf("%s", yytext);
{FRACTION}    printf("%s", yytext);

{INTEGER}    {
    char *cp;
    int i, r, n, len;
    for (cp=yytext; *cp == ' '; ++cp); // skip leading whitespace
    len = strlen(cp);
    r = len % 3;
    n = (len - 1) / 3; // number of commas we'll insert
    for (i = 0; i < (cp - yytext) - n; i++) {
        putchar(' ');
    }
    putchar(cp[0]);
    for (i = 1; i < len; i++) {
        if (--r == 0) {
            putchar(',');
            r = 3;
        }
        putchar(cp[i]);
    }
}

.    printf("%s", yytext);

%%
int
main(int argc, char **argv) {
    yylex();
    return 0;
}
/* END thousandify.lex */


-Rowan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic