
List:       python-list
Subject:    Re: Parsing strings -> numbers
From:       Duncan Booth <duncan () NOSPAMrcp ! co ! uk>
Date:       2003-11-25 11:31:03

tuanglen@hotmail.com (Tuang) wrote in 
news:df045d93.0311250127.67395ae@posting.google.com:

>>>> locale.getdefaultlocale()
> ('en_US', 'cp1252')
>>>> locale.atoi("-12345")
> -12345
> 
> Given the locale it thinks I have, it should be able to parse
> "-12,345" if it can handle formats containing thousands separators,
> but apparently it can't.
> 
> If Python doesn't actually have its own parsing of formatted numbers,
> what's the preferred Python approach for taking data, perhaps
> formatted currencies such as "-$12,345.00" scraped off a Web page, and
> turning it into numerical data?
> 

The problem is that LC_NUMERIC is not set by default, so locale.atoi and 
locale.atof know nothing about your thousands separator. You have to set 
the numeric locale explicitly:

>>> import locale
>>> locale.getlocale(locale.LC_NUMERIC)
(None, None)
>>> locale.getlocale()
['English_United Kingdom', '1252']
>>> locale.setlocale(locale.LC_NUMERIC, "English")
'English_United States.1252'
>>> locale.atof('1,234')
1234.0
>>> locale.setlocale(locale.LC_NUMERIC, "French")
'French_France.1252'
>>> locale.atof('1,234')
1.234
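
The session above can be wrapped in a small function that sets LC_NUMERIC 
temporarily and restores it afterwards. This is only a sketch: the name 
atof_with_numeric_locale and the list of candidate locale names are my own 
assumptions, and locale names differ between platforms ("English" is a 
Windows spelling, while POSIX systems usually want something like 
"en_US.UTF-8").

```python
import locale


def atof_with_numeric_locale(text, candidates):
    """Parse text with locale.atof under the first usable candidate locale.

    Hypothetical helper, not part of the locale module. Returns None if no
    candidate locale is installed or none of them can parse the text.
    """
    # Calling setlocale with no locale argument only queries the setting.
    saved = locale.setlocale(locale.LC_NUMERIC)
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_NUMERIC, name)
            except locale.Error:
                continue  # this locale name is not available here
            try:
                return locale.atof(text)
            except ValueError:
                continue  # locale accepted, but the text did not parse
        return None
    finally:
        # Put the numeric locale back, whatever happened above.
        locale.setlocale(locale.LC_NUMERIC, saved)


# Candidate names are assumptions; adjust them for your platform.
value = atof_with_numeric_locale(
    "1,234", ["en_US.UTF-8", "English_United States.1252"]
)
print(value)
```

Bear in mind that setlocale changes process-wide state, so this is not 
safe to call from several threads at once.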

Unless I've missed something, the locale module doesn't support ignoring 
currency symbols when parsing numbers, so you still can't handle 
"-$12,345.00" even if you do set the numeric and monetary locales.
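
One way around that is to strip the currency symbol yourself before 
converting. The following is a minimal sketch that sidesteps the locale 
module entirely; parse_currency and its thousands_sep and decimal_point 
parameters are hypothetical names of my own, and you have to tell it which 
separators the source data uses.

```python
import re


def parse_currency(text, thousands_sep=",", decimal_point="."):
    """Convert a formatted amount such as "-$12,345.00" to a float.

    Hypothetical helper: the separators are passed in explicitly rather
    than taken from the current locale.
    """
    # Drop grouping separators first, then normalise the decimal point.
    cleaned = text.replace(thousands_sep, "")
    cleaned = cleaned.replace(decimal_point, ".")
    # Discard anything that is not a digit, a dot or a minus sign,
    # which removes currency symbols, codes and stray whitespace.
    cleaned = re.sub(r"[^0-9.\-]", "", cleaned)
    return float(cleaned)  # raises ValueError if nothing sensible is left


print(parse_currency("-$12,345.00"))
print(parse_currency("1.234,56 EUR", thousands_sep=".", decimal_point=","))
```

For real money you would probably want decimal.Decimal(cleaned) instead of 
float, to avoid binary rounding.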

-- 
Duncan Booth                                             duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
-- 
http://mail.python.org/mailman/listinfo/python-list