List: python-list
Subject: Problem reading file with umlauts
From: "Claus Hausberger" <CHausberger () gmx ! de>
Date: 2009-07-07 13:59:49
Message-ID: 20090707135949.100950 () gmx ! net
Hello
I have a text file which is encoded in Latin1 (ISO-8859-1). I can't change that as I do not create those files myself.
I have to read those files and convert the umlauts like ö to entities like &ouml; as the text files should become HTML files.
I have this code:
#!/usr/bin/python
# -*- coding: latin1 -*-

import codecs

f = codecs.open('abc.txt', encoding='latin1')

for line in f:
    print line
    for c in line:
        if c == "ö":
            print "oe"
        else:
            print c
and I get this error message:
$ ./read.py
Abc
./read.py:11: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if c == "ö":
A
b
c
Traceback (most recent call last):
  File "./read.py", line 9, in <module>
    print line
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
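Both messages above look like the result of mixing Unicode text with byte strings on Python 2: codecs.open yields unicode characters, while "ö" in a latin1-encoded source file is a byte string, and print has to encode each line for a terminal that apparently reports ASCII. Here is a minimal sketch of one way around both, assuming UTF-8 output is acceptable. The names O_UMLAUT, replace_o_umlaut and convert_file are hypothetical and not part of the original script.

```python
# -*- coding: utf-8 -*-
import codecs
import sys

# Unicode literal for o-umlaut, so the comparison is text against text
O_UMLAUT = u"\xf6"


def replace_o_umlaut(text):
    """Return text with every o-umlaut replaced by 'oe'."""
    return u"".join(u"oe" if c == O_UMLAUT else c for c in text)


def convert_file(path, out=None):
    """Read a Latin-1 file and write the converted lines as UTF-8 bytes."""
    if out is None:
        # sys.stdout.buffer exists on Python 3; Python 2 stdout takes bytes
        out = getattr(sys.stdout, "buffer", sys.stdout)
    f = codecs.open(path, encoding="latin1")
    try:
        for line in f:
            # Encode explicitly instead of letting print fall back to ASCII
            out.write(replace_o_umlaut(line).encode("utf-8"))
    finally:
        f.close()
```

Calling convert_file('abc.txt') would then write the converted lines to standard output as UTF-8 bytes. Whether UTF-8 is the right output encoding depends on where the output ends up.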
I checked the web and tried several approaches, but those also gave me strange encoding errors. Has anyone ever done this before?
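For the conversion to HTML entities itself, one possible approach is to look up each non-ASCII character in the standard library's entity table. This is a sketch under assumptions: the function name to_html_entities is hypothetical, the input is assumed to be already decoded Unicode text, and the import fallback is meant to cover both Python 2 (htmlentitydefs) and Python 3 (html.entities).

```python
try:
    from html.entities import codepoint2name  # Python 3
except ImportError:
    from htmlentitydefs import codepoint2name  # Python 2


def to_html_entities(text):
    """Replace each non-ASCII character in text with an HTML entity."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif code in codepoint2name:
            out.append(u"&%s;" % codepoint2name[code])  # named, e.g. &ouml;
        else:
            out.append(u"&#%d;" % code)  # numeric fallback
    return u"".join(out)
```

The result contains only ASCII characters, so it can be printed or written without hitting the 'ascii' codec error. Note that this sketch leaves &, < and > untouched. If only numeric references are needed, text.encode('ascii', 'xmlcharrefreplace') is a shorter alternative.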
I am currently using Python 2.5 and may be able to use 2.6, but I cannot yet move to 3.1 as many libs we use don't yet work with Python 3.
Any help is more than welcome. This has been driving me crazy for two days now.
Best wishes
Claus