Online Conversion from SQL_ASCII to UTF8 in PostgreSQL

Scripts and source available here: sql_ascii_to_utf8 The Goal To be able to take a Postgres Database which is in SQL_ASCII encoding, and import it into a UTF8 encoded database. Requirements: Python3 (For RHEL/CentOS 7 yum install python34) python-nagiosplugin My pre-built RPMs or pip3 install nagiosplugin PyNagSystemD The Problem PostreSQL will generate errors like this if it encounters any non-UTF8 byte-sequences during a database restore: # pg_dump -Fc test_badchar | pg_restore -d test_badchar_utf8 pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 2839; 0 26852 TABLE DATA table101 postgres pg_restore: [archiver (db)] COPY failed for table "table101": ERROR: invalid byte sequence for encoding "UTF8": 0x91 CONTEXT: COPY table101, line 1 WARNING: errors ignored on restore: 1 And the corresponding data will be omitted from the database (in this case, the whole table, even the rows which did not have a problem): ...

May 23, 2016 ยท 5 min ยท 902 words ยท Sam McLeod