ListIt2 - import CSV with non-ascii characters

Cerulean
Forum Members
Posts: 172
Joined: Mon Nov 01, 2010 8:56 am

ListIt2 - import CSV with non-ascii characters

Post by Cerulean »

When importing items into ListIt2 via CSV, it seems that non-ascii characters become garbled. Is there a way to import items containing non-ascii characters successfully?
10010110
Translator
Posts: 224
Joined: Tue Jan 22, 2008 9:57 am

Re: ListIt2 - import CSV with non-ascii characters

Post by 10010110 »

I suppose this is related to the SQL collation. If you have UTF-8 you shouldn’t usually have any problems. Or what kinds of characters are these?
Cerulean
Forum Members
Posts: 172
Joined: Mon Nov 01, 2010 8:56 am

Re: ListIt2 - import CSV with non-ascii characters

Post by Cerulean »

Any non-ascii characters such as "true" quotes: “”‘’
I don't think it's a database issue because these characters can be entered via the normal ListIt edit screen, it just seems they can't be imported from CSV.
10010110
Translator
Posts: 224
Joined: Tue Jan 22, 2008 9:57 am

Re: ListIt2 - import CSV with non-ascii characters

Post by 10010110 »

Is the CSV file itself saved in UTF-8? How does it look? Can you provide some sample data?
Cerulean
Forum Members
Posts: 172
Joined: Mon Nov 01, 2010 8:56 am

Re: ListIt2 - import CSV with non-ascii characters

Post by Cerulean »

Yes, the CSV is UTF-8.
An example CSV is here: http://ge.tt/2zbEVGt1/v/0
Result after import is this:
[Image]
10010110
Translator
Posts: 224
Joined: Tue Jan 22, 2008 9:57 am

Re: ListIt2 - import CSV with non-ascii characters

Post by 10010110 »

OK, I opened your sample CSV in TextEdit (which is the Notepad equivalent for Mac) and the text in there showed up as

title,my_field
test item,This ësentenceí contains ìquotesî
Not knowing how to find out the encoding in TextEdit, I opened it in TextWrangler, which is a full-blown code editor, and voilà, it said the character encoding of the file was “Western (Mac OS Roman)” with Windows (CRLF) line endings. The text in that file looked the same as in TextEdit.

To confirm, I opened the file in yet another code editor, Coda. It displayed the quotes correctly but reported the character encoding as “Western (Windows Latin 1)”. So, whatever the character encoding is (probably Windows-1252), it is not UTF-8. Whatever program you are using to save that file initially, set it to UTF-8 and you should have no more problems with importing.
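
For anyone who wants to check a file without relying on an editor's guess, here is a minimal Python sketch. It assumes the file really is Windows-1252 (that is only my guess from the above), and the helper name `decode_csv_bytes` is made up for illustration. It reproduces the garbling seen in TextEdit and shows a strict UTF-8 check with a Windows-1252 fallback.

```python
# Hypothetical helper, not part of ListIt2 or CMSMS.
def decode_csv_bytes(data: bytes) -> tuple:
    """Return (text, encoding_used) for raw CSV bytes.

    Tries strict UTF-8 first ("utf-8-sig" also strips a BOM if present),
    then falls back to Windows-1252, which is an assumption about the source.
    """
    try:
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("cp1252"), "cp1252"


sample = "This ‘sentence’ contains “quotes”"

# Bytes as a Windows-1252 program would write them (curly quotes = 0x91-0x94).
raw = sample.encode("cp1252")

# Reading those same bytes as Mac OS Roman gives the text seen in TextEdit.
print(raw.decode("mac_roman"))

# The bytes are not valid UTF-8, so the helper falls back to cp1252.
text, used = decode_csv_bytes(raw)
print(used, text)

# Re-encode as real UTF-8 before attempting an import.
utf8_bytes = text.encode("utf-8")
```

If the first attempt succeeds, the file was already valid UTF-8 and the encoding is not what breaks the import.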
Cerulean
Forum Members
Posts: 172
Joined: Mon Nov 01, 2010 8:56 am

Re: ListIt2 - import CSV with non-ascii characters

Post by Cerulean »

Thanks for your help with this issue.

What you're saying sounds plausible, but I have tried...
- saving the CSV from Excel (this offers no encoding options, and is probably the source of the encoding issue you picked up, despite the resulting file showing the characters correctly in Notepad and Notepad++).
- saving the CSV from LibreOffice encoded in UTF-8
- saving the CSV from Notepad, encoded in UTF-8
- saving the CSV from Notepad++, encoded in UTF-8, and encoded in UTF-8 without BOM
...and none of these CSVs can be successfully imported into ListIt2 without garbled characters.

If you can provide a test file in UTF-8 I'd be happy to test it (I'm assuming you don't have an instance of ListIt2 handy to test yourself), but it looks to me like this is a genuine bug in ListIt2.
10010110
Translator
Posts: 224
Joined: Tue Jan 22, 2008 9:57 am

Re: ListIt2 - import CSV with non-ascii characters

Post by 10010110 »

OK, you’re right, it doesn’t work, even with UTF-8 encoded files. However, it does work if you encode the special characters as HTML entities (&ldquo; for “left double quote”, &rdquo; for “right double quote”, &lsquo;/&rsquo; for “left/right single quote”). Note that you then need to either enclose all field contents in straight quotes (encloser) or change the separator from the default semicolon to something that doesn’t occur in the fields (the semicolon is part of every HTML entity, so it could be mistaken for a field separator).
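
If there are many rows, the conversion could be scripted. Below is a rough Python sketch of the workaround, not anything ListIt2 provides: the function names `to_entities` and `rewrite_csv` are made up, and the delimiter is a parameter because I'm not certain which separator your import is configured with. It replaces every non-ASCII character with an HTML entity and encloses every field in straight quotes.

```python
import csv
import io
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with a named or numeric HTML entity.

    ASCII characters (including & and <) are left untouched.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # e.g. &ldquo;
        else:
            out.append(f"&#{cp};")  # numeric fallback
    return "".join(out)


def rewrite_csv(src: str, delimiter: str = ",") -> str:
    """Return the CSV text with entity-encoded cells and every field quoted."""
    reader = csv.reader(io.StringIO(src, newline=""), delimiter=delimiter)
    buf = io.StringIO(newline="")
    # QUOTE_ALL encloses every field, so semicolons inside entities are safe.
    writer = csv.writer(buf, delimiter=delimiter, quoting=csv.QUOTE_ALL)
    for row in reader:
        writer.writerow([to_entities(cell) for cell in row])
    return buf.getvalue()


source = "title,my_field\ntest item,This ‘sentence’ contains “quotes”\n"
print(rewrite_csv(source))
```

Because every field ends up enclosed in quotes, the semicolons inside the entities should not be read as separators, even if the semicolon stays the delimiter.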
Cerulean
Forum Members
Posts: 172
Joined: Mon Nov 01, 2010 8:56 am

Re: ListIt2 - import CSV with non-ascii characters

Post by Cerulean »

Thanks for the workaround suggestion.
I have filed a bug report for ListIt2.
chandra

Re: ListIt2 - import CSV with non-ascii characters

Post by chandra »

ListIt2 development has stopped. The successor/fork is named EasyList, but it only works with CMSMS 2.0.
Cerulean
Forum Members
Posts: 172
Joined: Mon Nov 01, 2010 8:56 am

Re: ListIt2 - import CSV with non-ascii characters

Post by Cerulean »

Yeah, I know :'( . But it's possible that someone might pick up development of the module in the future and so I think it's good to keep a record of bugs.
chandra

Re: ListIt2 - import CSV with non-ascii characters

Post by chandra »

You may also know that LI2 does not work with CMSMS 2.0.

The only way is to migrate to EasyList. This should be possible with a data export/import, as advised here:

http://dev.cmsmadesimple.org/feature_request/view/10198

So it's really important that the import/export function works without any trouble. It would be helpful if you tried this functionality with EasyList too and posted a bug report there if needed.