Unknown unicode character in EPG title crashes scheduled rec

Postby Viewfinder » Sat Mar 23, 2019 10:50 pm

Over the last month I have begun to notice a strange unicode character in some BBC programmes, appearing in either the EPG title or the description. If the character appears in the title of a programme that is scheduled to record, the record task attempts to start and then aborts because of the unknown character. If the character appears only in the description, the recording is fine. It seems the recording only fails if you use the %E EPG task name variable in your record filename.

The character looks very similar to the unicode box drawing character 251C ├, but thinner and shorter, and it is not 251D ┝ either. It is somewhere in between. I have tried to paste the unknown character into this post but it always appears as a strange square, probably because the browser doesn't know what it is. Will attach samples of the character as a text file.

Have searched around to see if anyone else has seen this but can't find any mentions. I have tried a different DVB app, TBSViewer, which also shows the same character, so I suspect it's something the BBC has introduced. Possibly some kind of protection?

The character seems to appear where an apostrophe should be, but not always. Some programmes contain apostrophes that show fine, so I'm not sure why some apostrophes are fine and others show as an unknown character.

The only workaround so far is to schedule the recording to start much earlier than the programme you want, so that it adopts the previous EPG event name...

Can't seem to upload a txt file attachment as extension txt is not allowed on this forum?
Viewfinder
 
Posts: 9
Joined: Fri Mar 06, 2015 6:05 pm

Re: Unknown unicode character in EPG title crashes scheduled rec

Postby SmartDVB » Sun Mar 24, 2019 9:47 am

Sounds strange. Are those BBC channels using the Freeview EPG (28.8E)? Perhaps that character could just be sifted out, if it's not some other unicode conversion bug (I use UTF-8 mostly).

Can't seem to upload a txt file attachment as extension txt is not allowed on this forum?


I've modified the board settings to allow .txt uploads, so you can try again.
SmartDVB
Site Admin
 
Posts: 613
Joined: Sun Feb 01, 2009 5:18 am

Re: Unknown unicode character in EPG title crashes scheduled rec

Postby Viewfinder » Fri Mar 29, 2019 11:02 pm

Thanks. Not sure if it is 28.8E, as I'm using Free To Air (Freeview) TV and not satellite.

I did a search for that odd character in EPG Search and it brought up many matching programmes. At first I thought it was BBC only, but it seems other channels are affected too: it also appeared in EPG titles or descriptions on Sony Movie Channel, PBS America and Sports Channel Network, just to name a few...

Thanks for updating the forum to allow txt files. Have attached a Unicode txt file containing the character in some titles and descriptions. Hope it shows up in the file.

I can't find an exact match in ascii / extended code tables so not sure where it is coming from...
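One way to pin down a character that will not paste or display is to print its code point rather than rely on how it is rendered. The snippet below is only a sketch: `describe_chars` is a hypothetical helper name, and the sample title is made up, with an arbitrary control character standing in for the unknown one.

```python
import unicodedata

def describe_chars(text):
    """List (code point, Unicode category, name) for every character
    outside the printable ASCII range 0x20-0x7E."""
    report = []
    for ch in text:
        if not (0x20 <= ord(ch) <= 0x7E):
            # Control characters have no Unicode name, so supply a default.
            name = unicodedata.name(ch, "<no name>")
            report.append((f"U+{ord(ch):04X}", unicodedata.category(ch), name))
    return report

# Hypothetical EPG title; \x19 is just a placeholder for the odd character.
sample_title = "Programme\x19s Title"
for entry in describe_chars(sample_title):
    print(entry)
```

A category of `Cc` in the output would mean the character is a control character rather than a printable symbol, which would explain why it has no match in a table of printable glyphs.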

Will try and find an upcoming event with it in and see what the logfile contains when it aborts.

Thanks.
Attachments
Strange unicode characters in EPG.txt
(3.66 KiB) Downloaded 51 times
Viewfinder
 
Posts: 9
Joined: Fri Mar 06, 2015 6:05 pm

Re: Unknown unicode character in EPG title crashes scheduled rec

Postby Viewfinder » Fri Mar 29, 2019 11:27 pm

EPG Search screenshot attached...

I've scheduled that programme on Sony Movie Channel tomorrow so will see what logfile reveals when it aborts due to event name anomaly.
Attachments
EPG Strange Character in title - Copy.png
Viewfinder
 
Posts: 9
Joined: Fri Mar 06, 2015 6:05 pm

Re: Unknown unicode character in EPG title crashes scheduled rec

Postby Viewfinder » Sat Mar 30, 2019 2:29 pm

Logfile extract from attempted schedule record / fail test of that Sony Movie channel programme A Daughters Conviction... Scheduler has 5 min pre padding so log file is from 9.05.

I thought there might have been a 0k recording file but no file was left behind in recording folder.
Attachments
SmartDVB - Sony Movie Failed.txt
(16.37 KiB) Downloaded 46 times
Viewfinder
 
Posts: 9
Joined: Fri Mar 06, 2015 6:05 pm

Re: Unknown unicode character in EPG title crashes scheduled rec

Postby SmartDVB » Tue Apr 02, 2019 10:16 pm

Logfile extract from attempted schedule record / fail test of that Sony Movie channel programme A Daughters Conviction... Scheduler has 5 min pre padding so log file is from 9.05.

I thought there might have been a 0k recording file but no file was left behind in recording folder.


Thanks for this. I managed to find the culprit. It seems to be an EOM (End of Medium) control character, 0x19 in UTF-8. What it's doing there I don't know, so for now I'll just sift the character out of the recording name when recording, as the API fopen calls seem to fail with this EOM char (can't really see it documented in the APIs though).
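A minimal sketch of that kind of filtering, written in Python for illustration and not taken from SmartDVB's own code: it drops every control character from the part of the filename that comes from the EPG title. The function name `sanitize_filename_part` and the sample title are assumptions. For what it's worth, the Windows file naming rules list characters with values 1 through 31 as not allowed in file names, which would be consistent with fopen failing on 0x19 if the recording runs on Windows.

```python
def sanitize_filename_part(name, replacement=""):
    """Replace C0 control characters (0x00-0x1F) and DEL (0x7F)
    in a filename fragment with `replacement` (default: drop them)."""
    return "".join(
        replacement if (ord(ch) < 0x20 or ord(ch) == 0x7F) else ch
        for ch in name
    )

# Hypothetical event title with 0x19 where the apostrophe should be.
event_title = "A Daughter\x19s Conviction"
print(sanitize_filename_part(event_title))  # A Daughters Conviction
```

Filtering the whole control range rather than just 0x19 also covers other stray control bytes that might turn up in titles later.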
SmartDVB
Site Admin
 
Posts: 613
Joined: Sun Feb 01, 2009 5:18 am

Re: Unknown unicode character in EPG title crashes scheduled rec

Postby Viewfinder » Wed Apr 03, 2019 11:55 am

Great, glad you found the cause, thank you.

Looking at more EPG event listings, it's also a strange coincidence how it only appears where an apostrophe / end quote would be, but in some other events, the apostrophe appears fine.

In the attached, where a description would ordinarily use 'quotes' (not sure if it should be single ' or double "), sometimes the open quote is an up arrow character 0x18 and the close quote is that EOM character 0x19.
Then just to confuse us, other quotes appear fine!

For example: The Hangman of Lyon has the uparrow character 0x18 as the open quote and the EOM character 0x19 as the close quote. Then another event quotes 'diamond-horned sea-unicorn' perfectly fine with single apostrophes. Bizarre. It's not an issue if they only appear in the EPG event description and the event title is not affected. At the moment I've not seen the uparrow character in event titles, but perhaps it might be another character to sift out just in case?

As always if you need me to test a fix, let me know. Thanks again.

Edit: According to StackOverflow, curly quotes and apostrophes may be the cause. See also Converting curly quotes:

"You can get those values if you copy and paste a Microsoft Word document with smart quotes turned on... These "smart" single quote and "smart" double characters are being stored as hex 18, 19, 1C, 1D"

I wonder if someone behind the scenes at Freeview/FTA has recently started to copy and paste curly quotes into the DVB-T EPG... :?
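The values 18, 19, 1C and 1D line up with the low bytes of the Unicode curly quotes U+2018, U+2019, U+201C and U+201D, so one plausible reading is that the high byte is being lost somewhere along the way. Under that assumption (it is a guess, not something confirmed in this thread), the stray bytes could be mapped back to quotes instead of being dropped. The names below are hypothetical and the sketch is in Python, not SmartDVB's own code.

```python
# Assumed mapping: each key is the low byte of the curly quote it maps to.
STRIPPED_QUOTES = {
    0x18: "\u2018",  # left single quotation mark
    0x19: "\u2019",  # right single quotation mark / apostrophe
    0x1C: "\u201C",  # left double quotation mark
    0x1D: "\u201D",  # right double quotation mark
}

# Plain ASCII equivalents, safer for filenames.
ASCII_QUOTES = {0x18: "'", 0x19: "'", 0x1C: '"', 0x1D: '"'}

def restore_curly_quotes(text):
    """Turn the stray control bytes back into curly quotes (for display)."""
    return text.translate(STRIPPED_QUOTES)

def to_ascii_quotes(text):
    """Turn the stray control bytes into straight quotes."""
    return text.translate(ASCII_QUOTES)

# Hypothetical title with 0x19 in place of the apostrophe.
print(to_ascii_quotes("A Daughter\x19s Conviction"))  # A Daughter's Conviction
```

Note that a double quote is itself not allowed in Windows filenames, so for the recording name it may still be simpler to drop these characters than to convert them.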
Attachments
EPG UpArrow EOM apostrophe events.txt
(700 Bytes) Downloaded 38 times
Viewfinder
 
Posts: 9
Joined: Fri Mar 06, 2015 6:05 pm

