Channel: SAPIEN Forums

PowerShell Studio • Re: Encoding

Unfortunately this does not state any reason for that recommendation. This is from 2019, so five years old. I somewhat object to the notion that one encoding is more 'correct' than another one. Who's the judge? There are certainly cases where one or another may not work, but nothing is listed here. As far as PowerShell Studio is concerned, it should make no difference.

Here are a few pointers about encoding (also for anyone else stumbling across this). These are my opinions based on experience. Use what you need to use for your case.

UTF-8 BOM versus UTF-8 without BOM: The BOM (Byte Order Mark) makes it easy for any application reading or rewriting the file to determine which encoding you want.
If you use different applications to edit, run or modify such a file, their defaults should not matter; each one is supposed to respect the encoding designated by the BOM.
If you use UTF-8, I always recommend using the version with BOM. I have rarely found a case where this became an issue. Your mileage may of course vary.
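
To illustrate what "respecting the BOM" means in practice, here is a minimal Python sketch of how a reader might check the first bytes of a file. The `sniff_bom` helper and the `BOMS` table are hypothetical names for this example, not anything PowerShell Studio exposes, and UTF-32 is left out for brevity.

```python
import codecs

# Known BOM signatures mapped to Python codec names (illustrative table,
# UTF-32 deliberately omitted to keep the sketch short).
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data: bytes):
    """Return (codec name, BOM length), or (None, 0) if no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0

script = "Write-Host 'Grüße'"
with_bom = script.encode("utf-8-sig")   # this codec prepends EF BB BF
without_bom = script.encode("utf-8")    # same text, no signature

print(sniff_bom(with_bom))     # ('utf-8-sig', 3)
print(sniff_bom(without_bom))  # (None, 0)
```

With the BOM present the reader has nothing to guess; without it, the function can only report that it does not know.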

Without a BOM, it becomes guesswork. A standard English-language text file without any special characters is technically both a UTF-8 file and a Windows 1252 file at the same time, because both encodings store those characters as the same bytes. Upon loading a file, an editor (or any application) must examine the bytes until it finds a character outside that common range that is (hopefully) correctly encoded. If there is none, well, then it becomes a matter of opinion, so to speak. As you may imagine, the detection process can be flawed in some circumstances. Who's to know what you want it to be?
If you go without a BOM, you will need to make sure that any application you touch this file with is set to use UTF-8 by default. And don't let anyone else touch it.
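
The guesswork can be shown with a few lines of Python. This is only a sketch of the general problem, not of how any particular editor detects encodings; `looks_like_utf8` is a hypothetical helper for this example.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Naive check: does the byte sequence decode cleanly as UTF-8?"""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Plain English text: the bytes are identical in both encodings,
# so nothing in the file tells a reader which one was intended.
plain = "Get-ChildItem -Path C:\\Temp"
same_bytes = plain.encode("utf-8") == plain.encode("cp1252")

# Text with special characters: the two encodings now differ.
text = "Größe"
utf8_bytes = text.encode("utf-8")
cp1252_bytes = text.encode("cp1252")

# A reader that assumes Windows 1252 silently produces garbage
# instead of failing, which is why a wrong guess can go unnoticed.
misread = utf8_bytes.decode("cp1252")

print(same_bytes)                     # True
print(looks_like_utf8(cp1252_bytes))  # False
print(misread == text)                # False
```

Note that the wrong guess in one direction raises an error, while in the other direction it quietly corrupts the text.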

UTF-16 LE is also referred to as 'Unicode Little Endian'. Each character takes 16 bits whether it needs the extra room or not; only characters outside the Basic Multilingual Plane (most emoji, for example) take two 16-bit units. It has a BOM by default (there is no option to skip it) and I have generally found it to work with everything. If you routinely mix different languages and symbols in your files, I always recommend using this. It is the default encoding I use for ALL files I create, and I have not run into problems (so far). It does make a code file twice as large as its Windows 1252 counterpart, but who cares? Code files are not that large and disk space is usually not a problem anymore. Also of note, when you package your files for PowerShell 7 using our Script Packager, everything is always converted to UTF-16 LE by default.
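
The size difference is easy to check. The following Python sketch uses a made-up one-line script as sample text and writes the BOM explicitly, since Python's `utf-16-le` codec does not add one on its own.

```python
import codecs

text = "Write-Host 'Hello'"

cp1252_bytes = text.encode("cp1252")
# Prepend the little-endian BOM (FF FE) by hand; the "utf-16-le"
# codec itself emits only the character data.
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Every character of this sample takes one byte in Windows 1252
# and two bytes in UTF-16 LE, plus two bytes for the BOM.
print(len(cp1252_bytes), len(utf16_bytes))

# A character outside the Basic Multilingual Plane needs a
# surrogate pair, i.e. two 16-bit units (four bytes).
emoji_units = len("\U0001F600".encode("utf-16-le")) // 2
print(emoji_units)  # 2
```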

UTF-16 BE (Big Endian) is a remnant from the time when Apple used processors (Motorola) which stored the two 8-bit bytes of a 16-bit character in a different order than Intel processors.
I will not bore you with listing ye olde mainframe systems also using this byte order :D
If you want to geek out on this, please see here: https://en.wikipedia.org/wiki/Endianness
It is mostly unused and should be avoided nowadays. Of note, however, is that the first version of the PowerShell ISE encoded new files as UTF-16 BE by default, which caused a lot of grief as not every editor supported that at the time. So if you download some really old PowerShell samples, you might encounter it.
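
For anyone who wants to see the byte order difference rather than read about it, here is a small Python sketch. It only demonstrates the encoding itself and makes no claim about how the ISE or any editor handles such files.

```python
import codecs

ch = "A"  # U+0041

le = ch.encode("utf-16-le")  # low byte first:  41 00
be = ch.encode("utf-16-be")  # high byte first: 00 41

print(le.hex(), be.hex())  # 4100 0041

# With a BOM in front, Python's generic "utf-16" decoder picks
# the right byte order on its own.
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")

# Reading big-endian bytes as little-endian does not fail; it
# yields a completely different character (U+4100 here).
wrong = be.decode("utf-16-le")
print(hex(ord(wrong)))  # 0x4100
```

This is the kind of grief an editor without UTF-16 BE support would cause: the file loads, but every character is wrong.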

Hope this helps.

Statistics: Posted by Alexander Riedel — Mon Jul 15, 2024 5:29 pm


