How do these programs distinguish between "text" and "binary" files?

Before we answer this question, let us first try to come up with a definition. Clearly, on a fundamental file-system level, every file is just a collection of bytes and could therefore be viewed as binary data. On the other hand, a distinction between "text" and "non-text" (hereafter: "binary") data seems helpful for programs like grep or diff, if only not to mess up the output of your terminal emulator.

So maybe we can start by defining "text" data. It seems reasonable to begin with an abstract notion of text as being a sequence of Unicode code points. Examples of code points are characters like k, ä or א, as well as special symbols like %, ☢ or □. To store a given text as a sequence of bytes, we need to choose an encoding.
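To make the step from code points to bytes concrete, here is a minimal Python sketch. The sample string and the two encodings (UTF-8 and little-endian UTF-16) are chosen only for illustration; nothing here reflects how grep or diff work internally.

```python
# Sample text: "k", "ä" (U+00E4) and "☢" (U+2622), written with escapes
# so the code points are unambiguous.
text = "k\u00e4\u2622"

# The abstract view: one Unicode code point per character.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# The stored view: the same code points become different byte
# sequences depending on the encoding we choose.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

print(code_points)        # ['U+006B', 'U+00E4', 'U+2622']
print(utf8_bytes.hex())   # 6bc3a4e298a2
print(utf16_bytes.hex())  # 6b00e4002226
```

The three code points take six bytes in either encoding here, but the byte values differ, so a reader of the file has to know (or guess) the encoding to recover the original text.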