1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
|
# 27
* Add missing newlines and end-of-files
# 26
* Update comments
* Fix u8c_decode_utf8_length not validating
* Add attribute u8c_DEPRECATED
* Deprecate u8c_encode_utf16 and u8c_encode_utf16_length as they're untested (this is not permanent)
# 25
* Rename source directory: src => source
* Make changelog plain-text
* Update CMake style
* Rewrite in C99
* Use separate CMake lists
* Update copyright and license notices
* Use header identifiers instead of keys for include guards
* Use ifdef/ifndef
* Remove top 'u8c' header (keep uninm)
* License under the LGPL
* Bump required CMake version
* Rename misc header to u8c
* Remove assert
* Remove attributes:
- u8c_attr_abitag
- u8c_attr_allocsz
- u8c_attr_artif
- u8c_attr_cold
- u8c_attr_fmt
- u8c_attr_malloc
- u8c_attr_hot
- u8c_attr_pure
- u8c_attr_retnonnull
- u8c_attr_sect
- u8c_attr_used
- u8c_attr_noderef
- u8c_attr_noesc
- u8c_attr_dup
* Remove type constant macros
* Remove our types
* Remove endian-related facilities
* Remove memory functions
* Remove bytesz and dbg
* Fix version number
* Remove fmt header
* Add new header 'format'
* Add new header 'character'
* Replace utf header with new 'format' and 'character' headers
* Remove math header
* Remove impl header
* Remove cstr header
* Remove arr header
* Make functions non-constexpr
* Update naming convention
* Implement UTF-16 conversions
* Split cnv into multiple functions:
- encode_utf8 (UTF-32 to UTF-8)
- decode_utf8 (UTF-8 to UTF-32)
- encode_utf16 (UTF-32 to UTF-16)
- decode_utf16 (UTF-16 to UTF-32)
* Use caller-provided buffer in conversion functions
* Rename u8c::isupper to u8c_is_majuscule
* Rename u8c::islower to u8c_is_minuscule
* Update code style
* Change type of version constant (now uint_least32_t)
* Use Git tagging for versioning
* Don't throw exceptions
* Update warning flags
* Update optimisation flags
* Rename u8c::unimax to u8c_MAX_CODE_POINT
* Remove u8c::uniblk
* Clean up code
* Don't define functions in headers
* Rename u8c::isspace to u8c_is_whitespace
* Add more characters to u8c_is_whitespace
* Rename u8c::ispunct to u8c_is_punctuation
* Add more characters to u8c_is_punctuation
* Remove u8c::isalnum
* Rename u8c::uninm to u8c_unicode_name
* Use caller-provided buffer in u8c_unicode_name
* Add constant for the maximum length of a Unicode identifier: u8c_MAXIMUM_NAME_LENGTH
* Add functions for determening the length of encodings and decodings:
- u8c_decode_utf8_length
- u8c_encode_utf8_length
- u8c_encode_utf16_length
- u8c_decode_utf16_length
* Rename u8c_attr_const to u8c_UNSEQUENCED
* Rename u8c_attr_inline to u8c_ALWAYS_INLINE
* Add new attribute u8c_NO_DISCARD
* Validate encodings
* Rework readme
* Rename u8c::isdigit and u8x::isxdigit to is_numeric and is_hexadecimal_numeric
* Rename u8c::isalpha to u8c_is_alphabetic
* Rename u8c::iscntrl to u8c_is_control
* Rename u8c::issurro to u8c_is_surrogate
* Optimise code
* Update gitignore
# 24
* Remove constructor taking a single value for `u8c::arr`.
* Add new overload taking a single value for `u8c::arr::app`.
* Add function `u8c::arr::log` to enable logging of array operations *(doesn't currently log a lot)*.
* Remove `u8c::trunc`.
* Initialise memory allocated by `u8c::arr`.
* Add overload taking value used for memory-initialisation for `u8c::arr::alloc`.
# 23
* Rewrite for C++ *(read readme for list of current features)*.
* Use CMake for building.
* Update logo.
# 22
* Remove documentation (too hight-maintainence).
* Rename `include/u8c/is.h` to `include/u8c/chk.h`
* Revert u8c-9 “Remove some optimisations, as they prevent C++ compatibility.”.
* Fix #1.
* Use binary literals for bitwise operations.
* Add more control characters to `u8c_iscntrl`.
* Change type of result of the `u8c_is`* functions fromt `uint_least8_t` to `bool`.
* Add more characters to `u8c_ispunct`.
* Update Makefile.
* Revert u8c-21 “Rename `u8c_unimax` to `u8c_u32max` and move it to `u8c/u32.h`.”.
* Add function for checking if a character is a surrogate point; `u8c_issurro`.
* Split `u8c_isalpha` into `u8c_islower` and `u8c_isupper`, move current mapping to `u8c_islower`. All characters that are neither upper case or lower case must be put in `u8c_isalpha`.
* Add function for getting the name of an Unicode codepoint; `u8c_uninm` (has currently only mapped around ⅔% of all Unicode characters).
* Revert accidental changes to changelog (please be more careful and observant in the future).
* Delete `u8c_errtyp_maxerrtyp` (in favour of `u8c_errtyp_all`).
* Switch the arguments of `u8c_seterr`.
* Add function for getting the name of the block an Unicode codepoint is located in; `u8c_uniblk` (has currently only mapped around 39% of the Unicode blocks).
* Rename all instances of `u32` to `str`.
* Optimise for size (`-Os` instead of `-O3`).
* Update Readme.
* **MAJOR**: Return a tuple (structures) in all returning functions, otherwise void.
* Add help screen to test program.
* Update Gitignore.
* Restructure test program.
* Add more characters to `u8c_islower`.
* Add more characters to `u8c_isupper`.
* Remove the `runtest` target (just use `make && export LD_LIBRARY_PATH=$PWD && ./test`, which can more easily be modified to pass arguments).
* Add more characters to `u8c_isalpha`.
* Fix incorrect error being set (somewhere, I forgot where).
* Fix `SIZE_C`.
# 21
* Update readme.
* Require C23 (C2x).
* Use GCC (has better C23 support).
* Cleanup UTF-8 decoder and encoder (using binary literals).
* Rename `u8c_unimax` to `u8c_u32max` and move it to `u8c/u32.h`.
* Don't clear last error message on calls to `u8c_geterr`.
* Restructure source files.
* Fix makefile.
* Update test program.
# 20
* Update documentation.
* Optimise `u8c_println`,
* Make `u8c_print`, `u8c_println`, and `u8c_vprint` thread-safe (if thread-safe is not disabled).
* Create base for UTF-16 related functions:
* Add function for allocating UTF-16 strings; `u8c_u16alloc`.
* Add function for deallocating UTF-16 strings; `u8c_u16free`.
* Restructure headers:
* `u8c/SIZE_C.h`:
* `SIZE_C`
* `u8c/err.h`:
* `u8c_errtyp`
* `u8c_errhandltyp`
* `u8c_geterr`
* `u8c_regerrhandl`
* `u8c_seterr`
* `u8c/fmt`:
* `u8c_dbgprint`
* `u8c_fmt`
* `u8c_fmttyp`
* `u8c_print`
* `u8c_println`
* `u8c_setfmt`
* `u8c_vfmt`
* `u8c_vprint`
* `u8c/is.h`:
* `u8c_isalnum`
* `u8c_isalpha`
* `u8c_iscntrl`
* `u8c_isdigit`
* `u8c_ispunct`
* `u8c_isspace`
* `u8c_isxdigit`
* `u8c/main.h`:
* `u8c_dbg`
* `u8c_abrt`
* `u8c_end`
* `u8c_init`
* `u8c_thrdsafe`
* `u8c_unimax`
* `u8c_ver`
* `u8c/u16.h`:
* `u8c_u16alloc`
* `u8c_u16free`
* `u8c/u32.h`:
* `u8c_u32alloc`
* `u8c_u32cat`
* `u8c_u32cmp`
* `u8c_u32cp`
* `u8c_u32fndchr`
* `u8c_u32fndpat`
* `u8c_u32free`
* `u8c_u32ins`
* `u8c_u32substr`
* `u8c_u32sz`
* `u8c/u8.h`:
* `u8c_u8alloc`
* `u8c_u8dec`
* `u8c_u8enc`
* `u8c_u8free`
* Disable *-Wpadded*.
* Update `SIZE_C`.
* Always use character constants (instead of numerical values).
# 1↋
* Add more codepoints to `u8c_ispunct`.
* Don't use Zstandard for man page compression. Use Gzip.
* Remove Zstandard as a dependency.
* Update gitignore.
# 1↊
* Initialise error handler array.
* Initialise and destroy error handler array mutex.
* Fix Makefile.
* Update gitignore.
# 19
* Fix error when compiling with GCC: *src/u8c/dat.c:22:29: error: initializer element is not constant [-Wpedantic]*.
* Improve error handling:
* Add enumeration type for error types; `u8c_errtyp`.
* Add function for registering error handlers; `u8c_regerrhandl` (see `test.c`).
* Add function for inserting UTF-32 strings into UTF-32 strings; `u8c_u32ins`.
* Enable more warnings.
* Add man pages.
* Fix `u8c_u32cat` skipping the last character in `lstr`.
* Remove the `uninstall` target (it was deemed to unsafe).
* Add *Zstandard* as a dependency.
# 18
* Update `.gitignore`.
* Remove core dump.
# 17
* Create new logo.
* Update headers.
* Invert status values.
# 16
* Add function for concatenating UTF-32 strings; `u8c_u32cat`.
* Add functions for allocating UTF-32 and UTF-8 strings; `u8c_u32alloc` and `u8c_u8alloc`.
* Add function for finding a given codepoint in an UTF-32 string; `u8c_u32fndchr`.
* Add function for finding a given pattern (string) in an UTF-32 string; `u8c_u32fndpat`.
* Update `SIZE_C`.
* Add function for aborting; `u8c_abrt`.
* Rename `u8c_debug` to `u8c_dbg`.
* Use `bool` (`_Bool`) for return values instead of `uint_least8_t`.
* Add more format types.
* Fix incorrect unabbreviated names in headers.
* Add another function from `ctypes.h`; `u8c_isxdigit`.
* Use `char32_t` (from `uchar.h`) instead of `uint_least32_t` in UTF-32.
* Use `unsigned char` instead of `uint_least8_t` in UTF-8.
* Move all data into `u8c_dat` (of type `struct u8c_dattyp`).
* Add function for setting the format (base and endian) of `u8c_fmt` and company; `u8c_setfmt`.
* Remove `u8c_txt` in favour of Unicode string literals (much clearer code, but less portable).
* Add function for getting a sub-string of an UTF-32 string; `u8c_u32substr`.
* Don't count the null-terminator in string sizes.
* Add macro for maximum valid Unicode codepoint; `u8c_unimax`.
* Remove `txttolit`.
* Add function for deallocating UTF-8 strings; `u8c_u8free`.
* Turn both `u8c_dbg` and `u8c_thrdsafe` into type `bool` from `uint_least8_t`.
* Add more letters to `u8c_isalpha`.
# 15
* Add missing include directives to `include/u8c/u32free.h` and `include/u8c/u8free.h`.
# 14
* Free error message when `u8c_geterr` is called (after copying, of course).
* Update `u8c_freeu8`.
* Rename `u8c_freeu32` to and `u8c_u32free` and `u8c_freeu8` to `u8c_u8free`.
# 13
* Fix `u8c_txt` in C++.
# 12
* Use `u8c_println` instead of `u8c_print` in `u8c_dbgprint`.
# 11
* Update README.
* Update Makefile.
* Use constant variables more.
* Create macro for creating human-readable UTF-32 strings; `u8c_txt`.
* Add macros for deallocating UTF-32 and UTF-8 strings (use these instead of `free` or `std::free`); `u8c_u32free` and `u8c_u8free`.
* Optimisations.
# 10
* Make `u8c_seterr` public.
* Update logo.
# ↋
* Fix `u8c_ver`.
* Add Turkish letters to `u8c_isalpha`.
# ↊
* Add UTF-32 alternatives to some of the functions from `ctypes.h`; `u8c_isalnum`, `u8c_isalpha`, `u8c_iscntrl`, `u8c_isdigit`, `u8c_isspace`, and `u8c_ispunct`.
* Add assertions.
* Create new logo (move old logo to `u8c_old.svg`).
# 9
* Optimisations.
* Remove some optimisations, as they prevent C++ compatibility.
* Fix memory leak in test program.
* Add function for comparing UTF-32 strings; `u8c_u32cmp`.
# 8
* Optimisations.
# 7
* Optimisations.
# 6
* Remove unneeded include directives.
* Update `SIZE_C` to utilise the C++23 `std::size_t` literal suffix (`uz`).
* Fix include guard in `include/u8c/stat.h`.
* Add more error messages.
# 5
* Add logo (`u8c.svg`).
* Fix UTF-8 decoder.
* Update README.
# 4
* Add link to PKGBUILD in README.
# 3
* Remove `include/u8c.h`.
* Fix minor errors.
# 2
* Move PKGBUILD to seperate project.
* Create a *master* header that includes every other header.
* Create `size_t` constant macro.
* Add colour to text formatting.
* Specify what file to print to in `u8c_print`.
* Add C++ compatibility to headers.
* Add functions for initialising and ending *u8c*.
* Add dedicated function for text formatting (altough not working at the moment); `u8c_fmt`.
* Remove `u8c_free`.
* Improve error handling in the form of error messages (which can be retrieved by the programmer via `u8c_geterr`).
* Split `u8c_print` into two functions: `u8c_print` and `u8c_println`.
* Break the UTF-8 decoder somehow.
* Add compile-time option to be thread-safe (requiring C11 threads if enabled).
* Add function for copying UTF-32; `u8c_u32cp`.
* Add function for getting the size of an UTF-32 array; `u8c_u32sz`.
* Add `va_list` alternatives to `u8c_fmt` and `u8c_print`; `u8c_vfmt` and `u8c_vprint`.
* Add test-program (run via `make runtest`).
* Add program to make human-readable UTF-32 strings machine-readable.
* Turn `u8c_ver` into a compile-time macro.
* Enable more warnings when compiling.
* Add assertions.
# 1
* Fix Makefile compiling *u8c.so* instead of *libu8c.so*.
* Add global variable for version control.
# 0
* Fork from *luma*.
|