Support literals encoding conversions according to the literal type. #162

Open
wants to merge 189 commits into
base: gcos4gnucobol-3.x
189 commits
ebc0e65
GIT-specific settings, with CI setup and github workflow for Ubuntu, …
ddeclerck Feb 4, 2022
dfb449a
[GCOS] Add GCOS configuration file
ddeclerck Feb 4, 2022
50aedb0
Redispatch ChangeLog entries
nberth Jan 28, 2022
dd0e532
Thanks OCamlPro contributors
nberth Mar 29, 2022
acc6b97
Improve setup for CI jobs, with temporary focus on branch `gcos4gnuco…
nberth May 19, 2022
05bdc7b
Use github actions to emit a distribution archive and test logs
nberth May 20, 2022
270a5c6
Use ubuntu CI to measure coverage
nberth Jul 7, 2022
b183d9f
Adjust macos & windows CIs
nberth Jul 7, 2022
45befa5
Disable automated windows CI workflow
nberth Jul 8, 2022
0e2b01c
Fix handling of quotes in testsuite artifact name
nberth Jul 20, 2022
85251d9
Fix ubuntu CI
nberth Sep 30, 2022
ecbb7d7
Add working Github action files, except for Windows (#79)
lefessan Jan 23, 2023
54e2a7a
CI: check for c89 declaration (#97)
GitMensch Jun 13, 2023
f36a5c2
Add/Update Windows workflows
ddeclerck Mar 18, 2024
9163932
Update MSVC & MSYS1 CI
ddeclerck Jul 31, 2024
6019858
Upgrade versions of github actions used in CI
nberth Aug 20, 2024
4bfd3d5
Adjust MSVC workflow now that error popups are disabled by default
nberth Aug 23, 2024
64a6827
Further CI adjustments
nberth Aug 21, 2024
7688453
Enforce a 45 minutes timeout on Windows CI workflows
nberth Aug 26, 2024
f314e26
Stop using `-pedantic` flag in Coverage and Warnings workflow
nberth Aug 27, 2024
a0cc8e4
Various improvements in Warnings and Coverage workflow
nberth Aug 27, 2024
964d211
Cache `newcob.val` instead of an archive
nberth Aug 28, 2024
a5e2c4d
Run testsuite even on Debug target in Windows MSVC CI
nberth Aug 28, 2024
ad2d18d
Cache `newcob.val` in Windows MSYS2 workflow as well
nberth Aug 28, 2024
5ebda12
Update Windows workflows (upload testsuite.log on failure)
ddeclerck Sep 22, 2024
a23df52
CI: adding minimal build
GitMensch Sep 27, 2024
554e989
install cobc
AhmedMaher309 Jun 26, 2024
9dd561d
update tree.c/build_literal
AhmedMaher309 Jun 26, 2024
240441b
refactor the build_literal function
AhmedMaher309 Jun 26, 2024
4906c62
Delete .local directory
AhmedMaher309 Jun 26, 2024
3a3516b
remove configs
AhmedMaher309 Jun 26, 2024
066c355
added switch cases on category to build_literal fucntion
AhmedMaher309 Jun 27, 2024
310100e
Handling the case of N'text' in National literals, but not space padd…
AhmedMaher309 Jun 30, 2024
ca01844
handling N'text' case in National literals but no space padding conve…
AhmedMaher309 Jun 30, 2024
adb5437
added the parts of handling the padding in the literals
AhmedMaher309 Jul 2, 2024
b73d8d3
National literals space padding is now handled
AhmedMaher309 Jul 2, 2024
3b777ff
removed the build directory
AhmedMaher309 Jul 3, 2024
d846988
added a debug line to build_literal function
AhmedMaher309 Jul 4, 2024
9d08bca
Revert "added a debug line to build_literal function"
AhmedMaher309 Jul 6, 2024
a43bb00
made the encoding conversions in the two function that calls build_li…
AhmedMaher309 Jul 6, 2024
d5b040a
modify the code to make iconv_open(3) only twice when cobc starts
AhmedMaher309 Jul 6, 2024
bb9ead6
Refactor the valid_move function
AhmedMaher309 Jul 7, 2024
5b29a10
refactor validate_move function
AhmedMaher309 Jul 9, 2024
cc0c49f
refactored scan_x function
AhmedMaher309 Jul 14, 2024
781bf47
fixed indentation
AhmedMaher309 Jul 15, 2024
a8ad25e
Fix indentation
AhmedMaher309 Jul 15, 2024
f784511
refactored scanner.l/scan_x and typeck.c/trimmed size
AhmedMaher309 Jul 15, 2024
781e46f
refactor scan_x to handle the errors correct
AhmedMaher309 Jul 17, 2024
94a4eb1
refactoring scan_x (not finished)
AhmedMaher309 Jul 17, 2024
9c7a280
refactor scan_x
AhmedMaher309 Jul 19, 2024
36fba80
fixed the scan_x function and solved the failed tests that use it
AhmedMaher309 Jul 19, 2024
6ee58b6
added the errno cases to the literals builders
AhmedMaher309 Jul 27, 2024
d3b1b92
install cobc
AhmedMaher309 Jun 26, 2024
a1d20ec
update tree.c/build_literal
AhmedMaher309 Jun 26, 2024
3cc645d
refactor the build_literal function
AhmedMaher309 Jun 26, 2024
9d3cfc4
Delete .local directory
AhmedMaher309 Jun 26, 2024
ad74341
remove configs
AhmedMaher309 Jun 26, 2024
327d82d
added switch cases on category to build_literal fucntion
AhmedMaher309 Jun 27, 2024
e9011f0
Handling the case of N'text' in National literals, but not space padd…
AhmedMaher309 Jun 30, 2024
e1781b1
handling N'text' case in National literals but no space padding conve…
AhmedMaher309 Jun 30, 2024
245c048
added the parts of handling the padding in the literals
AhmedMaher309 Jul 2, 2024
56a170a
removed the build directory
AhmedMaher309 Jul 3, 2024
641f231
added a debug line to build_literal function
AhmedMaher309 Jul 4, 2024
3462d8d
Revert "added a debug line to build_literal function"
AhmedMaher309 Jul 6, 2024
b2d0299
made the encoding conversions in the two function that calls build_li…
AhmedMaher309 Jul 6, 2024
1b19b11
Refactor the valid_move function
AhmedMaher309 Jul 7, 2024
3e23a11
refactored scan_x function
AhmedMaher309 Jul 14, 2024
ff901ac
created a new function "cb_build_alphanumeric_for_figurative_constant…
AhmedMaher309 Jul 28, 2024
63633a9
added include guards to iconv.h and errno.h and used them with litera…
AhmedMaher309 Jul 28, 2024
4137550
refactoring the literals builders
AhmedMaher309 Jul 30, 2024
5fccee3
added tree.c/cb_build_UTF8_literal to handle the utf8 literals and us…
AhmedMaher309 Jul 31, 2024
b34b927
added guards around the iconv usages
AhmedMaher309 Jul 31, 2024
e5c1164
refactoring scanner.l and adding guards around iconv
AhmedMaher309 Jul 31, 2024
5e1758a
updated the cobc/Changelog
AhmedMaher309 Aug 1, 2024
c827f2e
refactoring: moving the declarations to the beginning of the function…
AhmedMaher309 Aug 3, 2024
fb224c2
created a new function cb_build_gcos_literal and used it with the gco…
AhmedMaher309 Aug 3, 2024
a7a651a
fixed the testcase in syn_defintion and the refactored typeck.c/valid…
AhmedMaher309 Aug 4, 2024
11ccdb2
merged cb_build_alphanumeric_literal , cb_build_national_literal, and…
AhmedMaher309 Aug 5, 2024
3fe593c
Revert "merged cb_build_alphanumeric_literal , cb_build_national_lite…
AhmedMaher309 Aug 6, 2024
f2d146f
refactor typeck.c/validate_move
AhmedMaher309 Aug 7, 2024
3bbf689
refactor scanner.l/scan_x
AhmedMaher309 Aug 7, 2024
cf306be
refactor scanner.l
AhmedMaher309 Aug 7, 2024
0a3f455
merged cb_build_alphanumeric_for_figurative_constant and cb_build_gco…
AhmedMaher309 Aug 7, 2024
36b7cd3
added poor-man's conversion for National literals in case of the abse…
AhmedMaher309 Aug 9, 2024
1dbbb4b
added the command line argument for the source encoding
AhmedMaher309 Aug 9, 2024
f0ac621
changed the default source encoding from UTF-8 to ISO-8859-15
AhmedMaher309 Aug 10, 2024
1ca9a64
refactor the command line argument for source encoding
AhmedMaher309 Aug 10, 2024
800481b
reverted the changes for validate_alphabet
AhmedMaher309 Aug 10, 2024
97b48b0
reverted changes in syn_definition.at
AhmedMaher309 Aug 10, 2024
b185d26
Revert "reverted changes in syn_definition.at"
AhmedMaher309 Aug 10, 2024
1c0e4c6
Revert "update test 132"
AhmedMaher309 Aug 10, 2024
2589ee4
Revert "fixed the testcase in syn_defintion and the refactored typeck…
AhmedMaher309 Aug 10, 2024
64f44d4
code refactoring
AhmedMaher309 Aug 10, 2024
196872a
adding the test for utf8
AhmedMaher309 Aug 10, 2024
50d90e5
refactoring tree.c/cb_build_UTF8_literal and cobc.c/process_command_line
AhmedMaher309 Aug 11, 2024
2e8d4a6
refactor tree.c/cb_build_national_literal
AhmedMaher309 Aug 12, 2024
a98d666
added the option to use the locale if the source encoding is not sp…
AhmedMaher309 Aug 12, 2024
0e5088c
replaced spaces indentation with tabs
AhmedMaher309 Aug 12, 2024
217d418
fixed cobc.c/initialize_cb_iconv
AhmedMaher309 Aug 12, 2024
7f58d1b
fix cobc.c/initialize_cb_iconv
AhmedMaher309 Aug 12, 2024
9274e03
updated cobc/Changelog
AhmedMaher309 Aug 13, 2024
a1f5a21
updated NEWS file
AhmedMaher309 Aug 13, 2024
acbb10e
add temporal (per file->UTF8 BOM) override
AhmedMaher309 Aug 14, 2024
78e5b35
refactored command line option for source encoding
AhmedMaher309 Aug 14, 2024
e61274d
added test for source encoding command line argument
AhmedMaher309 Aug 14, 2024
396a049
updated Changelog
AhmedMaher309 Aug 15, 2024
af26d94
added a chapter for encoding handling
AhmedMaher309 Aug 18, 2024
e23a8d6
modified the chapter for encoding handling in doc/gnucobol.texi
AhmedMaher309 Aug 18, 2024
3f9a1e7
updated doc/gnucobol.texi
AhmedMaher309 Aug 21, 2024
80efab6
fixed doc/gnucobol.texi
AhmedMaher309 Aug 21, 2024
92415e1
added new accepted encoding for source file in cobc.c/process_command…
AhmedMaher309 Aug 21, 2024
47485d0
update gnucobol.texi file
AhmedMaher309 Sep 4, 2024
dfa9872
minor adjustments to character encoding code
GitMensch Sep 9, 2024
286c7ef
cleanups to cobc:
GitMensch Sep 13, 2024
592deca
added command line option for the alphanumeric literals encoding
AhmedMaher309 Sep 29, 2024
d5eca86
fixing build and test issues
AhmedMaher309 Sep 29, 2024
2cd946a
fixing typeck/validate_alphabet
AhmedMaher309 Sep 29, 2024
5a57502
fixed build errors
AhmedMaher309 Sep 29, 2024
cfc4d85
install cobc
AhmedMaher309 Jun 26, 2024
845037b
update tree.c/build_literal
AhmedMaher309 Jun 26, 2024
1053ead
refactor the build_literal function
AhmedMaher309 Jun 26, 2024
362e616
Delete .local directory
AhmedMaher309 Jun 26, 2024
6d288e2
remove configs
AhmedMaher309 Jun 26, 2024
629a36a
added switch cases on category to build_literal fucntion
AhmedMaher309 Jun 27, 2024
d375186
Handling the case of N'text' in National literals, but not space padd…
AhmedMaher309 Jun 30, 2024
31a4548
handling N'text' case in National literals but no space padding conve…
AhmedMaher309 Jun 30, 2024
9491db5
added the parts of handling the padding in the literals
AhmedMaher309 Jul 2, 2024
a9abdce
removed the build directory
AhmedMaher309 Jul 3, 2024
c1b5f45
added a debug line to build_literal function
AhmedMaher309 Jul 4, 2024
736cb57
Revert "added a debug line to build_literal function"
AhmedMaher309 Jul 6, 2024
b5b1bb3
made the encoding conversions in the two function that calls build_li…
AhmedMaher309 Jul 6, 2024
d533621
modify the code to make iconv_open(3) only twice when cobc starts
AhmedMaher309 Jul 6, 2024
8ce5ae7
Refactor the valid_move function
AhmedMaher309 Jul 7, 2024
40fb88d
refactored scan_x function
AhmedMaher309 Jul 14, 2024
a4884df
fixed indentation
AhmedMaher309 Jul 15, 2024
f989b3b
added the errno cases to the literals builders
AhmedMaher309 Jul 27, 2024
cd348bb
install cobc
AhmedMaher309 Jun 26, 2024
5499c60
update tree.c/build_literal
AhmedMaher309 Jun 26, 2024
4fd32e8
refactor the build_literal function
AhmedMaher309 Jun 26, 2024
a7a0b00
Delete .local directory
AhmedMaher309 Jun 26, 2024
69d013d
remove configs
AhmedMaher309 Jun 26, 2024
fd4d1ef
added switch cases on category to build_literal fucntion
AhmedMaher309 Jun 27, 2024
69fb0c0
Handling the case of N'text' in National literals, but not space padd…
AhmedMaher309 Jun 30, 2024
e397f4f
handling N'text' case in National literals but no space padding conve…
AhmedMaher309 Jun 30, 2024
ca3adab
added the parts of handling the padding in the literals
AhmedMaher309 Jul 2, 2024
58327f0
removed the build directory
AhmedMaher309 Jul 3, 2024
30aaaef
added a debug line to build_literal function
AhmedMaher309 Jul 4, 2024
c1c3f05
Revert "added a debug line to build_literal function"
AhmedMaher309 Jul 6, 2024
40a04b7
made the encoding conversions in the two function that calls build_li…
AhmedMaher309 Jul 6, 2024
e3c59cc
Refactor the valid_move function
AhmedMaher309 Jul 7, 2024
44d306e
created a new function "cb_build_alphanumeric_for_figurative_constant…
AhmedMaher309 Jul 28, 2024
03d8e20
added include guards to iconv.h and errno.h and used them with litera…
AhmedMaher309 Jul 28, 2024
d8f5ce3
refactoring the literals builders
AhmedMaher309 Jul 30, 2024
01743b3
added guards around the iconv usages
AhmedMaher309 Jul 31, 2024
0b8ae2a
refactoring: moving the declarations to the beginning of the function…
AhmedMaher309 Aug 3, 2024
91ead86
created a new function cb_build_gcos_literal and used it with the gco…
AhmedMaher309 Aug 3, 2024
94c16d7
fixed the testcase in syn_defintion and the refactored typeck.c/valid…
AhmedMaher309 Aug 4, 2024
848146c
merged cb_build_alphanumeric_literal , cb_build_national_literal, and…
AhmedMaher309 Aug 5, 2024
62e8245
Revert "merged cb_build_alphanumeric_literal , cb_build_national_lite…
AhmedMaher309 Aug 6, 2024
edde333
refactor typeck.c/validate_move
AhmedMaher309 Aug 7, 2024
cd1b3f9
refactor scanner.l/scan_x
AhmedMaher309 Aug 7, 2024
32b50aa
refactor scanner.l
AhmedMaher309 Aug 7, 2024
1d2b225
merged cb_build_alphanumeric_for_figurative_constant and cb_build_gco…
AhmedMaher309 Aug 7, 2024
cf63837
added the command line argument for the source encoding
AhmedMaher309 Aug 9, 2024
a65c5b0
changed the default source encoding from UTF-8 to ISO-8859-15
AhmedMaher309 Aug 10, 2024
beed135
refactor the command line argument for source encoding
AhmedMaher309 Aug 10, 2024
4cb65f5
reverted the changes for validate_alphabet
AhmedMaher309 Aug 10, 2024
b29282e
code refactoring
AhmedMaher309 Aug 10, 2024
2006da4
adding the test for utf8
AhmedMaher309 Aug 10, 2024
c97f8d2
replaced spaces indentation with tabs
AhmedMaher309 Aug 12, 2024
576a0a6
refactored command line option for source encoding
AhmedMaher309 Aug 14, 2024
9f5df14
added a chapter for encoding handling
AhmedMaher309 Aug 18, 2024
4e7e9ee
modified the chapter for encoding handling in doc/gnucobol.texi
AhmedMaher309 Aug 18, 2024
047ac11
updated doc/gnucobol.texi
AhmedMaher309 Aug 21, 2024
c133620
fixed doc/gnucobol.texi
AhmedMaher309 Aug 21, 2024
a1a8fc3
update gnucobol.texi file
AhmedMaher309 Sep 4, 2024
c2b488d
minor adjustments to character encoding code
GitMensch Sep 9, 2024
5ecd83a
cleanups to cobc:
GitMensch Sep 13, 2024
9716d84
added command line option for the alphanumeric literals encoding
AhmedMaher309 Sep 29, 2024
79ae168
added command line option for the alphanumeric literals encoding
AhmedMaher309 Sep 29, 2024
8126cb3
fixing build and test issues
AhmedMaher309 Sep 29, 2024
11ed412
fixing typeck/validate_alphabet
AhmedMaher309 Sep 29, 2024
6691015
fixed build errors
AhmedMaher309 Sep 29, 2024
78b4b30
removed yywrap function
AhmedMaher309 Dec 31, 2024
d9e22c6
fixed the config directory
AhmedMaher309 Jan 11, 2025
01aab23
fixed merge conflicts
AhmedMaher309 Jan 19, 2025
b95e404
fix merge errors
AhmedMaher309 Jan 19, 2025
44091e6
fixing merge conflicts
AhmedMaher309 Jan 19, 2025
a9b05d2
fixing merge conflict error
AhmedMaher309 Jan 19, 2025
19 changes: 18 additions & 1 deletion ChangeLog
@@ -1,3 +1,12 @@
2022-03-20 Nicolas Berthier <[email protected]>

* .github/workflows: improve CI setup

2022-02-04 David Declerck <[email protected]>

* config, cobc/cobc.h: add support for GCOS 7 dialect (Bull), both
a strict variant (config/gcos-strict.conf) and a variant with
GNUCobol extensions (config/gcos.conf)

2024-12-08 Simon Sobisch <[email protected]>

@@ -184,6 +193,10 @@

* configure.ac: dropped extra check for GCC as done internally

2022-09-30 Nicolas Berthier <[email protected]>

* .github/workflows: fix ubuntu CI setup

2022-09-08 Simon Sobisch <[email protected]>

* configure.ac: cleanup curses library check
@@ -201,6 +214,10 @@

* configure.ac: check for PDC_free_memory_allocations

2022-07-20 Nicolas Berthier <[email protected]>

* .github/workflows: fix handling of quotes in testsuite artifact name

2022-07-07 Simon Sobisch <[email protected]>

* .github/workflows: CI now emits a coverage report artifact
@@ -239,7 +256,7 @@
2022-03-29 Simon Sobisch <[email protected]>

* configure.ac: dropped obsolete AC_PROG_CC_STD, AC_HEADER_STDC as
already included in AC_PROG_CC and assumed otherwise
already included in AC_PROG_CC and assumed otherwise

2022-03-11 Simon Sobisch <[email protected]>

3 changes: 3 additions & 0 deletions NEWS
@@ -6,6 +6,9 @@ NEWS - user visible changes -*- outline -*-

* New GnuCOBOL features

** support different literal encoding according to literal type (UTF-8 for U literals,
UTF-16 for N literals, and ISO-8859-15 for Alphanumeric literals)

** cobc now checks for binary and multi-byte encoded files and early exit
parsing those; the error output for format errors (for example invalid
indicator column) is now limitted to 5 per source file
2 changes: 1 addition & 1 deletion THANKS
@@ -110,4 +110,4 @@ David Essex
Jeff Smith
Jim Noeth
Stephen Connolly
Laura Tweedy
Laura Tweedy
46 changes: 46 additions & 0 deletions cobc/ChangeLog
@@ -52,6 +52,12 @@
* typeck.c (cb_emit_move, cb_emit_set_to): do not check for incompatible
data if no receiver field is of category numeric or numeric edited

2024-09-09 Simon Sobisch <[email protected]>

* cobc.c, codegen.c, tree.c: fixed C89 errors
* typeck.c (trimmed_size, is_blank): minor refactorings
* typeck.c: provide the original buffer if encoding cannot be applied

2024-09-03 Simon Sobisch <[email protected]>

* typeck.c (cb_emit_accept): always check position
@@ -76,6 +82,21 @@
* tree.c (char_to_precedence_idx, get_char_type_description, valid_char_order):
adjusted size of precedence table and gave proper precedence to U

2024-08-27 Simon Sobisch <[email protected]>

* cobc.c, codegen.c, tree.c: fixed C89 errors
* typeck.c (trimmed_size, is_blank): minor refactorings

2024-08-10 Ahmed Maher <[email protected]>

* cobc.c: added the locale option to specify the source
file encoding if not specified with command line argument

2024-08-10 Ahmed Maher <[email protected]>

* cobc.h, cobc.c, flag.def: added command line argument for setting
source file encoding

2024-08-06 Simon Sobisch <[email protected]>

* codegen.c (output_alphabet_name_definition): cater for national alphabet
@@ -89,6 +110,14 @@
national alphabets with increased size (max. 65535 instead of 255)
* typeck.c (validate_alphabet): check that alphabet and literal types match


2024-08-01 Ahmed Maher <[email protected]>

* scanner.l, cobc.h, cobc.c, tree.c: added guards around the iconv.h header
and around the uses of its functions
* tree.c (cb_build_utf8_literal): new function for UTF-8 literals
* scanner.l (read_literal): changed to handle the utf8 literal reading

2024-07-29 Chuck Haatvedt <[email protected]>

* tree.c (cb_build_picture): added logic to find the valid floating
@@ -103,6 +132,10 @@
* tree.c (char_to_precedence_idx): changed to check penultimate and last
symbols before the first and second symbols

2024-07-27 Ahmed Maher <[email protected]>

* scanner.l: refactor scan_x function

2024-07-26 Simon Sobisch <[email protected]>

* parser.y, config.def: rewrite ENVIRONMENT DIVISION parsing to allow
@@ -111,10 +144,23 @@
* reserved.c: make MENU context-sensitive
* reserved.c, parser.y: added MODAL + MODELESS to acu extension windows


2024-07-10 Chuck Haatvedt <[email protected]>

* tree.c (cb_build_picture): fixed currency scale counting logic

2024-07-09 Ahmed Maher <[email protected]>
James K. Lowden <[email protected]>

* typeck.c: refactor validate_move function

2024-07-06 Ahmed Maher <[email protected]>

* tree.c: added encoding conversion support to cb_build_national_literal and
cb_build_alphanumeric_literal
Review comment (Collaborator): indent by two spaces

* cobc.h: declared struct cb_iconv used for encoding conversion
* cobc.c: defined the struct cb_iconv with default values

2024-06-19 David Declerck <[email protected]>

* cobc.c (process_compile): fix MSVC build command
114 changes: 113 additions & 1 deletion cobc/cobc.c
@@ -52,6 +52,11 @@
#include <io.h>
#endif


#ifdef HAVE_LANGINFO_CODESET
#include <langinfo.h>
#endif

#ifdef HAVE_LOCALE_H
#include <locale.h>
#endif
@@ -116,6 +121,9 @@ enum compile_level {
#define CB_FLAG_GETOPT_DEPEND_ADD_PHONY 24
#define CB_FLAG_GETOPT_DEPEND_KEEP_MISSING 25
#define CB_FLAG_GETOPT_DEPEND_ON_THE_SIDE 26
#define CB_FLAG_GETOPT_SOURCE_ENCODE 27
#define CB_FLAG_GETOPT_ALPHANUMERIC_ENCODE 28


/* Info display limits */
#define CB_IMSG_SIZE 24
@@ -327,6 +335,39 @@ cob_u32_t optimize_defs[COB_OPTIM_MAX] = { 0 };

int cb_flag_alt_ebcdic = 0;

#ifdef HAVE_ICONV
struct cb_iconv_t cb_iconv;

static void
initialize_cb_iconv (void)
{
#ifdef HAVE_LANGINFO_CODESET
char * encoding;
char * locale = setlocale (LC_CTYPE, "");
if( ! locale ) {
cobc_err_msg ("could not set locale");
return;
}
encoding = nl_langinfo(CODESET);

if (encoding != NULL && *encoding != '\0') {
strncpy (cb_iconv.source, encoding, sizeof(cb_iconv.source) - 1);
cb_iconv.source[sizeof(cb_iconv.source) - 1] = '\0';
} else {
strncpy (cb_iconv.source, "ISO-8859-15", sizeof(cb_iconv.source) - 1);
cb_iconv.source[sizeof(cb_iconv.source) - 1] = '\0';
}
#else
strncpy(cb_iconv.source, "ISO-8859-15", sizeof(cb_iconv.source) - 1);
cb_iconv.source[sizeof(cb_iconv.source) - 1] = '\0';
#endif
/* set the alphanumeric_source encoding to a default value
to avoid converting in cb_build_alphanumeric
if it didn't change by the command line*/
strncpy (cb_iconv.alphanumeric_source, "NONE", sizeof(cb_iconv.alphanumeric_source) - 1);
cb_iconv.alphanumeric_source[sizeof(cb_iconv.alphanumeric_source) - 1] = '\0';
}
#endif
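
The initializer above repeats the same `strncpy` plus explicit terminator pair for every encoding name. A minimal sketch of a bounded-copy helper that could replace those pairs, assuming a hypothetical name `copy_encoding_name` that does not exist in the PR; it also reports truncation, which the current code silently accepts:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the PR): copies `src` into `dst`,
   always NUL-terminating the result.
   Returns 0 if the name fit, -1 if it was truncated or an argument
   was invalid. */
static int
copy_encoding_name (char *dst, size_t dstsize, const char *src)
{
	size_t len;

	if (dst == NULL || src == NULL || dstsize == 0) {
		return -1;
	}
	len = strlen (src);
	if (len >= dstsize) {
		/* keep as much as fits, then terminate */
		memcpy (dst, src, dstsize - 1);
		dst[dstsize - 1] = '\0';
		return -1;
	}
	memcpy (dst, src, len + 1);	/* includes the terminating NUL */
	return 0;
}
```

With such a helper, a call like `copy_encoding_name (cb_iconv.source, sizeof (cb_iconv.source), "ISO-8859-15")` would stand in for each two-line pair.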


/* Basic memory structure */
struct cobc_mem_struct {
@@ -3959,6 +4000,62 @@ process_command_line (const int argc, char **argv)
}
break;

case CB_FLAG_GETOPT_SOURCE_ENCODE: {
/* -fsource-encode=<encoding> : encoding of the source file */
/* braces are required: declarations may not directly follow a case label */
const char* valid_encodings[] = {
"UTF-8",
"UTF8",
"ASCII",
"ISO-8859-1",
"ISO-8859-15",
"CP1525"
};
const int num_encodings = sizeof(valid_encodings) / sizeof(valid_encodings[0]);
int i, encoding_valid = 0;
for (i = 0; i < num_encodings; i++) {
if (strcmp(cob_optarg, valid_encodings[i]) == 0) {
encoding_valid = 1;
break;
}
}
if (encoding_valid) {
#ifdef HAVE_ICONV
strncpy(cb_iconv.source, cob_optarg, sizeof(cb_iconv.source) - 1);
cb_iconv.source[sizeof(cb_iconv.source) - 1] = '\0';
#endif
} else {
cobc_err_exit(COBC_INV_PAR, "-fsource-encode");
}
break;
}


/* -falphanumeric-encode */
case CB_FLAG_GETOPT_ALPHANUMERIC_ENCODE:{
const char* valid_encodings[] = {
"ASCII",
"ISO-8859-1",
"ISO-8859-15",
"CP1525"
};
const int num_encodings = sizeof(valid_encodings) / sizeof(valid_encodings[0]);
int i, encoding_valid = 0;
for (i = 0; i < num_encodings; i++) {
if (strcmp(cob_optarg, valid_encodings[i]) == 0) {
encoding_valid = 1;
break;
}
}
if (encoding_valid) {
#ifdef HAVE_ICONV
strncpy(cb_iconv.alphanumeric_source, cob_optarg, sizeof(cb_iconv.alphanumeric_source) - 1);
cb_iconv.alphanumeric_source[sizeof(cb_iconv.alphanumeric_source) - 1] = '\0';
#endif
} else {
cobc_err_exit(COBC_INV_PAR, "-falphanumeric-encode");
}
break;
}
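
The two option handlers above run the same lookup loop over different lists. A minimal sketch of a shared lookup, assuming hypothetical names (`encoding_in_list`, `source_encodings`, `alphanumeric_encodings`, `LIST_LEN`) that are not part of the PR; the list contents are copied verbatim from the handlers:

```c
#include <stddef.h>
#include <string.h>

#define LIST_LEN(a) (sizeof (a) / sizeof ((a)[0]))

/* Lists copied from the -fsource-encode and -falphanumeric-encode handlers */
static const char *const source_encodings[] = {
	"UTF-8", "UTF8", "ASCII", "ISO-8859-1", "ISO-8859-15", "CP1525"
};
static const char *const alphanumeric_encodings[] = {
	"ASCII", "ISO-8859-1", "ISO-8859-15", "CP1525"
};

/* Hypothetical helper: returns 1 if `name` exactly matches (case-sensitive)
   one of the `count` entries in `list`, 0 otherwise. */
static int
encoding_in_list (const char *name, const char *const *list, size_t count)
{
	size_t i;

	if (name == NULL) {
		return 0;
	}
	for (i = 0; i < count; i++) {
		if (strcmp (name, list[i]) == 0) {
			return 1;
		}
	}
	return 0;
}
```

Each handler would then reduce to one call such as `encoding_in_list (cob_optarg, source_encodings, LIST_LEN (source_encodings))` followed by the existing copy or `cobc_err_exit` branch.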

case CB_FLAG_GETOPT_TTITLE: {
/* -fttitle=<title> : Title for listing */
const size_t len = strlen (cob_optarg);
@@ -9139,7 +9236,12 @@ static void
begin_setup_internal_and_compiler_env (void)
{
char *p;


/* initialize the default source encoding */
#ifdef HAVE_ICONV
initialize_cb_iconv();
#endif

/* register signal handlers from cobc */
cob_reg_sighnd (&cobc_sig_handler);

@@ -9557,6 +9659,16 @@ main (int argc, char **argv)
cobc_init_tree ();
#endif

/* initialize the iconv struct after reading the command line*/
#ifdef HAVE_ICONV
cb_iconv.alphanumeric = iconv_open(cb_iconv.alphanumeric_source, cb_iconv.source);
/* move iconv_open check here */

cb_iconv.national = iconv_open("UTF-16LE", cb_iconv.source);
cb_iconv.utf8 = iconv_open("UTF-8", cb_iconv.source);
#endif
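
The hunk above leaves the `iconv_open` result unchecked (its own comment says "move iconv_open check here"). A minimal sketch of what such a check could look like, assuming a hypothetical wrapper `open_converter` that is not in the PR. `iconv_open` returns `(iconv_t)-1` and sets `errno` on failure, which is presumably what happens for the `"NONE"` placeholder stored in `alphanumeric_source`:

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: opens a conversion descriptor from encoding
   `from` to encoding `to` and reports a failure on stderr.
   Returns 0 on success, -1 on failure; *cd is only usable on success. */
static int
open_converter (iconv_t *cd, const char *to, const char *from)
{
	*cd = iconv_open (to, from);
	if (*cd == (iconv_t)-1) {
		fprintf (stderr, "iconv_open (\"%s\", \"%s\") failed: %s\n",
			 to, from, strerror (errno));
		return -1;
	}
	return 0;
}
```

In cobc the report would more likely go through the compiler's own error routines than `fprintf`; the sketch only shows the comparison against `(iconv_t)-1`.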


/* Process input files */

/* Set up file parameters, if any are missing: abort */
22 changes: 22 additions & 0 deletions cobc/cobc.h
@@ -30,6 +30,13 @@
#include <unistd.h>
#endif
#include <stdio.h> /* for FILE* */

#ifdef HAVE_ICONV
#include <iconv.h>
#endif

#include <errno.h>
Review comment (Collaborator): errno.h should only be included in the files directly using it, not in the main header



#include "../libcob/common.h"

@@ -322,6 +329,21 @@ enum cobc_name_type {
PROGRAM_ID_NAME
};


#ifdef HAVE_ICONV
/* Structure for a iconv conversion */
struct cb_iconv_t{
iconv_t alphanumeric;
iconv_t national;
iconv_t utf8;
char source[100];
char alphanumeric_source[100];
};

extern struct cb_iconv_t cb_iconv;
Review comment (Collaborator): move that line directly after the comment, use tabs for indentation within the structure, and move the complete part to tree.h if possible

#endif


/* Listing structures and externals */

#if 0 /* ancient OSVS registers that need special runtime handling - low priority */
37 changes: 35 additions & 2 deletions cobc/codegen.c
@@ -5205,7 +5205,24 @@ output_initialize_to_value (struct cb_field *f, cb_tree x,
from a string - we use a local buffer to set that up */
unsigned char litbuff[128];
memcpy (litbuff + litstart, l->data, l->size);
memset (litbuff + padstart, ' ', padlen);

if (l->common.category == CB_CATEGORY_NATIONAL || l->common.category == CB_CATEGORY_NATIONAL_EDITED){
/* Calculate the number of bytes to pad (multiply by 2 for UTF-16) */
const size_t padbytes = padlen * 2;
size_t i;

/* Get a pointer to the starting position for padding */
unsigned char *padptr = litbuff + padstart;

/* Pad the rest of the string with UTF-16 space character */
for (i = 0; i < padbytes; i += 2) {
padptr[i] = 0x20;
padptr[i + 1] = 0x00;
Review comment (Collaborator) on lines +5219 to +5220: since memset(p,0,size) is commonly optimized down to CPU instructions, we should probably use it to initialize the pad memory first and only set the 0x20 bytes in the loop (similarly in the "poor man's conversion")

}
} else{
memset (litbuff + padstart, ' ', padlen);
}

output_prefix ();
output ("memcpy (");
output_data (x);
@@ -5247,7 +5264,23 @@ output_initialize_to_value (struct cb_field *f, cb_tree x,
}

memcpy (litbuff + litstart, l->data, l->size);
memset (litbuff + padstart, ' ', padlen);

if (x->category == CB_CATEGORY_NATIONAL || x->category == CB_CATEGORY_NATIONAL_EDITED) {
/* Calculate the number of bytes to pad (multiply by 2 for UTF-16) */
const size_t padbytes = padlen * 2;
size_t i;
/* Get a pointer to the starting position for padding */
unsigned char *padptr = litbuff + padstart;

/* Pad the rest of the string with UTF-16 space character */
for (i = 0; i < padbytes; i += 2) {
padptr[i] = 0x20;
padptr[i + 1] = 0x00;
}
}
else{
memset (litbuff + padstart, ' ', padlen);
}
Review comment (Collaborator) on lines +5268 to +5283: Can you move the above code out to a static function before output_initialize_to_value and use it in both places?
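
A minimal sketch of the static function requested here, combined with the earlier suggestion to zero the pad area with `memset` and only write the 0x20 bytes in the loop. The name `pad_literal_buffer` and its signature are assumptions, not code from the PR; the caller would pass `litbuff + padstart`, `padlen`, and the result of the national-category test:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and signature are assumptions):
   pads `padlen` character positions starting at `pad` with spaces.
   National data gets UTF-16LE spaces (0x20 0x00), i.e. two bytes per
   position; everything else gets single-byte spaces.
   The caller must ensure the buffer is large enough. */
static void
pad_literal_buffer (unsigned char *pad, size_t padlen, int is_national)
{
	if (is_national) {
		const size_t padbytes = padlen * 2;
		size_t i;

		/* zero the whole area first, then set the low byte of each unit */
		memset (pad, 0, padbytes);
		for (i = 0; i < padbytes; i += 2) {
			pad[i] = 0x20;
		}
	} else {
		memset (pad, ' ', padlen);
	}
}
```

Both call sites in `output_initialize_to_value` would then become a single call. Note that the national branch writes `padlen * 2` bytes, so the 128-byte `litbuff` bound needs to account for that.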


buffchar = *(litbuff + size - 1);
n = 0;