You may have heard of COBOL before. If you search for it you will find images like this:
This is a picture of a COBOL program editor running on a mainframe. Below we will go over 7 examples of COBOL (COmmon Business Oriented Language). We'll be running these programs on Linux. We are not going to cover mainframe tutorials here; there's a really good tutorial on mainframes here and I've added some mainframe resources at the end.
How to Install the GnuCOBOL compiler
This compiler translates COBOL to C, and the C is then compiled into a native executable that you can run from your Linux command line. Not all the features of COBOL are supported but most are.
Run (to install; if the package is not found on a newer Debian/Ubuntu release, try the name gnucobol instead):
sudo apt-get install open-cobol
How to write a Program
We will write a simple program in COBOL called "hello.cbl". There are a lot of strange keywords in COBOL. I will explain them after compilation.
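Here is a minimal sketch of what hello.cbl can look like. The variable name WS-GREETING is just an illustrative choice of mine; the message matches the output shown in the next section. The listing is in fixed format, so every line starts with 7 blank columns.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> WS-GREETING is an illustrative name (10 characters).
       01 WS-GREETING PIC X(10) VALUE 'WILLKOMMEN'.
       PROCEDURE DIVISION.
       *> Print the variable, then exit the program.
           DISPLAY WS-GREETING.
           STOP RUN.
```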
How to Compile and run
Create the executable file with the instructions below. This translates our COBOL program called "hello.cbl" to C, then compiles that C into an executable file called "hello" (the -x flag asks for an executable).
Compile and then run with:
cobc -x -o hello hello.cbl
./hello
Should output:
WILLKOMMEN
Understanding the Program Structure
First and foremost, to comment in COBOL use the *> characters. A COBOL program is split into several possible divisions. A division is just a way to break up the program into areas responsible for different things. The IDENTIFICATION DIVISION is responsible for identifying the program (docs). To keep it simple, we are only going to use the PROGRAM-ID keyword, which gives our program a name. The DATA DIVISION is where we declare the variables we want to use in our program; we will use its WORKING-STORAGE SECTION (docs). The PROCEDURE DIVISION is like the main function of our program. It actually runs our code and can access anything defined in the data division (docs). The STOP RUN sentence (a sentence is one or more "statements" ending with a ".") exits the program (like sys.exit() in python).
You will also notice blank space on the left of all my programs. This is not a mistake; the compiler expects it. In the traditional fixed format, the first 6 columns are reserved for line numbers (on a mainframe they would actually hold them), column 7 is an indicator column, and code starts in column 8. Also: COBOL IS TRADITIONALLY WRITTEN IN CAPITALS, SO IT'S OFTEN EASIER TO TYPE WITH CAPSLOCK ON (the keywords themselves are not case sensitive). You may notice the dot/period at the end of some lines. This is how you end a sentence (a series of one or more statements) in COBOL. Below I may refer to anything that ends with a period as a statement. Ok, so now that we have gotten through the basics of the program structure, let's write some programs.
Declaring Variables
I will write a script below that explains how to declare and print variables. We will declare several variables in the data division (FIRST-VAR, SECOND-VAR, etc) and then print them in the procedure division using DISPLAY.
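A minimal sketch of such a program follows. FIRST-VAR and SECOND-VAR come from the text; the other names, the picture sizes, and all the values are illustrative assumptions of mine.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VARDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Signed number: 3 digits, implied decimal point, 2 digits.
       01 FIRST-VAR  PIC S9(3)V9(2) VALUE -123.45.
       *> Unsigned 3-digit number.
       01 SECOND-VAR PIC 9(3) VALUE 42.
       *> 6 alphabetic characters.
       01 THIRD-VAR  PIC A(6) VALUE 'ABCDEF'.
       *> 5 alphanumeric characters.
       01 FOURTH-VAR PIC X(5) VALUE 'A121$'.
       *> A group: a top level item with sub variables under it.
       01 GROUP-VAR.
          05 SUBVAR-1 PIC 9(3) VALUE 337.
          05 SUBVAR-2 PIC X(15) VALUE 'LALALALA'.
       PROCEDURE DIVISION.
           DISPLAY '1ST VAR: ' FIRST-VAR.
           DISPLAY '2ND VAR: ' SECOND-VAR.
           DISPLAY '3RD VAR: ' THIRD-VAR.
           DISPLAY '4TH VAR: ' FOURTH-VAR.
           *> Displaying a group prints all of its sub variables.
           DISPLAY 'GROUP VAR: ' GROUP-VAR.
           DISPLAY 'SUBVAR-2: ' SUBVAR-2.
           STOP RUN.
```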
PIC stands for picture: the clause paints a picture of what the data looks like, one position at a time, and it is the keyword we use to define a variable. Its argument has the form symbol(count). So 9(3) would correspond to laying aside enough room in memory for storing a number with 3 digits. Above we define many variables and then print them out. You may also notice the 01 values before each declaration, and the 05 value before sub variables. These numbers are called level numbers and indicate to COBOL what kind of variable we are declaring. 01 is for top level variables; 05 (or any level from 02 to 49) is for variables that belong to a group declared above them. Below are the picture symbols and what each one corresponds to.
9 - numeric
A - alphabetic
X - alphanumeric
V - implied decimal point
S - sign
A final example:
01 CAT-PEOPLE PIC X(15) VALUE '12@4A!D$'.
Would create a variable called CAT-PEOPLE with space for 15 alphanumeric characters, of which the value only fills the first 8; the remaining 7 are padded with spaces. You may have noticed that I do this above in the subgroup. Here is another data declaration resource.
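To see the unused positions for yourself, here is a small sketch that prints CAT-PEOPLE between square brackets. The program name and the brackets are my own additions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PADDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> 15 characters of storage, only 8 supplied by VALUE.
       01 CAT-PEOPLE PIC X(15) VALUE '12@4A!D$'.
       PROCEDURE DIVISION.
           *> The brackets make the trailing spaces visible.
           DISPLAY '[' CAT-PEOPLE ']'.
           STOP RUN.
```

The closing bracket should show up 7 spaces after the $ character, since an alphanumeric variable is filled from the left and padded with spaces on the right.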
Common Verbs
In COBOL a verb is a keyword that does something (docs). We will cover the compute, divide, multiply, subtract, add, move, and initialize verbs. These are verbs you will use often in COBOL programming to calculate, say, the result of a business transaction.
compute - can be used to do arithmetic and store the result in a variable
divide - can be used to divide two numbers
multiply - can be used to, you guessed it, multiply
subtract - subtracts one variable/number from another
add - adds two variables/numbers
move - copies a value (a literal or the contents of a variable) into another variable
initialize - this is used above to reset a variable after it's been set (numbers go back to zero, text goes back to spaces)
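The verbs in this list can be tried out with a sketch like the one below. All the variable names and values are assumptions picked for illustration; the comments give the value each statement stores.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VERBDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-NUM1   PIC 9(4) VALUE 10.
       01 WS-NUM2   PIC 9(4) VALUE 5.
       01 WS-RESULT PIC 9(4) VALUE 0.
       01 WS-NAME   PIC X(10) VALUE 'OLD'.
       PROCEDURE DIVISION.
           *> (10 * 5) + 3 = 53
           COMPUTE WS-RESULT = (WS-NUM1 * WS-NUM2) + 3.
           DISPLAY 'COMPUTE:  ' WS-RESULT.
           *> 10 / 5 = 2
           DIVIDE WS-NUM1 BY WS-NUM2 GIVING WS-RESULT.
           DISPLAY 'DIVIDE:   ' WS-RESULT.
           *> 10 * 5 = 50
           MULTIPLY WS-NUM1 BY WS-NUM2 GIVING WS-RESULT.
           DISPLAY 'MULTIPLY: ' WS-RESULT.
           *> 10 - 5 = 5
           SUBTRACT WS-NUM2 FROM WS-NUM1 GIVING WS-RESULT.
           DISPLAY 'SUBTRACT: ' WS-RESULT.
           *> 10 + 5 = 15
           ADD WS-NUM1 TO WS-NUM2 GIVING WS-RESULT.
           DISPLAY 'ADD:      ' WS-RESULT.
           *> MOVE copies the literal into the variable.
           MOVE 'NEW' TO WS-NAME.
           DISPLAY 'MOVE:     ' WS-NAME.
           *> INITIALIZE resets numbers to zero and text to spaces.
           INITIALIZE WS-RESULT WS-NAME.
           DISPLAY 'INITIALIZE: ' WS-RESULT ' [' WS-NAME ']'.
           STOP RUN.
```

The GIVING form leaves the operands untouched and stores the answer in a separate variable, which keeps the example easy to follow. Without GIVING, the verbs overwrite one of their operands.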
Conditionals
In this section we will look at if/else statements and switch statements (called EVALUATE in COBOL).
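Here is a minimal sketch of such a program. CHECK-VAL, PASS and FAIL are the names used in the explanation below; NUM1, NUM2 and all the values are my own assumptions. STOP RUN is left un-indented to match that explanation; if your compiler complains about it, indent it like the other statements.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONDDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 NUM1 PIC 9(2) VALUE 20.
       01 NUM2 PIC 9(2) VALUE 25.
       *> Two named conditions that depend on CHECK-VAL.
       01 CHECK-VAL PIC 9(3) VALUE 75.
          88 PASS VALUES ARE 41 THRU 100.
          88 FAIL VALUES ARE 0 THRU 40.
       PROCEDURE DIVISION.
           *> Standard if/else.
           IF NUM1 > NUM2 THEN
              DISPLAY 'NUM1 IS GREATER'
           ELSE
              DISPLAY 'NUM2 IS GREATER OR EQUAL'
           END-IF.
           *> PASS is true while CHECK-VAL is between 41 and 100.
           IF PASS
              DISPLAY 'PASSED WITH ' CHECK-VAL
           END-IF.
           IF FAIL
              DISPLAY 'FAILED WITH ' CHECK-VAL
           END-IF.
           *> EVALUATE is the switch statement: first true WHEN wins.
           EVALUATE TRUE
              WHEN NUM1 < 10
                 DISPLAY 'NUM1 IS BELOW 10'
              WHEN NUM1 < 50
                 DISPLAY 'NUM1 IS BELOW 50'
              WHEN OTHER
                 DISPLAY 'NUM1 IS 50 OR MORE'
           END-EVALUATE.
       STOP RUN.
```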
All this should be familiar to you if you have done any programming. You have your standard if/else, not/and/or operators, type comparisons, and switch statements. The only thing that might be a little weird is the condition names. What has essentially happened here is that the variable CHECK-VAL has two conditions that depend on it (which is why PASS/FAIL are indented underneath CHECK-VAL). 88 is a special level number (like 01) indicating that an entry is a custom condition on the variable declared above it; the condition is true whenever that variable holds one of the listed values. You will also notice our STOP RUN is not indented here. STOP RUN exits the whole program, and it works whether you indent it or un-indent it within the procedure division.
Here are some examples using NOT/AND/OR as well as some other extras (imagine putting these into the procedure division above):
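The sketch below wraps a few such checks in a complete program so it can be compiled on its own. The variable names and values are assumptions for illustration; with these values every condition is true, so every message prints.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOGICDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 NUM1    PIC S9(2) VALUE 20.
       01 NUM2    PIC S9(2) VALUE 25.
       01 WS-TEXT PIC X(5) VALUE 'HELLO'.
       PROCEDURE DIVISION.
           *> NOT negates the comparison that follows it.
           IF NOT NUM1 > NUM2
              DISPLAY 'NUM1 IS NOT GREATER THAN NUM2'
           END-IF.
           *> AND needs both sides to be true.
           IF NUM1 < NUM2 AND NUM1 > 10
              DISPLAY 'BOTH CONDITIONS HOLD'
           END-IF.
           *> OR needs at least one side to be true.
           IF NUM1 > NUM2 OR NUM2 = 25
              DISPLAY 'AT LEAST ONE CONDITION HOLDS'
           END-IF.
           *> Sign check.
           IF NUM1 IS POSITIVE
              DISPLAY 'NUM1 IS POSITIVE'
           END-IF.
           *> Type (class) checks.
           IF NUM1 IS NUMERIC
              DISPLAY 'NUM1 IS NUMERIC'
           END-IF.
           IF WS-TEXT IS ALPHABETIC
              DISPLAY 'WS-TEXT IS ALPHABETIC'
           END-IF.
           *> No END-IF: the period after DISPLAY ends the IF.
           IF NUM1 = 20
              DISPLAY 'THIS IF IS ENDED BY A PERIOD'.
           STOP RUN.
```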
One last thing to notice: IF statements that do not have END-IF need a period to end them inside their last "sub statement."
String Handling
String handling in COBOL is very verbose and requires a lot of typing. Let's try it.
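Below is a minimal sketch covering counting, replacing, concatenation and splitting. WS-STR1, WS-STR2, WS-STR3 and WS-STRING-DEST are the names used in the explanation that follows; every other name and all the values are assumptions of mine.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-CNT1        PIC 9(2) VALUE 0.
       01 WS-CNT2        PIC 9(2) VALUE 0.
       01 WS-DATA        PIC X(10) VALUE 'ABCDACDADE'.
       01 WS-STR1        PIC X(5) VALUE 'COBOL'.
       01 WS-STR2        PIC X(7) VALUE 'WELCOME'.
       01 WS-STR3        PIC X(7) VALUE 'TO AND'.
       01 WS-STRING-DEST PIC X(30) VALUE SPACES.
       *> The pointer must be set before STRING runs (1 = start).
       01 WS-COUNT       PIC 9(2) VALUE 1.
       01 WS-SPLIT-SRC   PIC X(20) VALUE 'ONE TWO THREE'.
       01 WS-PART1       PIC X(5).
       01 WS-PART2       PIC X(5).
       01 WS-PART3       PIC X(5).
       PROCEDURE DIVISION.
           *> Count every character (10), then only the As (3).
           INSPECT WS-DATA TALLYING WS-CNT1 FOR CHARACTERS.
           INSPECT WS-DATA TALLYING WS-CNT2 FOR ALL 'A'.
           DISPLAY 'ALL CHARACTERS: ' WS-CNT1.
           DISPLAY 'COUNT OF A: ' WS-CNT2.
           *> Replace each A with an X.
           INSPECT WS-DATA REPLACING ALL 'A' BY 'X'.
           DISPLAY 'REPLACED: ' WS-DATA.
           *> Concatenate: all of WS-STR2, WS-STR3 up to its first
           *> space, then all of WS-STR1.
           STRING WS-STR2 DELIMITED BY SIZE
                  WS-STR3 DELIMITED BY SPACE
                  WS-STR1 DELIMITED BY SIZE
              INTO WS-STRING-DEST
              WITH POINTER WS-COUNT
              ON OVERFLOW DISPLAY 'OVERFLOW!'
           END-STRING.
           DISPLAY 'CONCATENATED: ' WS-STRING-DEST.
           DISPLAY 'POINTER: ' WS-COUNT.
           *> Split on spaces into three parts.
           UNSTRING WS-SPLIT-SRC DELIMITED BY SPACE
              INTO WS-PART1 WS-PART2 WS-PART3
           END-UNSTRING.
           DISPLAY 'PARTS: ' WS-PART1 '/' WS-PART2 '/' WS-PART3.
           STOP RUN.
```

With these values the concatenated string is WELCOMETOCOBOL, which is 14 characters, so the pointer ends at 15.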
Tallying all or just specific characters is pretty clear. The replacing keyword is also pretty clear: it replaces specified data in the string with some other data. What's really worth digging into here is the string concatenation and the splitting. In the STRING statement we pass in the original strings WS-STR2, WS-STR3, WS-STR1 and we use DELIMITED BY to tell the string statement how to combine them. If we delimit by SIZE we are telling COBOL to add the entire input string to the final string. If we delimit by SPACE we are saying to take the input string up to the first space and omit the rest. The INTO keyword names the variable (WS-STRING-DEST) where the resulting concatenated string will be stored.
WITH POINTER names a numeric variable that holds the position in the final string where the next character will be written. You set it before the statement runs (1 means start at the beginning), and as things are concatenated in, COBOL moves it forward one position per character. So when the statement finishes, the pointer minus its starting value is the number of characters that were written. The ON OVERFLOW tells COBOL what to do if the input strings do not fit in the final string; here it prints/displays "OVERFLOW!"
The string docs on mainframetechhelp are very useful for understanding this section.
Looping
We will now cover some of the looping logic in COBOL. One thing I'll mention before we get into it: we can name parts of the procedure division; these named parts, called paragraphs, can be used kind of similarly to functions or named lambda functions in python. In COBOL a paragraph can contain many sentences/statements.
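Here is a minimal sketch of the common loop forms, all built on the PERFORM verb. B-PARA-TIMES is the paragraph name used in the explanation that follows; the other paragraph names, the counters and the loop bounds are assumptions of mine.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-CNT PIC 9(2) VALUE 0.
       01 WS-A   PIC 9(2) VALUE 0.
       PROCEDURE DIVISION.
       A-PARA.
           *> Run a paragraph a fixed number of times.
           PERFORM B-PARA-TIMES 3 TIMES.
           *> Run a paragraph until a condition becomes true.
           PERFORM C-PARA-UNTIL UNTIL WS-CNT > 3.
           *> Run a paragraph with a loop counter (1 to 4).
           PERFORM D-PARA-VARYING
              VARYING WS-A FROM 1 BY 1 UNTIL WS-A = 5.
           *> An inline loop needs no paragraph at all.
           PERFORM 2 TIMES
              DISPLAY 'INLINE LOOP BODY'
           END-PERFORM.
           STOP RUN.
       *> The paragraphs below only run when PERFORM calls them,
       *> because STOP RUN above ends the program first.
       B-PARA-TIMES.
           DISPLAY 'IN B-PARA-TIMES'.
       C-PARA-UNTIL.
           DISPLAY 'WS-CNT: ' WS-CNT.
           ADD 1 TO WS-CNT.
       D-PARA-VARYING.
           DISPLAY 'WS-A: ' WS-A.
```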
So I think the loops are pretty well explained above. You will notice here we set aside code in paragraphs placed after STOP RUN. These paragraphs are still part of the procedure division, but because the program exits at STOP RUN, execution never falls through into them. We want to define these as things we can do without running them straight away, so each one needs a name that we can reference/use from the main flow. So B-PARA-TIMES will only be run when it's called from our loop.
Files
Files in COBOL usually have a rigid structure like a table. This is because of what they were created for dealing with: well organized business data. There are a few kinds of files in COBOL (docs, another example); we are going to deal with sequential files as they are the most basic. A sequential file consists of records (rows) and each record contains some number of fields (columns). Let's get to the code.
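A minimal sketch of writing one record to a sequential file could look like this. The file name transactions.txt, the record layout and every variable name are assumptions chosen for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. WRITEDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           *> Name the file and say what kind of file it is.
           SELECT TRANSACTIONS ASSIGN TO 'transactions.txt'
              ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       *> The shape of every record in the file.
       FD TRANSACTIONS.
       01 TRANSACTION-STRUCT.
          02 TXN-ID     PIC 9(5).
          02 TXN-DESC   PIC X(25).
          02 TXN-AMOUNT PIC 9(6)V9(2).
       WORKING-STORAGE SECTION.
       *> One record with the same structure, filled with data.
       01 WS-TRANSACTION.
          02 WS-TXN-ID     PIC 9(5) VALUE 12345.
          02 WS-TXN-DESC   PIC X(25) VALUE 'TEST TRANSACTION'.
          02 WS-TXN-AMOUNT PIC 9(6)V9(2) VALUE 100.50.
       PROCEDURE DIVISION.
           *> OUTPUT mode re-creates the file every time.
           OPEN OUTPUT TRANSACTIONS.
           *> Copy our record into the file record and write it.
           WRITE TRANSACTION-STRUCT FROM WS-TRANSACTION.
           CLOSE TRANSACTIONS.
           STOP RUN.
```

After running it, transactions.txt should appear in the current directory. With ORGANIZATION IS SEQUENTIAL the record is written as a fixed-length block with no newline; GnuCOBOL also accepts ORGANIZATION IS LINE SEQUENTIAL if you would rather get ordinary text lines.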
So in COBOL you need to specify the file and what kind of file it is in the INPUT-OUTPUT SECTION. Then you need to specify what kind of records are in your file. Then you need to create such a record with the exact same structure. Then in the PROCEDURE DIVISION you open the file (see open modes for details). Then, when writing, you specify what kind of record you are adding and the record itself. Here we have opened our file in OUTPUT mode, which always re-creates a file when you open it even if it already exists. You can make writing a file take 100 lines of code in COBOL so I tried to keep it as tight as I know how. This is a nice guide on sequential files in COBOL. To understand the syntax of WRITE, here are the IBM docs.
Resources for mainframe programming
If you are interested in legacy systems and COBOL you will probably want to play around on a mainframe. They are hard to use and look something like this:
The below resources may be helpful:
https://medium.com/@bellmar/mainframe-on-the-macbook-51bc1806d869
https://www.youtube.com/watch?v=Uv7ThVwb7m8 (programming mainframe cobol)
http://www.csis.ul.ie/cobol/examples/default.htm (more cobol examples)
http://www3.sympatico.ca/bredam/GoodBadUgly.html (overview of cobol quirks)
https://devops.com/the-beauty-of-the-cobol-programming-language-v2/ (another programming tutorial with cobol)
https://github.com/mickaelandrieu/awesome-cobol (cobol software)
Conclusion
COBOL is interesting. I think it's really fascinating that a language like this has been around since the 1950s in some form, and to be honest it will probably be around for the foreseeable future. It's probably useful for some folks to have a grasp on the basics. It has some obvious issues though; it is extremely verbose and the documentation is a bit scattered. If you are interested in getting more content like this you can sign up for my newsletter Generation Machine. As always have a nice day.