Way back when, in the 1970's and 1980's, I programmed computers in COBOL (among other languages). COBOL, or Common Business Oriented Language, ruled the commercial computing world back then, with millions upon millions of lines of code written every year. Many of the older, so-called "legacy" systems that are the computing backbone of large businesses were written back then, and have been maintained and updated (with greater and greater difficulty) ever since.
I'll give you a brief example. In the early 1980's I wrote the so-called "price book" program for a big oil company in South Africa. This program calculated the current price for every single product they sold - something like 4,000 different items, at the time. I wanted to make it modular, with a separate program for every major group of products, to simplify maintenance and reduce the amount of time it would take to update the thing. I was told that would not be acceptable: I had to write a single monolithic program, calling in sub-programs and sub-routines whenever needed for specific product groups. I did as instructed, and produced a program with several thousand lines of code. I had to file detailed work flow diagrams for each major product section, showing which routines were called from outside and which of those routines served multiple product sections, so that a modification to one of the latter might affect one or more of the others (all of which cross-references had to be tested after every modification). I warned at the time that this might become a maintenance nightmare, due to its sheer size and the number of modifications and updates that might be required. I was told to shut up and get on with the job. I obeyed orders.
When last I checked, some years ago, that price book program had ballooned in size to well over double the original length, and a team of three programmers were kept busy doing nothing but maintenance updates. So many of the latter came in that one programmer would be finalizing an update and putting it into production, another would be midway through implementing another, and a third programmer would be picking an urgent change order out of the incoming pile to start yet another update. The various additions, edits, alterations and mutations had added so many odd lines and sections to the coding that my original flow charts were almost unusable.
That's COBOL for you. It was a very strong, stable business computing language, but it was clunky and maintenance-intensive. Trouble is, very few people use COBOL today. The old farts like myself who remember it are mostly retired and dying off, and the youngsters today want to use the "cool stuff" like C++ and other fancy tools. I'm informed that a skilled, experienced COBOL programmer can almost write his own ticket in data processing today, just to keep the legacy systems running. I know one formerly retired COBOL senior programmer who was tempted back to work with a package of well over $150K per year. He smiles all the way to the bank.
That being the case, Anthropic claims to have developed an AI system that can update legacy COBOL systems, convert them into modern coding languages, and replace the old systems with the new. This means it will have to read and understand COBOL (not always an easy task with rambling, much-modified legacy coding), break it down into more understandable chunks, rewrite each chunk in modern computer languages, test them for accuracy, produce easy-to-understand flow charts or other technical documentation for each of them, and gradually replace all that clunky stuff with modern software.
That ability is very long overdue. If it works (and I'll want to see lots of evidence of that before trusting its output), mainframe computer manufacturers such as IBM may be drastically affected, because modern codes and languages can run efficiently on much smaller computers. Modern computer hardware is being optimized for such use: see heterogeneous computing for more information. The Apple computer with its M-series chip that I'm using to write this article is just one example.
If you'd told me all those years ago that the COBOL code I was writing would still be in production use over half a century later, I'd have laughed at you. Guess COBOL and corporate inertia are having the last laugh . . .
Peter
The last programming language I ever used was FORTRAN. Used to be that most scientific and engineering code was FORTRAN. Nowadays? I have no idea what they use.
For the most part, it's Python and/or MATLAB, at least where I have worked in academia as a staff person. I did take FORTRAN as an EE in the early 1990s and never used it.
I used Fortran in college and remember long lines at the index card reader to run the program.
COBOL did a good job at what it was originally designed for. It's possible to write code that is clunky and maintenance-intensive in just about any language. Anyone claiming otherwise is ignoring human ingenuity when it comes to being "original", as well as management's ability to turn molehills into mountains. One of my sundry jobs was COBOL programmer for a reporting team. Management's insatiable demand for more and more reporting...nightmare material.
The largest problem with updating any legacy system is the lack of a robust test suite.
No doubt there wasn't any such thing for your system, because higher would have seen it as a waste of time and money (like all the other things you suggested).
The first thing they should do with AI on any legacy system is to generate such a test suite. You *must* validate it, but that's like checking any AI-written code; it's probably much faster than writing the code in the first place.
Though tracking down what's supposed to happen on seldom-executed code paths may take a while.
Once you have confidence that you know just how the system is currently *supposed* to behave, it's much easier to change it. Either with AI or with humans writing the code.
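To make the test-suite idea above concrete, here is a minimal sketch of a "characterization" (golden master) test: record what the existing code does today, then check any rewrite against that recording. Everything named here (`legacy_price`, `new_price`, the product codes and prices) is hypothetical and stands in for a real legacy routine; it is an illustration under those assumptions, not a description of any particular tool.

```python
import json

def legacy_price(product_code, quantity):
    """Hypothetical stand-in for the legacy routine being characterized."""
    base = {"DIESEL": 1250, "PETROL": 1400}.get(product_code, 0)  # cents per unit
    total = base * quantity
    # Seldom-executed path: an undocumented 5% bulk discount.
    if quantity >= 1000:
        total -= total // 20
    return total

def new_price(product_code, quantity):
    """Hypothetical rewrite that silently dropped the bulk discount."""
    base = {"DIESEL": 1250, "PETROL": 1400}.get(product_code, 0)
    return base * quantity

def record_golden(func, cases):
    """Run func over every case and capture its current behaviour."""
    return [{"args": list(args), "expected": func(*args)} for args in cases]

def check_against_golden(func, golden):
    """Return (args, expected, actual) for every case where func disagrees."""
    mismatches = []
    for entry in golden:
        actual = func(*entry["args"])
        if actual != entry["expected"]:
            mismatches.append((entry["args"], entry["expected"], actual))
    return mismatches

cases = [("DIESEL", 10), ("PETROL", 999), ("PETROL", 1000), ("UNKNOWN", 5)]

# Round-trip through JSON, as the recording would be if saved to disk.
golden = json.loads(json.dumps(record_golden(legacy_price, cases)))

print(check_against_golden(legacy_price, golden))  # legacy agrees with itself
print(check_against_golden(new_price, golden))     # rewrite differs on the bulk case
```

The recording only proves what the system does, not what it is supposed to do, so each recorded case still needs a human to confirm it. The seldom-executed paths are exactly the ones the case list is most likely to miss.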
COBOL systems definitely need an update but to make it work you're going to have to set up a parallel architecture for Anthropic to test.
You can't screw up financial records and just shrug and move on, and even then there's a huge set of subsidiary systems and companies that work as pass-through to COBOL mainframes.
Hard to overstate the changes and shifts this will cause, even if you had faith in banking IT, which you shouldn't.
I spent years maintaining programs in IBM Mainframe Assembler. One speed-bump to the AI verification suite that I can see is that the source code can tell you what the program does--it cannot tell you why it does it. Legacy systems may be jumping through hoops to fulfil a requirement for a client that is long gone. Another (still current) client may be depending on some feature that is an apparent error. I can recall sitting in on a discussion of such a case. The higher-ups had found a case in a piece of code where a cut-off value seemed strangely chosen. I pointed out that (from the comments) this area of the code had not been updated in five years. Did we really want to shoot from the hip to change this value? That cooled the urgency of this update quite a bit.
Thirty, forty years ago, we were saying that there were COBOL and Fortran programs running which had been compiled before the current batch of programmers were born.
It is possible to write spaghetti code in any language, and conversely it is possible to write structured code too. Back in the 90s I was working on an aerospace project and we had to use FORTRAN, but I was senior enough to force us to write fully modular code. We had separate code blocks for orbital mechanics, solar array performance (with thermals), battery performance, and power distribution.
It sounds like your manager at the time was more interested in protecting his turf than following proper coding practices.
Oddly specific, yet I have seen Spacecraft Energy Balance done in VBA, broken up the same way.
That's how I was taught to code back in the late 80's, early 90's. That seems to have been forgotten / thrown out. We are using a new(er) construction management software system and the useless id10t "programmers" tell us part of the reason things take so long to get done is "they don't have a rounding function, so they have to make one every time". It's Python, where the base round() function was made for accounting and thus does "round to even", and we are doing engineering, so federal statute says we round the way everyone is taught in grade school. Any competent programmer, programming manager, or programming company should have their own internal libraries so everyone is doing the same things the same ways. Hell, when I was first being taught programming my freshman year of high school (BASIC on an Apple IIc) I couldn't call in outside libraries in that system, but I had a file of all my subprograms I just copied and pasted into whatever I was trying to do. My rule was anything I had to do more than twice in a program got turned into a subroutine or sub-program.
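For what it's worth, the shared rounding helper described above is only a few lines. Python's built-in `round()` sends exact ties to the nearest even value; the standard-library `decimal` module offers `ROUND_HALF_UP`, which sends ties away from zero, the way most of us were taught. The sketch below is one way to wrap that. The name `round_half_up` and the choice to go through `str(value)` are my assumptions, not anything from the system under discussion.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places=0):
    """Round to `places` decimals, with ties going away from zero.

    Converting through str(value) uses the number as it is printed,
    which sidesteps most binary floating-point surprises (an assumption
    that suits hand-entered engineering values).
    """
    quantum = Decimal(10) ** -places  # places=2 gives Decimal('0.01')
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return float(rounded)

print(round(2.5), round_half_up(2.5))            # built-in vs. helper on a tie
print(round(2.675, 2), round_half_up(2.675, 2))  # tie at the second decimal
```

Put once in an internal library, a helper like this gives everyone the same rounding rule instead of a fresh reinvention in every module.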
I retired last Summer from a software firm I had been with for 25 years. Most of the code I wrote for them was modular (i.e. Object Oriented).
Just before they told me my services were no longer needed, I was astonished to learn that others of the software team were preparing to update several HUNDRED pages of the web app because they wanted to change the way users were selected.
"What? We wrote an object you could drop onto any page to select users. It created a common user interface that was easy to maintain."
Oh, we replaced that ages ago!
It was a good time for me to retire.
I sure wouldn't let an AI rewrite the programming for any critical systems without an extensive dry run test in an isolated environment.
My engineering software package that I write and sell for a living is 800,000 lines of Fortran 77 code and 650,000 lines of C++ code. I never could get my Fortran 77 code to work with the Fortran 90 compilers so I have been limping along on a 1990 F77, C, and C++ compiler from Watcom. I am now biting the bullet and converting my Fortran 77 code to C++ using a customized version of F2C that I modified to do about 80% of the work for me. I have about 150,000 lines of the F77 code converted to C++ now and am converting 5 to 10 subroutines a day when I do not have distractions known as customers. The hardest part is that array indexing starts at one in F77 and at zero in C++. And the character strings are a total rewrite.
Sadly, the 1990 update for the Fortran language started the path of converting from a professional language to a hobbyist language. Every time I mention this, the Fortran language lawyers start screaming like banshees. None of them know what it is like working with over a million lines of code in your desktop software.
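The indexing headache mentioned above comes down to one small mapping. Fortran arrays default to 1-based subscripts and are stored column by column, so the element `A(i, j)` of an array with `nrows` rows sits at zero-based offset `(i - 1) + (j - 1) * nrows` in the underlying storage. The sketch below shows that arithmetic in Python purely for illustration. The helper name `fortran_offset` is hypothetical, and this is not a description of how F2C or the commenter's modified version handles it.

```python
def fortran_offset(i, j, nrows):
    """Zero-based flat offset of the 1-based, column-major element A(i, j)."""
    if not (1 <= i <= nrows) or j < 1:
        raise IndexError(f"subscript ({i}, {j}) is outside a {nrows}-row array")
    return (i - 1) + (j - 1) * nrows

# A 2 x 3 array laid out the way Fortran stores it: column by column.
# Each value is 10*i + j, so 23 is the element at row 2, column 3.
nrows = 2
storage = [11, 21, 12, 22, 13, 23]

print(storage[fortran_offset(1, 1, nrows)])  # element A(1, 1)
print(storage[fortran_offset(2, 3, nrows)])  # element A(2, 3)
```

Keeping the subtraction in one helper, rather than scattering `- 1` through every converted loop, is one way to make off-by-one mistakes easier to find.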
A similar situation happened with me. I wrote so much COBOL code for a Fortune 500 company that it isn't funny. I once wrote a very modular, sophisticated system that hit all three of my self-imposed marks. First, it worked as intended. Second, it was elegant to look at. Third, it was several orders of magnitude faster than the system it replaced. Management didn't like my singular attempt at efficient code and claimed the lesser souls that would come after me would not be able to understand it. It's FREAKING COBOL! It's written in plain English with lots of comments and very descriptive variables. I, too, was forced to turn a tight, fast system into a bloated behemoth that the least common denominator with a 2.0 GPA could maintain.
For quite some time now, I have been telling kids interested in learning to program for a living that they should learn the old legacy languages like COBOL and RPG. I tell them there is more job security working with those languages on mainframe systems than there is working on a PC with whatever the current trendy language is.
The problem with the conversion of all of the old legacy software isn't so much the programming language itself, but the entire operating environment. One of the hurdles in converting from an IBM mainframe is the IMS database if it is used. It is quite unique and does not operate anything like SQL. I did manage to write a program to convert an IMS database dump to an SQL database but the results were messy because it was like turning an apple into an orange, or pounding a square peg into a round hole.
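As a rough illustration of the apple-to-orange problem above: in a hierarchical database a child record belongs to its parent by position, while in SQL that relationship has to be spelled out with key columns. The sketch below flattens a small nested structure into three SQLite tables. The nested dictionaries, table names and columns are all invented for the example; a real IMS dump looks nothing this tidy, and this is not the commenter's conversion program.

```python
import sqlite3

# Hypothetical hierarchical data: customer -> orders -> items.
customers = [
    {"id": "C001", "name": "Acme", "orders": [
        {"order_no": 1, "items": ["bolt", "nut"]},
        {"order_no": 2, "items": ["washer"]},
    ]},
    {"id": "C002", "name": "Bravo", "orders": []},
]

def load_hierarchy(conn, customers):
    """Flatten the parent/child hierarchy into tables linked by keys."""
    conn.executescript("""
        CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            customer_id TEXT REFERENCES customer(id),
            order_no INTEGER,
            PRIMARY KEY (customer_id, order_no));
        CREATE TABLE order_item (
            customer_id TEXT, order_no INTEGER, seq INTEGER, item TEXT,
            FOREIGN KEY (customer_id, order_no)
                REFERENCES orders(customer_id, order_no));
    """)
    for cust in customers:
        conn.execute("INSERT INTO customer VALUES (?, ?)",
                     (cust["id"], cust["name"]))
        for order in cust["orders"]:
            conn.execute("INSERT INTO orders VALUES (?, ?)",
                         (cust["id"], order["order_no"]))
            # seq preserves the child ordering the hierarchy implied.
            for seq, item in enumerate(order["items"], start=1):
                conn.execute("INSERT INTO order_item VALUES (?, ?, ?, ?)",
                             (cust["id"], order["order_no"], seq, item))

conn = sqlite3.connect(":memory:")
load_hierarchy(conn, customers)
rows = conn.execute(
    "SELECT customer_id, order_no, seq, item FROM order_item "
    "ORDER BY customer_id, order_no, seq").fetchall()
print(rows)
```

Even in this toy case the parent key has to be copied into every child row and the ordering has to be stored explicitly, which hints at why a real conversion gets messy.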
The problem with COBOL isn't the language syntax, it's tracking the logic of the program to make changes. Mechanically translating from COBOL to a new language won't fix the logic or add documentation to explain the meaning of what's happening.
It may be that AI can clean up the logic to make it clearer as to what's happening, but it can't add comments to explain what the programmers intended to do (all it can do is explain what the code actually does, and comments like that have very low value).
This may help, if only that so many programmers are afraid of COBOL and would look at this if it was in a more popular language, but it is unlikely to end up requiring fewer programmers to maintain the code going forward.
Slashdot has a very good discussion on this.
Steve
I was probably about the last of the CIS majors at my community college to have to learn COBOL and RPG in about 2000-2001. They had kept them in the curricula due to fears of the "Y2K bug" situation. A year or two later I visited the department and the old monochrome "green screens" and the AS/400 had been replaced with Windows desktops and a Windows server rack. I actually liked writing programs in COBOL better than C, even though a few memorable times I messed up a loop-termination condition and had to run and pound on the glass room with the AS/400 to get the operator to terminate my endless loop!
Ah, yes. Remember the Devil's Dictionary of Data Processing?
INFINITE LOOP: See LOOP.
LOOP: See INFINITE LOOP.
:-)
Any demand today for those who can program 80 column card sorters? I'm available.
Heh. Remember the good ol' days, when your program was on thousands of punched cards in multiple boxes, and if you dropped the box and had to get them back into order, you were better off resigning before they found out?
JPL is still using OPM Cobol for their AS400s which are still used to talk to Voyager! Those programmers are OLD!!!
I kept hoping they would make an Object Oriented version of COBOL, named ADD ONE TO COBOL
ADD ONE TO COBOL GIVING COBOL_PLUS_PLUS
Steve
In 2005 the team I worked with got the job of writing a COBOL-to-Java transpiler - at that time we had a group of managers very much in love with technology and they came up with the brilliant idea. It didn't matter that this type of transpiler would produce sequential code written in Java, which is as useful as hammering nails using a microscope in order to say you used top of the line technology. Refactoring the resulting code was a nightmare because of the lack of documentation and also the less than stellar way the original COBOL was written, but in 2007 we had a running prototype which we could use in a very limited way. That's the moment the management realized that a migration would be too costly anyway, so they killed the whole project. About a year later the open source community came up with NACA, which was a reasonable effort in the same direction and which we could have used to give the management the same idea of how costly a migration would be. Talk about Sisyphus type of assignments!
A standard, once established, lasts forever.
"High level" programming languages were created in the late 1950's.
The first was FORTRAN. It's still in use.
The second was COBOL. It's still in use.
The third was LISP. It's still in use.
The fourth was ALGOL. It was never really used. However, 99% of the languages created since then are based on it.
C# << Java << C++ << C << B << BCPL << CPL << ALGOL.
Think COBOL skills are $$? BAL skillz are twice as $$!!
Ah, yes: 'fixing COBOL.' Anyone remember CASE tools??
I learned COBOL from Col. John Pickett. He was on Grace Hopper's team back in the 50's.
I'm a bit younger but in high school we programmed in Pascal. Then in college we used Pascal, as well as some weird language for CNC machines. Kind of glad I chose welding vs programming.
Exile1981
> "High level" programming languages were created in the late 1950's.
> The first was FORTRAN. It's still in use.
> The second was COBOL. It's still in use.
> The third was LISP. It's still in use.
> The fourth was ALGOL. It was never really used. However, 99% of the languages created since then are based on it.
My father programmed an IBM 360 in the 1960s for Shell for controlling a refinery catalytic cracker using Algol for his PhD thesis in Chemical Engineering from Princeton. The code was moved to Fortran in the 1970s for the IBM 370.