The Daily WTF: Curious Perversions in Information Technology

2012-12-10 Reply Admin

There are worse ways to do that.

Steve The Cynic · 2012-12-10 Reply Admin

Dion:
There are worse ways to do that.

Much worse. The fact that it is done by generated code is a consequence of the fact that Java lacks the preprocessor of C/C++, and thus cannot populate string variables with __DATE__. A pathetic mis-use of article space.

1/10 could do almost infinitely better.

2012-12-10 Reply Admin

Actually, I'm doing essentially the same in Ant. I could be generating into a resource file instead of into a class file, but meh. Doesn't do any serious harm either, so you never come back and fix that, nor is it a real WTF. At best, a mini-WTF, a WTF-let :-)

2012-12-10 Reply Admin

+1 yes. I do just this - only in Perl invoked from the Ant build script. Mine is a little more elaborate, in that it pulls a file revision list from Perforce, then builds a "Version.java" for compilation into the project. In this way, even if your configuration control is all to pot, you can always find out what source was used to generate a particular program. It can also generate a "Version.html" for dropping into the "web" directory of a Tomcat project, to give a web page with the build info on it. OP (I think) needs a gentle introduction into the idea of configuration control being your friend, not a chore.

2012-12-10 Reply Admin

Wait...what..this doesn't seem like a WTF. It seems like good practice to automatically tag builds.

But maybe I've just gone too deep into old code and lost sight of the way back

2012-12-10 Reply Admin

The WTF is that this is a process known only to the lead architect and only accidentally discovered by Brian.

2012-12-10 Reply Admin

Visual Studio does much the same.

2012-12-10 Reply Admin

This is a WTF people.. Using a .properties file included in the build would make a lot more sense.

Code generation shouldn't be a part of your build unless it really is the only option.

Melnorme · 2012-12-10 Reply Admin

At the very least, couldn't the script create TagFile.java directly?

2012-12-10 Reply Admin

No, the appropriate thing is to add information to a jar's MANIFEST.MF file, and to access the information in code through calls to java.lang.Package. But that would require the programmers to actually think.
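For readers who have not used the manifest route, here is a minimal sketch of what it could look like. The class name ManifestBuildInfo and the custom attribute names Built-By and Build-Date are assumptions for illustration; Implementation-Version is one of the standard main attributes, and Package.getImplementationVersion() only returns a value when the class was loaded from a jar whose manifest sets it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

// Hypothetical helper, not taken from the article.
final class ManifestBuildInfo {

    // Standard route: ask java.lang.Package for the Implementation-Version
    // of the jar that the given class was loaded from.
    static String implementationVersion(Class<?> anchor) {
        Package pkg = anchor.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return (version == null) ? "unknown" : version;
    }

    // Custom attributes (names assumed here) can be read from the manifest text itself.
    static String mainAttribute(InputStream manifestStream, String name) {
        try {
            Manifest manifest = new Manifest(manifestStream);
            String value = manifest.getMainAttributes().getValue(name);
            return (value == null) ? "unknown" : value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a build-generated META-INF/MANIFEST.MF.
        String text = "Manifest-Version: 1.0\n"
                + "Implementation-Version: 1.4.2\n"
                + "Built-By: build\n"
                + "Build-Date: 2012-12-10 08:26\n\n";
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        System.out.println(mainAttribute(in, "Built-By"));
        System.out.println(implementationVersion(ManifestBuildInfo.class));
    }
}
```

Note that the manifest is still generated per build; only the kind of file changes, not the fact that something is generated.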

Remy Porter · 2012-12-10 Reply Admin

Apparently the real WTF is that people think generating executable code as part of your build step is a good way to version a build.

I don't care what language you're using- version information lives in a non-executable file. If you want code modules that read that file for ease of use, great. A -version flag is a great thing.

Cbuttius · 2012-12-10 Reply Admin

I'm not expert enough to know how this is normally done in Java but it doesn't look the most WTF'y thing to me.

I know code generation was very popular a number of years ago, especially in the days before generics and even to some extent in C++. That it is generated automatically rather than hand-built for every project obviously makes it less of a WTF.

2012-12-10 Reply Admin

OK I'm not enough of a Java guy to know The Right Way To Do It, but if all this ultimately does is print a string, couldn't this at least be simplified to merely generate the code that prints the string? You don't need all those static attributes and getters.

As far as the concept of having your build script generate code, I don't have a fundamental problem with it. The more work you can automate, the better.
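A rough sketch of that simplification, assuming the generator only needs to emit a class that prints one line. TagSourceGenerator and its method names are hypothetical, and the message format is made up; only the TagFile name comes from the discussion.

```java
// Hypothetical generator: emits Java source that prints a fixed build tag,
// with no static attributes or getters in the generated class.
final class TagSourceGenerator {

    static String generate(String className, String builtBy, String builtAt) {
        String message = "Built by " + builtBy + " at " + builtAt;
        return "public final class " + className + " {\n"
                + "    public static void main(String[] args) {\n"
                + "        System.out.println(\"" + escape(message) + "\");\n"
                + "    }\n"
                + "}\n";
    }

    // Keeps the generated string literal valid if a value contains
    // backslashes, quotes or line breaks.
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    public static void main(String[] args) {
        System.out.print(generate("TagFile", "build", "2012-12-10 08:26"));
    }
}
```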

2012-12-10 Reply Admin

What's the WTF? Sure, doing it in a way native to the build system is preferable if there is an easy way to do so, but there might not have been when this build system was created.

Remy Porter · 2012-12-10 Reply Admin

Because generating code is always superior to generating a human-readable data-file that can be parsed by stock code?

2012-12-10 Reply Admin

Looks like a great way to sneak arbitrary code into someone else's program! :)

That's what it's for right?

2012-12-10 Reply Admin

I used to have something like this before replacing it with Hudson -> Ant -> Manifest file generation with build/version/CVS details - works perfectly.

2012-12-10 Reply Admin

I do something similar with svn - I have a Perl program that gets the current build # and date, and generates a Java class with bean-like methods. Using Java to generate Java is overkill, but I can see this tag thing being written with the intent to expand it to svn later on.

2012-12-10 Reply Admin

Why a .properties? In a Java JAR this info belongs in the META-INF/MANIFEST.MF file.

Steve The Cynic · 2012-12-10 Reply Admin

Remy Porter:
Apparently the real WTF is that people think generating executable code as part of your build step is a good way to version a build.
I don't care what language you're using- version information lives in a non-executable file. If you want code modules that read that file for ease of use, great. A -version flag is a great thing.

It's not versioning the build. It is identifying the build based on when it was built and by whom. That allows me to distinguish today's build from yesterday's build of the same sources (maybe I updated the compiler rather than the source code), and official builds (built by user "build") from unofficial builds (built by other users). If I change the compiler due to a fix for a bug, then I need to be able to say that all builds before date X must be treated with suspicion, and a build date stamp is the way to go. (No, it isn't the only way to do this, but there is a psychological advantage in having a date stamp rather than a build number. A date stamp is obviously tied to a particular real-world date, while a build number is not.)

Doing it by a two-stage compilation process is a bit arcane, but not noticeably worse than e.g. processing lex/yacc inputs at build time or compiling .IDL files into proxy-stub DLLs for old-style DCOM projects. If you use the "manifest.mf" approach suggested by another poster, you replace code generation with manifest generation (since the contents of the file vary from build to build), but you don't really dissipate much of the WTF. (Assuming there's much WTF there in the first place...)

Oh, and the "file" that contains the version information must be tightly bound to the code. A Windows .RC file, or the manifest.mf file in a jar, or the manifest information in a .NET assembly, or even a build-time-generated code file all serve this purpose. A random text file stashed alongside the executable code does not.

A major chunk of WTF in the case of the article is that the process is not adequately documented, but that may be a failing of the submitter, in that he didn't seek the info. (Or it might really be undocumented, in which case the lead architect deserves a good slapping.)

Tankster · 2012-12-10 Reply Admin

Alex:
This is a WTF people.. Using a .properties file included in the build would make a lot more sense.
Code generation shouldn't be a part of your build unless it really is the only option.

why not?

2012-12-10 Reply Admin

This is not a versioning system. Search for the word "version" in the code, you won't find it. It doesn't even assign a build number, which might be seen as the last digit of a version number.

This tags builds with who built it and when. I am also not Java-savvy enough to know if there are better ways to achieve this, but it seems to get the job done, and it is not a replacement for whatever proper way of assigning versions to the product.

So what's the WTF exactly?

Remy Porter · 2012-12-10 Reply Admin

Because there should be a direct mapping of the code source control and the code in the build. By adding a code-generation step to the build process, you break that connection.

Even Microsoft's tools generate code pre-compilation so that you actually check the generated code into source control. Now, in this case, someone was going back and checking these TagFile.java files back into source control, which is also pretty terrible, since they're going to be replaced every build, unless the process checks it back in automatically, and then you have this metadata file in source control that doesn't actually mean anything and argh.

I oversee the build process in my company, and I would never want to see something like this.

Remy Porter · 2012-12-10 Reply Admin

Date + builder sounds like a version to me, but that's needless pedantry. The issue isn't generating a file to document the build information, the issue is generating executable code to document the build information.

Instead, every project should include a "ReadVersion" class, which handles requests for build version information, and the build process should generate a text file which ReadVersion can parse.

Now the code in source control matches exactly the code in the build, and you've also got a human-readable fallback for build tagging.
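A minimal sketch of what such a ReadVersion class might look like, assuming the build writes a plain key=value text file. The keys built.by and built.at and the describe() format are invented for illustration; the proposal above does not specify a file format.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Checked-in, never-regenerated code that parses a build-generated text file.
final class ReadVersion {
    private final Properties props = new Properties();

    ReadVersion(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Key names are assumptions; missing keys fall back to "unknown".
    String builtBy() { return props.getProperty("built.by", "unknown"); }

    String builtAt() { return props.getProperty("built.at", "unknown"); }

    String describe() { return "Built by " + builtBy() + " at " + builtAt(); }

    public static void main(String[] args) {
        // Stand-in for the file the build process would write.
        Reader file = new StringReader("built.by=build\nbuilt.at=2012-12-10 08:26\n");
        System.out.println(new ReadVersion(file).describe());
    }
}
```

In a real project the Reader would wrap a resource packaged inside the jar, which keeps the data bound to the code it describes.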

caffiend · 2012-12-10 Reply Admin

AFAIK TeamCity does something remarkably similar if you're using its "Patch Assembly Version" feature on a C# project. I think it modifies the AssemblyInfo.cs before compilation.

Though it seems like a goofy solution, it hasn't caused my team any trouble. It successfully stamps the build with the marketing department's arbitrary major version number and the version control system's revision number, which is a lot better than what happens without it.

Addendum (2012-12-10 08:26): If you're from oldskool Windows Land, I hope you'd agree it's a lot better than using the venerable "stampver" tool and having to make sure that everyone set the version string to "000.000.000.000" to avoid sometimes corrupting the PE header.

Oh the good old days

Addendum (2012-12-10 08:41): Which, looking back on it would have failed catastrophically when we got up to Version 101.605.212.45289012. How short sighted of us.

2012-12-10 Reply Admin

It seems simpler to add the data to the file from the class that gets it, rather than generate an executable to give the information out. One of the files is bloat.

caffiend · 2012-12-10 Reply Admin

I don't think there is actually a good solution to this problem on any platform. Since I'm not experienced with commercial software development on platforms other than Windows, I won't pretend to be an expert. AFAIK, no other platform attempts to embed version information in an executable in a standardized way (I could be wrong, but nobody has said so yet).

On Windows, the PE header uses a string to store the version information. The length of the string is determined at compile time, and its contents can be modified after the fact, so long as the resultant string is not bigger than the string that was dimensioned at compile time. If it is bigger, random bad stuff happens when your version string overwrites something useful in the bytes just after it (but not always: depending on whether or not your version string ended on a 4-byte boundary, it is padded with zeros).

If you're using .NET, there are two version numbers (who knows why), the assembly version number (which is a .NET metadata thing) and the PE version number, which sits in the traditional spot in the PE header, with all the aforementioned problems. The two are completely disconnected and can be different (again, why?). The OS sometimes looks at the PE version number and sometimes looks at the assembly version number depending on whether you're trying to install it in the GAC or whether or not you're trying to get MSI to update the installed version of a dll side-by-side to the application (again, why?).

In .NET land, Microsoft allows the developer to set the version number in code using the AssemblyInfo.cs file, which is auto-generated with Visual Studio projects... Awesome, you set the attribute in the code and all the version numbers come out the same.

So the most pragmatic solution for an automated build process seems to be to dynamically change this version number. What's the big deal?

This looks like a Java-based attempt at the same, except it isn't really needed if all it's trying to do is output some text in response to being called with the "--version" argument.

So I guess that is TRWTF?

Steve The Cynic · 2012-12-10 Reply Admin

Remy Porter:
Because there should be a direct mapping of the code source control and the code in the build. By adding a code-generation step to the build process, you break that connection.
Even Microsoft's tools generate code pre-compilation so that you actually check the generated code into source control. Now, in this case, someone was going back and checking these TagFile.java files back into source control, which is also pretty terrible, since they're going to be replaced every build, unless the process checks it back in automatically, and then you have this metadata file in source control that doesn't actually mean anything and argh.

I oversee the build process in my company, and I would never want to see something like this.

Um, but there is a 1-to-1 correspondence between what's in source control and what's in the built code. It's just more indirect than normal.

"Indirect"? Indeed. Unless you code directly in binary machine code or JVM byte code, there's an indirect relationship between the input files and the final release package. A collection of .java files is processed by javac, and the .class files are bundled together with appropriate metadata and other resources to produce one or more .jar files. We've just modified the process so that one of the inputs to one or other part of the process is generated before javac and/or the .jar builder gets hold of it. This allows the embedding of per-build variable data without having to check that one-time-only data into source control. In C or C++ you can generate build date/time information by using the predefined __DATE__ and __TIME__ macros, and therefore don't need to do this sort of trickery. You could do something similar in a Windows .rc file, since that is also processed using the same preprocessor.

But either way, you have an invariant part (GenerateTagFile.java in the article) that generates per-build-variable data in a way that gets it included in the build without having to check it in for every official build. As others have mentioned, there may be better ways of doing this in the particular case of Java, but the fundamental issue remains: you want build date/time/culprit data in the build process's output, and you don't want to have to check that data into the process every time. You have limited choices.

Another approach is to have a two-stage build process that does a clean checkout from a suitably-encoded source control label (or other tagged version-set), and generate a build-label-file as part of the check-out process. Once that's done, you proceed to the actual build.

The end result is the same. You know (or can derive) the date, time, and culprit of the build, and you have source-control information on which versions of what files were included. So in effect, we are merely arguing about the exact nature of the build-label-file.

2012-12-10 Reply Admin

Dion:
There are worse ways to do that.

Yeah - a better way may involve pushing the data fields and methods down to a base class, and having each individual tags.java file inherit from that class, but with the constructor providing the location-specific details for each subclass.

So, a quasi-WTF. The architect could've architected this bit better.
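One way to read that suggestion, as a sketch only: the fields and accessors live in a hand-written base class, and the generator emits nothing but a tiny subclass whose constructor passes in the per-build values. The names BuildTag and ServerTag, the "location" field, and the literal values are all hypothetical.

```java
// Hand-written and checked in once: holds the fields and the accessors.
abstract class BuildTag {
    private final String builtBy;
    private final String builtAt;
    private final String location;

    protected BuildTag(String builtBy, String builtAt, String location) {
        this.builtBy = builtBy;
        this.builtAt = builtAt;
        this.location = location;
    }

    String getBuiltBy() { return builtBy; }

    String getBuiltAt() { return builtAt; }

    String getLocation() { return location; }

    String describe() {
        return location + ": built by " + builtBy + " at " + builtAt;
    }
}

// The only part a generator would still have to emit for each location.
final class ServerTag extends BuildTag {
    ServerTag() {
        super("build", "2012-12-10 08:26", "server");
    }
}
```

The generated file shrinks to one constructor call, so there is far less emitted code that can go wrong.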

2012-12-10 Reply Admin

If you didn't have code generating code, you would not have the potential for a glitch to suddenly give your computer artificial intelligence.

2012-12-10 Reply Admin

what happened to Nagesh?

Steve The Cynic · 2012-12-10 Reply Admin

asdf:
what happened to Nagesh?

Speak not the name of the Spawn.

2012-12-10 Reply Admin

Got bored, probably. People stopped biting.

2012-12-10 Reply Admin

I just solved this problem today in one of my projects. With Maven it's deceptively simple. Add this to your pom.xml:

<properties>
   <build.timestamp>${maven.build.timestamp}</build.timestamp>
   <maven.build.timestamp.format>yyyy-MM-dd HH:mm</maven.build.timestamp.format>
</properties>

<build>
  <resources>
     <resource>
        <directory>src/main/resources</directory>
        <filtering>true</filtering>
     </resource>
  </resources>
</build>

Then just make sure you have some properties file:

# build.properties
build.timestamp=${build.timestamp}

Load the properties file in your code, and access the build timestamp at runtime. So easy; if you're building with Maven.
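The loading step might look roughly like this. It is a sketch that assumes the filtered file ends up at the root of the classpath as build.properties; the class name BuildTimestamp and the "unknown" fallback are assumptions. If resource filtering did not run, the value is still the literal placeholder, so the sketch treats that case as unknown too.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical runtime reader for the filtered build.properties resource.
final class BuildTimestamp {
    static final String RESOURCE = "/build.properties"; // assumed location
    static final String KEY = "build.timestamp";

    // Looks the resource up on the classpath; "unknown" if it is missing.
    static String fromClasspath() {
        try (InputStream in = BuildTimestamp.class.getResourceAsStream(RESOURCE)) {
            return (in == null) ? "unknown" : parse(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the timestamp from properties text.
    static String parse(InputStream in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(KEY);
        // An unfiltered file still contains "${build.timestamp}".
        if (value == null || value.startsWith("${")) {
            return "unknown";
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println("Build timestamp: " + fromClasspath());
    }
}
```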

2012-12-10 Reply Admin

Seriously how hard is it to use a tool like Hudson and generate a Manifest with this information.

2012-12-10 Reply Admin

As other people have already pointed out, the way to do this "by the book" in Java is to write a proper MANIFEST.MF file with any kind of version information you want it to have. That can be read out when the software is running in order to provide version info to the user, and most of all it can also be read out by other tools you may use to manage your software, like deployment tools and stuff, since there's a number of standard elements in MANIFEST.MF files which define - among other things - a standard way to tag version numbers and build dates.

If you even start to modularize your software, so it doesn't all end up in one big jar, but multiple of them, which may even be built individually and used in various combinations, you'll really start to enjoy having a proper version number injection in your build process which tags each single module/jar of your software with version and build numbers, dates etc. - one of the hells to be avoided on the path to modularization is version hell, in which you don't really know which different versions of your stuff you've got running together and thus can't track problems down to specific versions of modules. MANIFEST.MF files, properly used, enable you to provide detailed versioning information for each of your build artifacts individually; you may even add automatic checks to ensure that only specific versions of modules can actually run together, such as only those with the same major and minor version number.
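The "same major and minor version" check mentioned above could be sketched like this, assuming plain dotted numeric versions such as 1.4.2. ModuleVersions is a hypothetical name, and how the version strings get read out of each jar's manifest is left out.

```java
// Hypothetical compatibility check between two module version strings.
final class ModuleVersions {

    // Modules may run together only if major and minor numbers match.
    static boolean compatible(String a, String b) {
        return majorMinor(a).equals(majorMinor(b));
    }

    // Normalizes "1.4.2" to "1.4"; rejects anything without numeric major.minor.
    static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected at least major.minor: " + version);
        }
        // Integer.parseInt throws NumberFormatException on non-numeric parts.
        return Integer.parseInt(parts[0]) + "." + Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(compatible("1.4.2", "1.4.9"));
        System.out.println(compatible("1.4.2", "1.5.0"));
    }
}
```

Qualifiers such as "-SNAPSHOT" in the minor position are deliberately rejected here; a real check would need a policy for them.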

2012-12-10 Reply Admin

The real WTF, then, is:

import java.util.*;

when it really should be invoking just the classes required.

I would also take issue with use of "Hashtable" - it ought to be HashMap, using Map as the interface. Anything else is simian.

Oh, and somehow the code generator knows how to generate that first line using sb.append("importjava.util.*;\n\n"); ...
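For illustration, the import and Hashtable points might look like this in practice. TagValues and the key names are hypothetical; the article's actual fields are not reproduced here.

```java
// Explicit imports instead of java.util.*
import java.util.HashMap;
import java.util.Map;

final class TagValues {

    // Declared as Map so callers do not depend on the concrete class.
    // HashMap is unsynchronized, unlike the legacy Hashtable.
    static Map<String, String> of(String builtBy, String builtAt) {
        Map<String, String> tags = new HashMap<>();
        tags.put("builtBy", builtBy);
        tags.put("builtAt", builtAt);
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(of("build", "2012-12-10 08:26").get("builtBy"));
    }
}
```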

2012-12-10 Reply Admin

TRWTF is storing the date/time stamps as strings, instead of using Calendar instances.
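A sketch of the Calendar variant, assuming the stamp uses the yyyy-MM-dd HH:mm pattern seen elsewhere in this thread and is interpreted in the JVM's default time zone. BuildMoment and builtBefore are hypothetical names; the comparison is the "all builds before date X" check discussed earlier.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical holder that turns a build stamp string into a Calendar.
final class BuildMoment {
    static final String PATTERN = "yyyy-MM-dd HH:mm"; // assumed stamp format

    static Calendar parse(String stamp) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        format.setLenient(false); // reject stamps like 2012-13-45
        Calendar moment = new GregorianCalendar();
        try {
            moment.setTime(format.parse(stamp));
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a build timestamp: " + stamp, e);
        }
        return moment;
    }

    // True if the build was made strictly before the cutoff moment.
    static boolean builtBefore(String stamp, Calendar cutoff) {
        return parse(stamp).before(cutoff);
    }

    public static void main(String[] args) {
        Calendar cutoff = parse("2012-12-10 08:26");
        System.out.println(builtBefore("2012-12-09 12:00", cutoff));
    }
}
```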

Sutherlands · 2012-12-10 Reply Admin

caffiend:
I don't think there is actually a good solution to this problem on any platform. Since I'm not experienced with commercial software development on platforms other than Windows, I won't pretend to be an expert. AFAIK, no other platform attempts to embed version information in an executable in a standardized way (I could be wrong, but nobody has said so yet).

As already has been said, for Java, you use the manifest file.

caffiend:
If you're using .NET, there are two version numbers (who knows why), the assembly version number (which is a .NET metadata thing) and the PE version number, which sits in the traditional spot in the PE header, with all the aforementioned problems. The two are completely disconnected and can be different (again, why?). The OS sometimes looks at the PE version number and sometimes looks at the assembly version number depending on whether you're trying to install it in the GAC or whether or not you're trying to get MSI to update the installed version of a dll side-by-side to the application (again, why?).

One is updated any time you update the DLL, the other is updated only when you make a breaking change. When you're installing it in the GAC, if the second number is the same, it will overwrite the old version and the clients will automatically begin using it. If that number is different, it will install it side-by-side and clients will use the version they were compiled against. I'm sure you can understand the need for a version number separate from this...

2012-12-10 Reply Admin

Hey, at least they used a StringBuffer to glom that String together...

PedanticCurmudgeon · 2012-12-10 Reply Admin

asdf:
what happened to Nagesh?

Apparently, based on all the not-a-WTF posts here, Nagesh has been assimilated into the collective unconscious.

2012-12-10 Reply Admin

I'm so proud that my method doIt() has survived so many companies over so long a time. It's coming up on 20 years.

I never worked on Java though. Are you sure this wasn't doit.bat? :)

2012-12-10 Reply Admin

The real WTF is when I had to move from CVS to ClearCase and found that there was no such thing as tag substitution. Really, WTF? So everyone is left to reinvent that particular wheel.

This isn't a WTF, just more code snobbery.

2012-12-10 Reply Admin

Why do so many people want the version information to be in a separate file? If the version information doesn't get included in the executable being built, I don't see what value it adds.

If you distribute the version number and executable as separate files, it is just too easy to replace the executable but keep the version number the same.

2012-12-10 Reply Admin

You can have tag substitution in ClearCase by using suitable triggers. The problem is: What kind of information do you want to embed into your sources? Considering the crazy flexibility of ClearCase views (hint: Config Specs), you will have a hard time just to define a manageable and useful amount of version information.

2012-12-10 Reply Admin

Unless you have a very good reason (see: AI work, working in a language with insufficient abstraction capabilities, etc.), writing code generating code means you're doing it WRONG.

Especially if it's for something as trivial as tagging a build, which is a solved problem.

2012-12-10 Reply Admin

Kasper:
Why do so many people want the version information to be in a separate file?

Overzealous application of "good design principles".

One practice that is normally a good thing is the separation of code and data. For example, if you ran across a program that had to be recompiled every time the host IP address changed, you'd rightly consider that a WTF. If there's information that can vary, it's almost always a good idea to factor that out into a configuration file, database or similar.

That's the thought process. "The versioning information changes constantly. 'Best practices' would be to generalize it out into some sort of data storage system." - Again, usually this would be the right thing to do, but as you mention, the tight coupling between the version information and the actual code is a highly desired feature.

In short, blindly following "best practices" isn't always for the best.

2012-12-10 Reply Admin

As a not-a-programmer, I enjoy the discussions around the borderline WTFs more than anything on this site. It seems to me that this type of article exposes philosophical WTFery rather than objectively bad code.

How closely should you couple build information with your compiled code? If you're going to put the name and date in, why not your compiler version? Or a representation of your entire build environment? Why not include what you had for breakfast last Thursday?

It would be interesting to hear the reasons why this system was implemented in this precise way.

CAPTCHA: decet (cet phasers to kill, then decet them before writing your report)

2012-12-10 Reply Admin

Does nobody place any value on being able to reproduce an older version of the executable from source control?

2012-12-10 Reply Admin

Unless you have a very good reason (see: are actually some kind of god), making blanket statements about methods you prefer not to use means you're doing it WRONG.

I can think of several good uses for code generation. This isn't exactly one of them, but you don't have to be doing AI research to save yourself time and take human error out of the equation.

One Version
