Java Angst

Java is a beautiful, inspiring idea. It offers simplified object-orientation and the promise of cross-platform execution. It has been widely adopted, and there are many libraries, both free and commercial, available for this relatively young language.

Someone interested in developing with Java faces many obstacles, many of which are not easily solved. In no particular order:
-tools
-the language
-the libraries
-project organization
-the community
-synthesis

Creating your first non-trivial application is not easy. The GUI classes are many and confusing. There are some naming inconsistencies. You must know many classes across several packages to get even the simplest GUI program to work. Further, GUI programming often uses advanced syntax idioms.

The problem grows as your needs do. The hierarchy is deep and difficult to understand from Javadocs alone.

As you start developing, you will need access to more and more libraries, each with its own quirks and features.

Perhaps most confusing is making sense of the information out there about Java. There are specifications and implementations. There are competing specifications for a single concept, and there are competing implementations for a single specification, sometimes from the same vendor. This introduces the concept of hierarchical evaluation of software:
-evaluate the concept
-evaluate the spec
-evaluate the implementation

This applies to the 3 major types of software:
-tools
-applications
-frameworks

All of this happens with any software. But usually, it happens inside a single organization, hidden from view. Note also that the concrete deliverables may implement several specifications, and may even address a concept without a particular specification.

For the most part, as a beginner, you can avoid all of that safely. But awareness is necessary if only so that you can filter out what you don't need. This ability to ignore what you don't need is absolutely critical, and also very difficult since at the same time you must actively seek new resources.

As far as resources are concerned, learning the language gets the most attention. Next come the specific libraries, such as Swing and JDBC. Resources can be found in:
web sites
mailing lists
newsgroups
books
magazines
user groups

The synthesis resources include design patterns, integration tutorials, and even certification. This document is a sort of meta-synthesis resource - it describes how to organize the other resources.

====
How to evaluate software
This is something which has plagued me for years. How do I keep track of the software on my system? How should I keep notes? When should I stop evaluating and either uninstall or use it during production? How can I keep evaluations from impacting my main work?

Over time, I have developed a few conventions that have really helped me. When I download, I also record the URL I downloaded from. I download into a folder named after the software and the version, even if this info is already in the zip file name.

For purchased software, I add an Outlook contact for the maker of the software, with an informal entry in the notes section containing things like registration keys and date, place of purchase, and price.

I then create a subdirectory in the installation directory, where I will keep notes and anything else. I call this the same name across all software, "josh". (This makes it easy to identify when backing up.) Something that is *very* handy is TextPad, a text editor that lets you collect documents in a workspace. With one click, I can open all the configuration documents for a program, and perhaps even readmes.

I then read through the documentation, taking notes, concentrating on installation and configuration issues. Later, I will read through it again, looking for usability enhancements and other esoterica.

Finally, I start using the software, using the josh directory for any extra evaluation data I might need. If the software is good, I keep it. If it is bad, I will remove it, and keep my notes on why I didn't like it.

It is very important to stay aware of who made the software and where they can be contacted, even in the case of free software. I use Outlook contacts for this.

======
What my choices are today
-Windows 2000. The OS is an underestimated component of the development environment. Win2k has superior file management capabilities. Further, it provides a platform for some other good tools. I use Ant to interface with file tools in batch mode.
-Cool desk 99. This is a simple desktop switcher that is indispensable. I like to associate desktops with purpose, having the first desktop be a core purpose and getting less focused as the number increases. This type of tool is critical with an SDI interface, and when you have many applications running concurrently (more common with today's cheap memory). Note that many Unices come with this free.
-Netbeans. This is a free tool supported by Sun. It has many flaws, but the GUI designer and editor are first rate. Tends to feel a bit slow. It is written in Java so you are eating your own dog food.
-Ant. This is a free build tool by the Apache Jakarta project. It integrates well with Netbeans and gives me flexibility in my builds.
-JUnit. This is a free unit-testing framework and has Ant integration. By convention, my tests are "black box" in a separate package. This is only moderately useful, and I haven't fully integrated it into my development. It is not useful for GUI testing.
-MySQL with MySQLAdmin. This is a free RDBMS. I also use the MM.MySQL JDBC driver and the Netbeans database explorer.
-Jikes. A free Java compiler from IBM which is fast and gives better error messages than javac from Sun.
-Sun JDK tools. There are about 20 tools that come with the JDK. I use Ant to interface with most of them, such as javac and Javadoc.
-TextPad. Commercial text editor that has some nice features, and is much lighter weight than the Netbeans editor, but much more capable than notepad. I would prefer VisualSlickEdit but it is too expensive.
-MS Office. I use Outlook for email and contacts, Outlook Express for news, and IE as a web browser. I usually do not use Word for writing, unless it needs to be fancy.
-Java libraries. There are 3 things I like to get with any library: the library binaries, the source, and the docs. All three can be mounted in Netbeans and integrated into the development process. Bleeding edge and difficult libraries might have books, websites, mailing lists and newsgroups. In general, my first step is to try to run an example. Next I read the docs and try to create something simple with the library. I ramp up the complexity, keeping notes along the way. Note that even for libraries I will have an Outlook contact.

General Resources:
java.sun.com
javaranch.com
jguru.com
Peter's faq
Roedy's glossary
javaworld (includes forum)
IBM java
google

Books:
JVM spec (online)
JL spec (online)
Java in a Nutshell
Thinking in Java
....

Specialized Books:
Refactoring, Martin Fowler
Design Patterns, GoF
Distributed programming with Java, Mahmoud
Java Servlet Programming, Hunter
Designing software with UML

Each tool has one or more sites. Ant in particular has a lot of disparate webspace devoted to it.

Swing and the JFC form the single most important set of libraries for classical application development. There are many sites devoted to this special topic:
-Sun http://java.sun.com/products/jfc/tsc/

A more topic-oriented approach is called for, as many of these sites are large. For this, Google can be used.

A very important topic is the use of databases with Java in general and Swing in particular. This is more of a design problem, but still should get more discussion than it does.

There are several other important topics:
concurrent multiuser java (threading, java.io, java.net)
security (both platform and network)

A non-traditional area is that of CGI programming. This is a topic unto itself. In general, this is simpler than Swing programming.

Another important area is Java XML support. More and more data formats are expressed in XML, and these libraries are helpful. As of JDK 1.4 they are included in the base JDK. However, there are lots of third-party alternatives.
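To give a feel for the XML support bundled with the JDK, here is a minimal sketch that parses a small document with the JAXP DOM parser and counts elements by tag name. The class name, the method name, and the sample XML are made up for this illustration; only the javax.xml.parsers and org.w3c.dom calls are part of the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of any library.
final class XmlSketch {

    // Parses the XML text into a DOM tree and counts elements with the given tag name.
    static int countElements(String xml, String tagName) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tagName).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrap checked exceptions so callers of this sketch stay simple.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample data invented for the example.
        String xml = "<tools><tool name=\"Ant\"/><tool name=\"JUnit\"/></tools>";
        System.out.println(countElements(xml, "tool"));
    }
}
```

A third-party parser can usually be swapped in behind the same JAXP interfaces, which is one reason to code against them rather than against a specific implementation.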

======
Tools I haven't found:
beautifier (nb reformat is ok)
lint
grep (nb is ok)
java 2 html for sharing code online, nicely.
A good UML CASE tool. Argo and Poseidon are OK. Haven't tried the commercial stuff.
A better front end to MySQL. I want graphical table creation and relationship editing like with MS Access.
A tool to keep track of the software on my system!
An open source VisualSlickEdit clone. Maybe a highly tweaked emacs will do.

Tools I'd like to replace:
Newsreader - Some annoyances. Needs better filtering. I used to use forte and tin. Would like to write my own.
Office - Just don't like MS. Star Office is unusable and buggy.
Win 2k - Linux is ok, but X is terrible, unusable. Plus, some programs I have won't run under Linux, and have no clear replacement (Quicken).


======
There are some topics which are language and platform dependent. How to handle these? These are concepts. Concepts may relate to each other, and compete. However, they cannot be equivalent.

Data structures
Algorithms
User interface
Data consistency across layers
Layered software
Design patterns
MVC
Singleton
Architecture
Modeling languages
UML
OOP
CASE
Concurrent programming
Multiuser programming

Some elements are language-specific, some are independent. This applies to every level of our hierarchy.

=====
Anyway, once you start coding, a few things will become apparent:
1 - There aren't enough components (beans). In particular, there is no date component, or table component. There is no spinner control. There appears to be no single definitive clearing house for JavaBeans.
2 - The components that are there are very complex, out of the box. Text field validation is the classic example. The powerful JTable and JTree are complex, too, but they have justification. On the other hand, Swing has some components missing from other toolkits, such as a pretty good HTML renderer/editor. In the text field case, the complexity can be mitigated by using a specialized third-party subclass.
3 - You may miss the integrated database operations found in Microsoft products, like Access. The design-time support for databases is also lacking (such as dragging a field onto the form to make a bound control).
4 - That said, Swing is much more flexible.
5 - Layout managers are a new concept for many people (me included). There is a steep learning curve here to get reasonable-looking layouts (most require GridBagLayout).
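To make the layout manager point concrete, here is a minimal sketch of a two-row form built with GridBagLayout. The class name, the field labels, and the inset values are assumptions chosen for illustration; the idea is just that each component gets a grid position plus rules for how it grows.

```java
import java.awt.GraphicsEnvironment;
import java.awt.GridBagConstraints;
import java.awt.GridBagLayout;
import java.awt.Insets;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;

// Hypothetical example class.
final class FormLayoutSketch {

    // Builds a panel with one label and one text field per row.
    static JPanel buildForm() {
        JPanel panel = new JPanel(new GridBagLayout());
        GridBagConstraints c = new GridBagConstraints();
        c.insets = new Insets(4, 4, 4, 4);      // padding around every cell
        c.anchor = GridBagConstraints.WEST;     // left-align within the cell

        String[] labels = {"Name", "Email"};    // made-up field names
        for (int row = 0; row < labels.length; row++) {
            // Column 0: the label keeps its preferred size.
            c.gridx = 0;
            c.gridy = row;
            c.weightx = 0.0;
            c.fill = GridBagConstraints.NONE;
            panel.add(new JLabel(labels[row] + ":"), c);

            // Column 1: the text field absorbs extra horizontal space.
            c.gridx = 1;
            c.weightx = 1.0;
            c.fill = GridBagConstraints.HORIZONTAL;
            panel.add(new JTextField(20), c);
        }
        return panel;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            // No display: just report what was built.
            System.out.println("Form has " + buildForm().getComponentCount() + " components");
            return;
        }
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("GridBagLayout sketch");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(buildForm());
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

Reusing one GridBagConstraints object works because the layout copies the constraints when a component is added. Resizing the window shows the weightx setting at work: only the text fields stretch.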

So we are faced immediately with the prospect of creating our own bean, since the common ones aren't available. We must leave this for later: first work with what we have and get a feel for being a client of a bean, and then write our own. (Also, chances are the bean will be a composition of other beans, so learning the premade ones is important, anyway.)

There are two ways to go: elementary or top down. That is, we can examine each bean in detail, or we could try to create something useful first, then worry about details later. There are merits to both, and in practice one does both. One good compromise is to do the elementary thing keeping an eye on what you want to accomplish, allowing you to skip ahead to the juicy parts. (This has the risk that you might not recognize the juicy parts. This risk diminishes as your experience increases.)

Another relatively complex thing that should be simple is resource management - things like icons.
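One common way to handle icons is to load them from the classpath, so they travel inside the jar with the classes. The sketch below assumes that approach; the class name and the /icons/save.png path are hypothetical.

```java
import java.net.URL;
import javax.swing.ImageIcon;

// Hypothetical helper for loading icons packaged with the application.
final class IconLoader {

    // Looks the icon up on the classpath; returns null if it is not there.
    static ImageIcon load(String classpathLocation) {
        URL url = IconLoader.class.getResource(classpathLocation);
        return (url == null) ? null : new ImageIcon(url);
    }

    public static void main(String[] args) {
        // The path is an assumption; a leading slash means "from the classpath root".
        ImageIcon icon = load("/icons/save.png");
        System.out.println(icon == null
                ? "icon not found"
                : "icon loaded, " + icon.getIconWidth() + " pixels wide");
    }
}
```

Checking for null matters because getResource reports a missing file by returning null rather than throwing, which is easy to overlook.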

Finally, there is support for "simplifying" GUI development. In particular, the "Action" stuff.
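As a rough sketch of what the Action support buys you: one Action object can back both a button and a menu item, so the label, the enabled state, and the behavior live in one place. SaveAction and its counter are invented for this example.

```java
import java.awt.event.ActionEvent;
import javax.swing.AbstractAction;
import javax.swing.JButton;
import javax.swing.JMenuItem;

// Hypothetical example class.
final class ActionSketch {

    // A made-up "Save" command; a real one would do actual work.
    static final class SaveAction extends AbstractAction {
        private int invocations = 0;

        SaveAction() {
            super("Save"); // the name becomes the text of attached components
        }

        @Override
        public void actionPerformed(ActionEvent e) {
            invocations++;
        }

        int getInvocations() {
            return invocations;
        }
    }

    public static void main(String[] args) {
        SaveAction save = new SaveAction();
        JButton button = new JButton(save);
        JMenuItem menuItem = new JMenuItem(save);

        // Disabling the action disables every component attached to it.
        save.setEnabled(false);
        System.out.println(button.getText() + ": button enabled=" + button.isEnabled()
                + ", menu item enabled=" + menuItem.isEnabled());
    }
}
```

The payoff grows with the number of places a command appears (toolbar, menu, popup), since none of them needs its own listener or its own enable/disable logic.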

====
Data types

Computers only understand bits. We humans assign meaning to these bits. Fundamentally, we collect the bits and call them numbers, specifying an algorithm to convert between a bit pattern and a number. Going further, we can collect numbers and associate them with letters or dates or pictures.

Letters are particularly interesting because there is generally a 1-1 correspondence between those and numbers, and the mappings themselves differ over time and space. Many mappings exist, so it is important to specify the mapping when dealing with numbers representing letters.
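A small sketch of why the mapping must be specified: the same two bytes turn into different text depending on which mapping is applied. The class name and the byte values are chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class.
final class CodePageSketch {

    // Two bytes, picked because they are valid under both mappings below.
    static final byte[] BYTES = {(byte) 0xC3, (byte) 0xA9};

    static String asUtf8() {
        return new String(BYTES, StandardCharsets.UTF_8);
    }

    static String asLatin1() {
        return new String(BYTES, StandardCharsets.ISO_8859_1);
    }

    // Prints each character as its numeric code so the console encoding does not matter.
    static void dump(String label, String text) {
        StringBuilder sb = new StringBuilder(label + ": " + text.length() + " char(s):");
        for (int i = 0; i < text.length(); i++) {
            sb.append(" U+").append(Integer.toHexString(text.charAt(i)).toUpperCase());
        }
        System.out.println(sb);
    }

    public static void main(String[] args) {
        dump("UTF-8", asUtf8());          // one character, e with acute accent
        dump("ISO-8859-1", asLatin1());   // two characters, one per byte
    }
}
```

Neither result is "wrong" from the machine's point of view; only knowledge of the intended mapping tells you which one the writer meant.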

Dates are interesting too, for different reasons. There are various conventions for time keeping. The existence of time zones and calendars makes things even more complex.
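To illustrate the time zone part, the sketch below prints one and the same instant (a single number of milliseconds) under three time zones. The class name and the chosen zones are assumptions for the example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example class.
final class TimeZoneSketch {

    // Formats an instant, given as milliseconds since 1970-01-01 00:00 UTC, in a time zone.
    static String format(long epochMillis, String zoneId) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        // Note: TimeZone.getTimeZone silently falls back to GMT for unknown IDs.
        f.setTimeZone(TimeZone.getTimeZone(zoneId));
        return f.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long instant = 0L; // the same underlying number every time
        for (String zone : new String[] {"UTC", "America/New_York", "Asia/Tokyo"}) {
            System.out.println(zone + " -> " + format(instant, zone));
        }
    }
}
```

The number never changes; only its printed form does, and here even the calendar day differs between zones.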

Letters, numbers, and dates are all expressible as a string of letters. So in essence, we map from a number to a date to a string. These are two distinct mappings, and it happens in reverse when we write. Interestingly, there is not necessarily a 1-1 correspondence between a letter and its numeric equivalent (its code). That is, the value of "1" is not necessarily 1.
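The last point is easy to check in Java, where a char is a number under the hood. A minimal sketch, with made-up class and method names:

```java
// Hypothetical example class.
final class DigitSketch {

    // The code of the character: the number actually stored for it.
    static int codeOf(char c) {
        return (int) c;
    }

    // The numeric value a human reads from the character, or -1 if it is not a decimal digit.
    static int valueOf(char c) {
        return Character.digit(c, 10);
    }

    public static void main(String[] args) {
        System.out.println("code of '1'  = " + codeOf('1'));   // 49
        System.out.println("value of '1' = " + valueOf('1'));  // 1
    }
}
```

Forgetting the difference is a classic bug: adding the char '1' to an int adds 49, not 1.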

So we see that the primacy of number and string is interlinked. In fact, one could say that the string is the fundamental unit of user interaction, and bit pattern (and its associated number) is the fundamental unit of machine interaction.

Can a bit pattern specify its own encoding? Not without a prearranged convention. The situation is equivalent to having ciphertext floating around which happens to have many well-known and possibly related keys (codepages). Some valid questions are, how many encodings are there? How can you specify an encoding?
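In Java, at least, both questions have a practical answer: encodings are identified by name, and the runtime can list the ones it knows. The sketch below assumes that is a fair stand-in for the general question; the class and method names are invented.

```java
import java.nio.charset.Charset;

// Hypothetical example class.
final class EncodingSketch {

    // How many encodings this particular Java runtime knows about.
    static int countAvailable() {
        return Charset.availableCharsets().size();
    }

    // An encoding is specified by name; aliases resolve to one canonical name.
    static Charset byName(String name) {
        return Charset.forName(name);
    }

    public static void main(String[] args) {
        System.out.println("Encodings available here: " + countAvailable());
        System.out.println("latin1 is really: " + byName("latin1").name());
        System.out.println("Default encoding on this machine: " + Charset.defaultCharset().name());
    }
}
```

The count varies from one runtime to the next, which is itself part of the problem: the name is the prearranged convention, and both sides must understand it.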

One way to play with these concepts is to write a program that allows the user to specify a bit pattern, and then view it under different interpretations. You must decide what behavior to take if the interpretation requires a fixed length pattern which doesn't match the length of the pattern. Truncate? Pad with zeroes? For simple string representation of number, try base n. For a character interpretation, try a different code page, including Unicode. For a date representation, try different locales.

Numeric mapping (bit order)
-Character printing (code page)
-Number printing (base)
-Date printing (locale)

Note that for a given pattern, you must choose the bit order and do the mapping first. Then you can do all three. Generally, the three interpretations are not interdependent. That is, to go from one to the other you go through the numeric mapping as an intermediary.

One more thing: Number and date printing generally result in multiple characters. That is, a single number is mapped into several numbers which are in turn interpreted as characters.
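Here is one possible sketch of such a program. It makes several assumptions not fixed by the text above: patterns are padded with leading zeroes to whole bytes, the date interpretation treats the number as milliseconds since 1970 in UTC, and all class and method names are made up.

```java
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example class.
final class BitPatternViewer {

    // Numeric mapping: choose the bit order first, then read the pattern as a number.
    static BigInteger toNumber(String bits, boolean mostSignificantFirst) {
        String ordered = mostSignificantFirst
                ? bits
                : new StringBuilder(bits).reverse().toString();
        return new BigInteger(ordered, 2);
    }

    // Number printing: one number becomes several digit characters in the given base.
    static String asNumber(BigInteger n, int base) {
        return n.toString(base);
    }

    // Character printing: pad to whole bytes (an assumption), then apply a code page.
    static String asText(String bits, Charset codePage) {
        int pad = (8 - bits.length() % 8) % 8;
        StringBuilder padded = new StringBuilder();
        for (int i = 0; i < pad; i++) {
            padded.append('0');
        }
        padded.append(bits);
        byte[] bytes = new byte[padded.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(padded.substring(i * 8, i * 8 + 8), 2);
        }
        return new String(bytes, codePage);
    }

    // Date printing: the number is taken as milliseconds since 1970 (an assumption);
    // very large numbers are truncated to 64 bits by longValue().
    static String asDate(BigInteger n, Locale locale) {
        DateFormat f = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.LONG, locale);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(new Date(n.longValue()));
    }

    public static void main(String[] args) {
        String bits = args.length > 0 ? args[0] : "0100100001101001";
        BigInteger n = toNumber(bits, true);
        for (int base : new int[] {2, 8, 10, 16}) {
            System.out.println("base " + base + ": " + asNumber(n, base));
        }
        System.out.println("US-ASCII: " + asText(bits, StandardCharsets.US_ASCII));
        System.out.println("UTF-16BE: " + asText(bits, StandardCharsets.UTF_16BE));
        System.out.println("date (US):     " + asDate(n, Locale.US));
        System.out.println("date (France): " + asDate(n, Locale.FRANCE));
    }
}
```

With the default pattern, the ASCII view reads "Hi" and the base 16 view reads 4869, while the date views show a moment shortly after midnight on 1 January 1970, spelled differently per locale. Note how every view starts from the pattern or the number, never from another view.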

One can see the interest in Unicode. Now a process moving through a long bit pattern need not track which is the active codepage. There is only one codepage, a union of all the others. There is only one context to maintain. This is a good thing.

(Ironically, there is even more mapping between keyboard and memory: the physical position to an electronic signal to a number to another number...)