Having done similar work in the past, I used the Apache Velocity and POI libraries to write a handy utility. The Velocity library is a templating engine, so I can write a test script, insert some bind variables into it and have Velocity do my variable substitution for me. In addition, Velocity allows you to build logic into the templates, meaning based on our bind variables, the template can be generated in different ways! The second library, Apache POI is used to read in a spreadsheet, this spreadsheet contains all the bind variables.
Combining these two things together, you provide a template and a spreadsheet and the code below produces as many files as there are rows in the spreadsheet, each file having the row cell values populated in the spreadsheet.
Starting off, lets define some member variables in our code. These are used to store the first and last rows numbers in the spreadsheet and first and last columns (cells). The spreadsheet itself and the first row of the spreadsheet are referenced as well as the template being used to generate the data.
The main() method does some basic house keeping like checking whether the files that are passed in as command line arguments exist, and then initialises the class and calls the generate() method.
In the constructor, the spreadsheet is loaded, the first/last rows and columns are worked out, Velocity is initialised and the template is loaded.
The first row of the spreadsheet is stored as a member variable because this effectively contains all of the variable names (or keys) that we are going to be using in the template. The code later on uses these variable names inside the Velocity context to bind the variables to their values.
Velocity uses resource loaders when reading the templates, these need to be configured in the velocity properties file. I put this file into the classpath and load it as a classpath resource.
The contentes of this file are as follows:
In the file above, the "file.resource.loader.path" property has to be set to the working directory where the code is running and where the template file is. If this is not set correctly, Velocity will not be able to load the template.
The generate() method is where the real work gets done. This method loops through all of the rows in the spreadsheet (excluding the first row). For each of the rows a new VelocityContext is created, this is where the variable names from the first row are bound to the variable values from the row that's in the loop.
Once all of the variables are bound to their corresponding values, the template and context are merged, generating the output into the writer that was specified. The writer writes the data to a file.
There is a bit of a hack to do with the text values and numeric values in cells. I found that sometimes Excel doesn't set the cell type correctly. Typically I only work with integer type values that are represented as text (they have leading zeros sometimes), so my hack is to convert all numeric type cells into integers.
The spreadsheet looks like this:
To test this I used just a simple dummy template as follows.
Running the above code produces two files, test1.dat and test2.dat. The content for these is:
Using this code we can now start producing many data files, or even code and scripts. The templates can also have logic built into them to generate slightly different data.