Archive files using Ant from Groovy

September 12, 2011

In Ant The zip task takes a fileset but only creates one archive. How do you zip files into individual archives?

Use case
One use case for this is log files. Sooner rather then later, every system must deal with the growth of log files. One of the best utilities for handling this is logrotate. This is available on *nix systems. There are probably native versions that do similar tasks in Windows, but I did not locate any ‘free as in beer’ ones.

Approach
Since I’m working with Ant, on Windoze, and can’t install cygwin or other system, will use Groovy to get around Ant limitations. This will not be an attempt to recreate logrotate, of course.

As in a previous post, we can use the Groovy AntBuilder and iterate a fileset to repeatably call an Ant task. In this case we’ll be invoking the zip task. Note that the same approach could be used to create a scriptdef to do this directly in an Ant build script. This would be more usable since we could declaratively define the fileset, etc.

Example

First the configuration. In listing 1, we use a JSON configuration file. No reason, I’m just tired of Property files.

Listing 1, JSON configuration file

{ "about":
    {
    	"project":"archivelogs",
    	"author":"jbetancourt",
    	"date":"20110909T2156"    	
    },
  "config":
	{
		"basedir":"test/logs",
		"archivedir":"test/logs/archive",
		"includes":"*.log",
		"propertyName":"found.files",
		"when":"before",
		"update":"true",
		"days":"1",
		"level":"9"
	}
}

In listing 2, the main entry point calls the config loader and then the ‘archiveLogs’ method.

The fileset created with the AntBuilder uses an Ant ‘date’ selector to find the files that are x amount of days old. Then the resulting fileset is scanned in a loop, invoking the zip task as we go and then deleting the just zipped file. It is assumed here that the target file name has descriptive information, like a date. And, we are not adding a “rotate” counter to the name for future logrotate like behavior.

Listing 2, Groovy archive source

package ant

import org.apache.tools.ant.*
import java.text.*
import groovy.json.JsonSlurper

/**
 * @author jbetancourt
 */
class ArchiveLogs {
	def format = "yyMMdd'T'kkmm"
	def df = new SimpleDateFormat(format)

	/** entry point */
	static main(args) {
		def als = new ArchiveLogs()	
		
		def p = (
		  (args) ?
		  args[0] : "test/conf/archivelogs.json")
				
		def map = als.loadConfiguration(p)
		als.archiveLogs(map)
	}

	/**  */
	def archiveLogs(conf) {
		def base = conf.base
		def targetDateString = df.format(
			(new Date() - Integer.parseInt(conf.days))
		)
		
		def ant = new AntBuilder()		
		ant.fileset( 
			dir:conf.basedir, 
			includes:conf.includes, 
			id:conf.propertyName){			
				date( 
					datetime:targetDateString,
					when:conf.when,
					pattern:format)
		}	
		
		def ref = ant.project.getReference(
			 conf.propertyName)
		
		ref.each{ fileResource ->
			File file = fileResource.getFile()
			def name = file.getName()
			
			ant.zip(
				basedir:conf.basedir,
				destfile:
				 "${conf.archivedir}/${name}.zip",
				update:conf.update,
				level:conf.level,
				includes:name
			)
			
			ant.delete(verbose:"true",file:file.getPath())			
		}		
		
	}	
	
	/** load json file */
	def loadConfiguration(String path){
		def slurper = new JsonSlurper()
		def obj = slurper.parse(
			 new File(path).newReader())
		
		if(obj.config.debug == 'true'){
			obj.config.each{
				println it
			}		
		}
		
		return obj.config
		
	}
}

Caveats
This is, of course, a simplistic approach. For example, what if the file is being used when we attempt to delete it? Or we fail to zip the file, but continue to delete it? And, this source, being an example, does not have any error handling.

Alternatives
The ant-contribs project has looping support, foreach. Directly using the Java util packages for archiving, such as java.util.zip. Or, the libraries found in, for example, the Apache commons project.

Further Reading


“cafe”, egberto gismonti


JSON configuration file format

May 8, 2011

JSON is a data interchange format. Should it also be used as a configuration file format, a JSON-CF?

Overview

Had to write yet another properties file for configuration info. Started to think that maybe there are better alternatives. Wondered about JSON for this.

Requirements

What are requirements of a configuration file format?

  • Simple
  • Human readable
  • Cross platform
  • Multi-language support
  • Unicode support

Looks like JSON has all the right qualities.

If all you want to pass around are atomic values or lists or hashes of atomic values, JSON has many of the advantages of XML: it’s straightforwardly usable over the Internet, supports a wide variety of applications, it’s easy to write programs to process JSON, it has few optional features, it’s human-legible and reasonably clear, its design is formal and concise, JSON documents are easy to create, and it uses Unicode.
— Norman Walsh, Deprecating XML

JSON-CF Limitations

  • Instead of angle brackets as in XML, we have quotation marks everywhere.

What does it need?

  • Inline comments, see for example, json-comments
  • Interpolation (property expansion)
  • Namespaces
  • Inheritance
  • Includes
  • Date value
  • Schema
  • Cascading

Example

{
    "_HEADER":{
            "modified":"1 April 2001",
            "dc:author": "John Doe"
    },
    "logger_root":{
            "level":"NOTSET",
            "handlers":"hand01"
     },
    "logger_parser":{
            "level":"DEBUG",
            "handlers":"hand01",
            "propagate":"1",
            "qualname":"compiler.parser"
    },
    "owner":{
             "name":"John Doe",
             "organization":"Acme Widgets Inc."
     },
     "database":{
             "server":"192.0.2.62",     
             "_comment_server":"use IP address in case network name resolution is not working",
             "port":"143",
             "file":"payroll.dat"
      }
}

Programmatic Access using Groovy

Now we can easily read this file in Java. Using Groovy is much easier, of course. Groovy version 1.8 has built-in JSON support, great blog post on this here.

import groovy.json.*;

def result = new JsonSlurper().
                          parseText(new File("config.json").text)

result.each{ section ->
	println "$section\n"
}

>groovy readConfig.groovy
Resulting in Groovy style data structure, GRON, (look ma, no quotation marks):

logger_parser={qualname=compiler.parser, level=DEBUG, propagate=1, handlers=hand01}

owner={organization=Acme Widgets Inc., name=John Doe}

_HEADER={dc:author=John Doe, modified=1 April 2001}

database={port=143, file=payroll.dat, server=192.0.2.62, _comment_server=use IP address in case network name resolution is not working}

logger_root={level=NOTSET, handlers=hand01}

In Groovy you can access the data with GPath expressions:
println “server: ” + result.database.server

You can also pretty print JSON, for example:
println JsonOutput.prettyPrint(new File(“config.json”).text)

Summary

Raised the question of the use of JSON as a configuration file format.

What I don’t like is the excess quotation marks. YAML is more attractive in this sense. But, the indentation as structure in YAML, similar to Python, may not be wise in configuration files.

Well, what is the answer, should there be a JSON-CF? I don’t know. A very perceptive caution is given by Dare Obasanjo commenting on use of new techniques in general:

So next time you’re evaluating a technology that is being much hyped by the web development blogosphere, take a look to see whether the fundamental assumptions that led to the creation of the technology actually generalize to your use case.

Updates

  • After writing and posting this I searched the topic and, of course, this is not a new question. Search. I updated the reading list below with some useful links.
  • JSON Activity Streams are an example of how JSON is used in new ways.
  • schema.org types and properties as RDFS in the JSON format: schema.rdfs.org/all.json
  • Just learned that node.js uses the NPM package manager which uses a JSON config file format.
  • Jan 7, 2012: Java JSR 353: Java API for JSON Processing

Further Reading
- A very simple data file metaformat
- JSON configuration file format
- GRON
- Cascading Configuration Pattern
- NPM configuration file format
- XML or YAML for configuration files
- Using JSON for Language-independent Configuration Files
- INI file
- JSON+Comments
- Comparison of data serialization formats
- Data File Formats, in Art of Unix Programming, Eric Steven Raymond.
- ConfigParser – Work with configuration files
- JSON
- java.util.Properties
- RFC 4627
- XML-SW,a skunkworks project by Tim Bray. Brings a bunch of XML complex together into one spec.
- YAML
- ISO 8601
- Learning from our Mistakes: The Failure of OpenID, AtomPub and XML on the Web
- Groovy 1.8 Introduces Groovy to JSON
- JSON-LD
- JSON Schema

 


” Sacred Place ” , R.Towner / P. Fresu, live in Innsbruck , Part 4


Groovy Object Notation (GrON) for Data Interchange

May 17, 2010

Summary

Foregoing the use of JSON as a data interchange when Groovy language applications must interact internally or with other Groovy applications would be, well, groovy.

Introduction

JavaScript Object Notation (JSON) is considered a language-independent data interchange format. However, since it is based on a subset of the JavaScript (ECMA-262 3rd Edition) language, this is not fully correct. In fact, many languages must marshal to and from JSON using external libraries or extensions. This complicates applications since they must rely on more subsystems and there may be a performance penalty to parse or generate an external object notation.

If an application must only interact within a specific language or environment, such as the Java Virtual Machine (JVM), perhaps using the host language’s data structures and syntax will be a simpler approach. Since Groovy (a compiled dynamic language) has built-in script evaluation capabilities, high-level builders (for Domain Specific Language (DSL) creation) , and meta-programming capabilities, it should be possible to parse, create, transmit, or store data structures using the native Groovy data interchange format (GDIF), i.e., based on the native Groovy data structures.

Syntax example

Below is an example JSON data payload.

JSON (JavaScript) syntax:

{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}

Below is the same data payload; this time using Groovy syntax. Note that there are not too many differences, the most striking is that maps are created using brackets instead of braces. It looks simpler too.

Groovy syntax:

[menu: [
	id: "file",
	value: "File",
	popup: [
	menuitem : [
	 [ value: "New", onclick: "CreateNewDoc()" ],
	 [ value: "Open", onclick: "OpenDoc()" ],
	 [ value: "Close", onclick: "CloseDoc()" ]
     ]
  ]
]]

Code Example

/**
 * File: GrON.groovy
 * Example class to show use of Groovy data interchange format.
 * This is just to show use of Groovy data structure.
 * Actual use of "evaluate()" can introduce a security risk.
 * @sample
 * @author Josef Betancourt
 * @run    groovy GrON.groovy
 *
 * Code below is sample only and is on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
 * or implied.
 * =================================================
*/
 */
class GrON {
    static def message =
    '''[menu:[id:"file", value:"File",
     popup:[menuitem:[[value:"New", onclick:"CreateNewDoc()"],
     [value:"Open", onclick:"OpenDoc()"], [value:"Close",
     onclick:"CloseDoc()"]]]]]'''

    /** script entry point   */
    static main(args) {
       def gron = new GrON()
       // dynamically create object using a String.
       def payload = gron.slurp(this, message)

        // manually create the same POGO.
        def obj = [menu:
	    [  id: "file",
	       value: "File",
                 popup: [
                   menuitem : [
                   [ value: "New", onclick: "CreateNewDoc()" ],
                   [ value: "Open", onclick: "OpenDoc()" ],
                   [ value: "Close", onclick: "CloseDoc()" ]
          	]
           ]
         ]]

         // they should have the same String representation.
         assert(gron.burp(payload) == obj.toString())
    }

/**
 *
 * @param object context
 * @param data payload
 * @return data object
 */
def slurp(object, data){
	def code = "{->${data}}"  // a closure
	def received = new GroovyShell().evaluate(code)
	if(object){
		received.delegate=object
	}
	return received()
}

/**
 *
 * @param data the payload
 * @return data object
 */
def slurp(data){
     def code = "{->${data}}"
     def received = new GroovyShell().evaluate(code)
     return received()
}

/**
 * @param an object
 * @return it's string rep
 */
def burp(data){
     return data ? data.toString() : ""
}

} // end class GrON

Possible IANA Considerations

MIME media type: application/gron.

Type name: application

Subtype name: gron

Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

Additional information:

Magic number(s): n/a

File extension: gron.

Macintosh file type code(s): TEXT

API

To be determined.

Security

Would GrON be a security hole? Yes, if it is implemented using a simple evaluation of the payload as if it were a script. The example shown above used evaluate() as an example of ingestion of a data object. For real world use, some kind of parser and generator for object graphs would be needed. The advantage would accrue if the underlying language parser could be reused for this.

Now this begs the question, if Groovy must now support a data parser, why not just use JSON with the existing libraries, like JSON-lib?

Is using the Java security system an alternative as one commenter mentioned?

Notes

The idea for GrON was formulated about a year ago. Delayed posting it since I wanted to create direct support for it. However, the task required more time and expertise then I have available at this time.

I was debating what to call it, if anything. Another name I considered was Groovy Data Interchange Format (GDIF), but I decided to follow the JSON name format by just changing the “J” to “G” and the “S” to “r” (emphasizing that Groovy is more then a Scripting language, its an extension of Java).

Updates

10Sept2011: See also this post: “JSON Configuration file format“.

9Feb2011: Looks like Groovy will get built in support for JSON: GEP 7 – JSON Support

I found (May 18, 2010, 11:53 PM) that I’m not the first to suggest this approach. See Groovy Interchange Format? by DeniseH.

Recently (Oct 3, 2010) found this blog post:
Groovy Object Notation ? GrON?

Mar 22, 2011: Groovy 1.8 will have JSON support built in.

Further Reading


Creative Commons License
Groovy Object Notation (GrON) for Data Interchange
by Josef Betancourt is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.
Based on a work at wp.me.


Follow

Get every new post delivered to your Inbox.