06.12.2017

The Advent of Void: Day 6: jq

Day 6! Today we want to bring some pseudo-modern web technology into the game. jq(1) is a command-line tool for querying and manipulating JSON documents, and it works well in shell scripts.

Let’s get started with this simple JSON document:

{
 "title" : "The Advent of Void: Day 4: containers",
 "pubDate" : "2017-12-04 00:00:00",
 "link" : "http://voidlinux.eu/news/2017/12/advent-containers.html",
 "guid" : "http://voidlinux.eu/news/2017/12/advent-containers"
}

To query the title of this element you can use the following:

# jq '.title' file.json
"The Advent of Void: Day 4: containers"

Now, this is still formatted as a JSON string. To get the plain string instead, pass -r on the command line:

# jq -r '.title' file.json
The Advent of Void: Day 4: containers

To get both the title and the pubDate, list both filters separated by a comma:

# jq -r '.title, .pubDate' file.json
The Advent of Void: Day 4: containers
2017-12-04 00:00:00
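
Since jq is meant to be used from shell scripts, here is a minimal sketch of capturing these values in shell variables. The temporary directory and the variable names title and pubdate are assumptions made for illustration; the JSON is the document shown above.

```shell
# Assumption: a scratch copy of the single-article document from above.
tmp=$(mktemp -d)
cat > "$tmp/file.json" <<'EOF'
{
 "title" : "The Advent of Void: Day 4: containers",
 "pubDate" : "2017-12-04 00:00:00",
 "link" : "http://voidlinux.eu/news/2017/12/advent-containers.html",
 "guid" : "http://voidlinux.eu/news/2017/12/advent-containers"
}
EOF

# -r strips the JSON quoting, so the values arrive as plain shell strings.
title=$(jq -r '.title' "$tmp/file.json")
pubdate=$(jq -r '.pubDate' "$tmp/file.json")

printf '%s (published %s)\n' "$title" "$pubdate"
```

Without -r the variables would contain the surrounding double quotes, which is rarely what a script wants.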

That’s quite helpful. But the real power of jq comes to light once we work with JSON arrays. Let’s look at a slightly more complex example:

[
  {
     "guid" : "http://voidlinux.eu/news/2017/12/advent-containers",
     "pubDate" : "2017-12-04 00:00:00",
     "link" : "http://voidlinux.eu/news/2017/12/advent-containers.html",
     "title" : "The Advent of Void: Day 4: containers"
  },
  {
     "link" : "http://voidlinux.eu/news/2017/12/advent-ministat.html",
     "title" : "The Advent of Void: Day 3: ministat",
     "guid" : "http://voidlinux.eu/news/2017/12/advent-ministat",
     "pubDate" : "2017-12-03 00:00:00"
  },
  {
     "link" : "http://voidlinux.eu/news/2017/12/advent-taskwarrior.html",
     "title" : "The Advent of Void: Day 2: taskwarrior and friends",
     "pubDate" : "2017-12-02 00:00:00",
     "guid" : "http://voidlinux.eu/news/2017/12/advent-taskwarrior"
  },
  {
     "link" : "http://voidlinux.eu/news/2017/12/advent-gcal.html",
     "title" : "The Advent of Void: Day 1: gcal",
     "guid" : "http://voidlinux.eu/news/2017/12/advent-gcal",
     "pubDate" : "2017-12-01 00:00:00"
  }
]

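Assume we want a list of all titles and links from the document in a CSV-like format. Our first step is to extract both fields from every article. The .[] filter iterates over the elements of the array, and the parenthesized list emits both fields for each one: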

# jq -r '.[] | ( .title, .link )' file.json
The Advent of Void: Day 4: containers
http://voidlinux.eu/news/2017/12/advent-containers.html
The Advent of Void: Day 3: ministat
http://voidlinux.eu/news/2017/12/advent-ministat.html
The Advent of Void: Day 2: taskwarrior and friends
http://voidlinux.eu/news/2017/12/advent-taskwarrior.html
The Advent of Void: Day 1: gcal
http://voidlinux.eu/news/2017/12/advent-gcal.html
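
To make the role of .[] a little clearer, the following sketch contrasts it with indexing a single element and with counting the elements. It assumes a trimmed copy of the array above that keeps only the title and link fields; the temporary directory and variable names are illustrative.

```shell
# Assumption: a trimmed copy of the article array (title and link only).
tmp=$(mktemp -d)
cat > "$tmp/file.json" <<'EOF'
[
  { "title": "The Advent of Void: Day 4: containers",
    "link": "http://voidlinux.eu/news/2017/12/advent-containers.html" },
  { "title": "The Advent of Void: Day 3: ministat",
    "link": "http://voidlinux.eu/news/2017/12/advent-ministat.html" },
  { "title": "The Advent of Void: Day 2: taskwarrior and friends",
    "link": "http://voidlinux.eu/news/2017/12/advent-taskwarrior.html" },
  { "title": "The Advent of Void: Day 1: gcal",
    "link": "http://voidlinux.eu/news/2017/12/advent-gcal.html" }
]
EOF

# length counts the elements of the top-level array.
count=$(jq 'length' "$tmp/file.json")

# .[0] picks one element by index; .[] yields every element in turn.
first=$(jq -r '.[0].title' "$tmp/file.json")
titles=$(jq -r '.[].title' "$tmp/file.json")

printf '%s articles, first: %s\n' "$count" "$first"
printf '%s\n' "$titles"
```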

That’s all that’s needed to query the information. To format the output, we first need to build an array from every article:

# jq -r '.[] | [.title, .link ]' file.json
[
  "The Advent of Void: Day 4: containers",
  "http://voidlinux.eu/news/2017/12/advent-containers.html"
]
[
  "The Advent of Void: Day 3: ministat",
  "http://voidlinux.eu/news/2017/12/advent-ministat.html"
]
[
  "The Advent of Void: Day 2: taskwarrior and friends",
  "http://voidlinux.eu/news/2017/12/advent-taskwarrior.html"
]
[
  "The Advent of Void: Day 1: gcal",
  "http://voidlinux.eu/news/2017/12/advent-gcal.html"
]

Now, as a last step, join each array using the tab character:

# jq -r '.[] | [.title, .link ] | join("\t")' file.json
The Advent of Void: Day 4: containers	http://voidlinux.eu/news/2017/12/advent-containers.html
The Advent of Void: Day 3: ministat	http://voidlinux.eu/news/2017/12/advent-ministat.html
The Advent of Void: Day 2: taskwarrior and friends	http://voidlinux.eu/news/2017/12/advent-taskwarrior.html
The Advent of Void: Day 1: gcal	http://voidlinux.eu/news/2017/12/advent-gcal.html
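
The join above produces tab-separated output. If real CSV with quoting is wanted, jq's built-in @csv format string is one option, and @tsv is the counterpart for tabs that also escapes embedded tabs and newlines. The sketch below again assumes a trimmed copy of the array with only the title and link fields; file and variable names are illustrative.

```shell
# Assumption: a trimmed copy of the article array (title and link only).
tmp=$(mktemp -d)
cat > "$tmp/file.json" <<'EOF'
[
  { "title": "The Advent of Void: Day 4: containers",
    "link": "http://voidlinux.eu/news/2017/12/advent-containers.html" },
  { "title": "The Advent of Void: Day 3: ministat",
    "link": "http://voidlinux.eu/news/2017/12/advent-ministat.html" },
  { "title": "The Advent of Void: Day 2: taskwarrior and friends",
    "link": "http://voidlinux.eu/news/2017/12/advent-taskwarrior.html" },
  { "title": "The Advent of Void: Day 1: gcal",
    "link": "http://voidlinux.eu/news/2017/12/advent-gcal.html" }
]
EOF

# @csv wraps each string in double quotes and separates fields with commas.
csv=$(jq -r '.[] | [.title, .link] | @csv' "$tmp/file.json")

# @tsv separates fields with tabs, like join("\t"), but escapes special characters.
tsv=$(jq -r '.[] | [.title, .link] | @tsv' "$tmp/file.json")

printf '%s\n' "$csv"
printf '%s\n' "$tsv"
```

Both format strings expect an array as input, which is why the array construction step stays in the pipeline.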

All in all, jq is quite a powerful tool, and we have only scratched the surface of what you can do with it. For more comprehensive documentation, consult the jq manual, or try jq online.


Copyright 2008-2017 Juan RP and contributors
