EDIT 2022-01-10: There's now an official dump that can be found here. The database is danbooru1.danbooru_public.
My dump is no longer maintained.
Edit by @nonamethanks - 2020-11-21
Since the BigQuery old-style links are dead, the database can now be found at this link.
Queries now need the full table name, so use
danbooru-data.danbooru.posts
instead of the old [danbooru.posts].
See forum #178212 for an example of an updated query. Danbooru now also has its own official BigQuery dump of most data, not just posts; see this commit for the URLs.
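As a rough sketch of the updated form, the first example query further down could be written in standard SQL with the full table name roughly like this (this assumes the posts table still exposes tags as a repeated record with a name field, so check the current schema before relying on it):
SELECT t.name, COUNT(p.id) AS num, SUM(p.file_size)/1000000000 AS GB FROM `danbooru-data.danbooru.posts` AS p, UNNEST(p.tags) AS t GROUP BY t.name ORDER BY GB DESC LIMIT 20;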
With some assistance from albert, I've set up a Google BigQuery data dump of the Danbooru posts table so anyone who cares to can run queries. You can access the dump here. You may see Konachan and Yandere tables there too, but those aren't complete, so I don't recommend bothering with them.
It's updated nightly and should contain basically anything you can get through the API. If you really want to, you can download the whole thing as a dump using these instructions. That should save you from trying to scrape the API or something silly like that.
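If you'd rather export it yourself from within BigQuery, an EXPORT DATA statement along these lines writes the table out to a Cloud Storage bucket you own. This is only a sketch, not the linked instructions: the bucket path is a placeholder, and the table name is the current one from the 2020 edit above.
EXPORT DATA OPTIONS(uri='gs://your-bucket/danbooru-posts-*.json.gz', format='JSON', compression='GZIP') AS SELECT * FROM `danbooru-data.danbooru.posts`;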
The query syntax is very similar to SQL, with a few differences due to fancier datatypes like repeated/nested fields.
An example query:
SELECT tags.name, COUNT(id) AS num, SUM(file_size)/1000000000 AS GB FROM [danbooru.posts] GROUP BY tags.name ORDER BY GB DESC LIMIT 20;
This returns the tags with the largest total file size associated with them. For example, this query will show you that 1girl has just over a terabyte of image data spread across more than 1.6 million images.
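If you only care about a single tag, a minimal variant in the updated standard-SQL form (with the same schema assumption as the sketch above, i.e. tags being a repeated record with a name field) would look something like:
SELECT COUNT(p.id) AS num, SUM(p.file_size)/1000000000 AS GB FROM `danbooru-data.danbooru.posts` AS p, UNNEST(p.tags) AS t WHERE t.name = '1girl';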
SELECT favs, COUNT(id) AS num, SUM(file_size)/1000000000 AS GB FROM [danbooru.posts] GROUP BY favs ORDER BY GB DESC LIMIT 20;
This groups by user favorites instead of tags. You can see that user 19831 has 414,000 favourites, which come to around 380 GB.