In order to support multiple blogging back-ends, it is necessary that we
work at some level of abstraction. One piece of blog software's notion
of tags isn't necessarily going to line up with another's, etc. So we
introduce the notion of a post.
A post is an alist consisting of the fields:
- :blog (#+POST_BLOG)
  - A string naming an entry in org-blog-alist
- :category (#+POST_CATEGORY)
  - A list of strings naming categories to which the post belongs
- :content (body after export)
  - A string containing HTML-formatted content
- :date (#+DATE)
  - A date and time for the post
- :excerpt (#+DESCRIPTION)
  - A string containing an optional excerpt of the post
- :id (#+POST_ID)
  - A string containing a unique ID (generally numeric) for the post
- :link (#+POST_LINK)
  - A string containing a link to the permanent location of the post
- :name (#+POST_NAME)
  - A string containing the canonical name for the post
- :parent (#+POST_PARENT)
  - A string containing a unique ID (generally numeric) for the parent of the post
- :status (#+POST_STATUS)
  - A string denoting the status (`draft', `published') of the post
- :tags (#+KEYWORDS)
  - A list of strings representing the names of tags
- :title (#+TITLE)
  - A string containing the title of the post
- :type (#+POST_TYPE)
  - A string containing an optional format for the post
It's not absolutely essential that every field be present; parent and
excerpt, for instance, are pretty thoroughly optional. Some fields,
like id and link, are really intended to be filled in by the blogging
software. Whenever it seemed to make sense, I used a standard org-mode
property name, so :date is derived from #+DATE, for instance. Whenever
I "make up" a property name, I keep it in the #+POST_ namespace, to
try to avoid collisions.
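As a rough illustration of the structure, here is what a post might look like written out as a literal alist, with fields read back via assq. The variable name my-example-post and all of its values are made up for this sketch; they are not part of org-blog.

```elisp
;; Hypothetical post alist; every value here is invented for illustration.
(defvar my-example-post
  '((:blog . "myblog")
    (:category "emacs" "org")
    (:excerpt)                          ; optional field left empty
    (:tags "lisp" "blogging")
    (:title . "Hello, world"))
  "A made-up post structure used only in this sketch.")

;; Single-valued fields are dotted pairs, so `cdr' yields the string.
(cdr (assq :title my-example-post))     ; => "Hello, world"
;; List-valued fields: `cdr' yields the list of strings.
(cdr (assq :category my-example-post))  ; => ("emacs" "org")
;; An empty or entirely absent field simply yields nil.
(cdr (assq :excerpt my-example-post))   ; => nil
(cdr (assq :parent my-example-post))    ; => nil
```

Because a missing field and an empty field both read as nil, code consuming a post does not have to care which optional fields were supplied.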
So, given a buffer, how do we get to a post? The answer is: the
org-mode exporter.
Now, the code I'm presenting here works with org-mode < 8.0. I'm
hoping, once I've gotten this initial round of development all worked
out, that I'll be able to convert over to the new export interface in
8.0, which, based on my light reading, should be somewhat nicer to
work with. We'll probably end up with our own org-blog-post export
format that will work in a fairly standard fashion. But that's for
later. For now:
(defun org-blog-buffer-extract-post ()
  "Transform a buffer into a post.
We do as little processing as possible on individual items, to
retain the maximum flexibility for further transformation."
  (save-excursion
    (save-restriction
      ;; Register our custom keywords and set export defaults.
      (let ((org-export-inbuffer-options-extra '(("POST_BLOG" :blog)
                                                 ("POST_CATEGORY" :category)
                                                 ("POST_ID" :id)
                                                 ("POST_LINK" :link)
                                                 ("POST_NAME" :name)
                                                 ("POST_PARENT" :parent)
                                                 ("POST_STATUS" :status)
                                                 ("POST_TYPE" :type)))
            (org-export-date-timestamp-format "%Y%m%dT%T%z")
            (org-export-with-preserve-breaks nil)
            (org-export-with-priority nil)
            (org-export-with-section-numbers nil)
            (org-export-with-sub-superscripts nil)
            (org-export-with-tags nil)
            (org-export-with-toc nil)
            (org-export-with-todo-keywords nil))
        ;; Build the alist, then sort it by field name.
        (sort
         (list (cons :blog (property-trim :blog))
               (cons :category (property-split :category))
               (cons :date (let ((timestamp (property-trim :date)))
                             (when timestamp
                               (list (date-to-time timestamp)))))
               (cons :excerpt (property-trim :description))
               (cons :id (property-trim :id))
               (cons :link (property-trim :link))
               (cons :name (property-trim :name))
               (cons :parent (property-trim :parent))
               (cons :status (property-trim :status))
               (cons :tags (property-split :keywords))
               (cons :title (property-trim :title))
               (cons :type (property-trim :type))
               (cons :content (org-no-properties
                               (condition-case nil
                                   (org-export-as-html nil nil nil 'string t nil)
                                 ;; Fall back to the shorter argument list if
                                 ;; this org version rejects the first call.
                                 (wrong-number-of-arguments
                                  (org-export-as-html nil nil 'string t nil))))))
         (lambda (a b)
           (string< (car a) (car b))))))))
org-blog-buffer-extract-post starts off with what may actually be a
bit of superfluous code: I know that org-export-as-html calls
save-excursion, so it might not actually be necessary for us to do it.
But I'd rather be safe. The same is true for the save-restriction.
We then make sure that the exporter will pick up our custom properties
by adding them to org-export-inbuffer-options-extra, and we set a
number of options that control what the export will end up including
and how particular items will look. In fact, these should all be
override-able for an individual post by using the #+OPTIONS property;
these are just the defaults that I think are sane.
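The let-binding trick used for these defaults can be sketched in isolation. Everything below is hypothetical: my-demo-with-toc stands in for an exporter option such as org-export-with-toc, and my-demo-export stands in for the exporter itself. The point is only that a variable declared with defvar is dynamically bound, so a function called inside the let sees the temporary value.

```elisp
;; Hypothetical stand-in for an exporter option; `defvar' makes it
;; dynamically bound, so `let' rebinds it for everything called inside.
(defvar my-demo-with-toc t
  "Non-nil means the demo exporter emits a table of contents.")

(defun my-demo-export ()
  "Pretend exporter: report what it would emit under current settings."
  (if my-demo-with-toc "toc+body" "body"))

;; Inside the `let', the callee sees our temporary default ...
(let ((my-demo-with-toc nil))
  (my-demo-export))                     ; => "body"
;; ... and outside it, the global value is untouched.
(my-demo-export)                        ; => "toc+body"
```

This is why the function can set its own defaults without disturbing whatever the user has configured globally.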
Then the magic happens.
If you're not used to a very functional style of programming, this code
may be a little confusing: all the action is really happening down at
the bottom of the function, where org-export-as-html is being called.
In fact, if I'm truthful, I'm vaguely amazed it works at all.
See, in addition to the document transformed into HTML, which
org-export-as-html returns, there is a bunch of meta-data available in
the property list returned by org-infile-export-plist. Our function
property-trim is a wrapper for pulling values out of that list and
removing any leading whitespace:
(defun property-trim (k)
  "Get a property value trimmed of leading spaces."
  (let ((v (plist-get (org-infile-export-plist) k)))
    (when v
      (replace-regexp-in-string "^[[:space:]]+" "" v))))
We run that across most of the property items to get a good value. We
also have a variant, property-split, that will split a value on
commas, returning a list:
(defun property-split (k)
  "Get a property value trimmed of leading spaces and split on commas."
  (let ((v (property-trim k)))
    (when v
      (split-string v "\\( *, *\\)" t))))
This is used in possibly multi-valued fields, as for tags or categories.
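To see what the two helpers do to a raw value without involving the exporter at all, the same regular expressions can be tried on plain strings. The sample strings are made up; the calls are the ones used in property-trim and property-split.

```elisp
;; The trimming step: strip leading whitespace from a raw value.
(replace-regexp-in-string "^[[:space:]]+" "" "  t1b")
;; => "t1b"

;; Only leading whitespace is removed; trailing whitespace survives.
(replace-regexp-in-string "^[[:space:]]+" "" "  t1b ")
;; => "t1b "

;; The splitting step: commas with optional surrounding spaces separate
;; the items, and the final t drops empty strings.
(split-string "t1c1, t1c2" "\\( *, *\\)" t)
;; => ("t1c1" "t1c2")
(split-string "a,,b" "\\( *, *\\)" t)
;; => ("a" "b")
```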
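If you look closely, you can see org-export-as-html getting run in
order to provide the value for the :content field. But looking at the
code again (and this is some of the first code I wrote), I don't know
how that is guaranteed to happen before everything else starts looking
at the property list items. If anything, the arguments to list are
evaluated left to right, so the export runs last. My best guess is
that it works anyway because property-trim calls
org-infile-export-plist itself, which appears to read the keywords
straight from the buffer rather than relying on the export.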
Perhaps it will all become clearer (and less side-effect-y) with the new exporter.
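The ordering question can at least be probed in isolation. The sketch below uses a hypothetical helper, my-demo-evaluation-order, that records the order in which list evaluates its arguments; it says nothing about the exporter itself, only about how Emacs Lisp evaluates a function call.

```elisp
(defun my-demo-evaluation-order ()
  "Return the order in which `list' evaluated its two arguments."
  (let ((log '()))
    ;; Each argument records its own evaluation before yielding a value.
    (list (progn (push 'first log) 1)
          (progn (push 'second log) 2))
    ;; `push' prepends, so reverse to get chronological order.
    (nreverse log)))

(my-demo-evaluation-order)              ; => (first second)
```

Since arguments are evaluated left to right, the :content entry, being last in the list, would be computed after all the property lookups.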
Anyway, time to write a test or two. We'll begin by extracting a post structure from an empty buffer:
(ert-deftest ob-test-extract-from-empty ()
  "Try extracting a post from an empty buffer."
  (with-temp-buffer
    (should (equal (org-blog-buffer-extract-post) '((:blog)
                                                    (:category)
                                                    (:content . "\n")
                                                    (:date)
                                                    (:excerpt)
                                                    (:id)
                                                    (:link)
                                                    (:name)
                                                    (:parent)
                                                    (:status)
                                                    (:tags)
                                                    (:title)
                                                    (:type))))))
As we would expect, we end up with an alist that is basically devoid of values, except for the content, which is pretty darn bare. In fact, down the road, we will probably do some more massaging of content that will change even that, but we test against what we have now.
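Two small details explain the shape of that expected value: an entry whose value is nil prints as a one-element list, and the entries come out in alphabetical order because of the final sort. Both can be checked on their own; the sample entries below are made up.

```elisp
;; A cons whose cdr is nil is equal to a one-element list.
(equal (cons :blog nil) '(:blog))       ; => t

;; Sorting by key: `string<' accepts symbols and compares their names.
(sort (list (cons :title "T") (cons :blog nil) (cons :id "1"))
      (lambda (a b) (string< (car a) (car b))))
;; => ((:blog) (:id . "1") (:title . "T"))
```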
Then we build a test that actually extracts some content:
(ert-deftest ob-test-extract-from-buffer ()
  "Try extracting a post from a buffer with properties and content."
  (with-temp-buffer
    (insert "\
#+POST_BLOG: t1b
#+POST_CATEGORY: t1c1, t1c2
#+DATE: [2013-01-25 Fri 00:00]
#+DESCRIPTION: t1e
#+POST_ID: 1
#+POST_LINK: http://example.com/
#+POST_NAME: t1n
#+KEYWORDS: t1k1, t1k2, t1k3
#+TITLE: Test 1 Title
#+POST_TYPE: post
Just a little bit of content.")
    (should (equal (org-blog-buffer-extract-post) '((:blog . "t1b")
                                                    (:category "t1c1" "t1c2")
                                                    (:content . "\n\n<p>Just a little bit of content.\n</p>")
                                                    (:date (20738 4432))
                                                    (:excerpt . "t1e")
                                                    (:id . "1")
                                                    (:link . "http://example.com/")
                                                    (:name . "t1n")
                                                    (:parent)
                                                    (:status)
                                                    (:tags "t1k1" "t1k2" "t1k3")
                                                    (:title . "Test 1 Title")
                                                    (:type . "post"))))))
And that's it for now. Next time we'll look at the process of merging a post structure back into a buffer. Once we have our two-way transformation capability, the world is our mollusk. Well, once we have that and a little XML-RPC code.