Basti's Scratchpad on the Internet

Posts tagged "org-mode":

02 Apr 2018

Scheduling Future TODOs in org-journal

I keep a simple journal in org-journal: one text file per day, in org-mode. But over the years, org-journal has grown somewhat beyond this simple use case. About three years ago, a gentleman named Vladimir Kazanov implemented a very fast text search. Thus, my journal became an information archive. About two years ago, org-journal learned to carry over TODO items to the current day if you hadn't completed them on the previous day. So it became a to-do list. And today, org-journal gained the ability to work with future journal entries, thus becoming a calendar.

Despite all of these features, however, org-journal remains one org file per day, with fancy functions to do fancy things if you want them, or to ignore if the journal is all you need.

Back to scheduling: This work was prompted by my colleague, who organizes everything in org-mode, but is not a user of org-journal. He even eschews the use of a traditional calendar, and instead uses a few org files and the magic of org-agenda to give him a nice overview like this for the coming week[1]:

Week-agenda (W14):
Monday     2 April 2018 W14
  2018-04-02: Easter Monday 
Tuesday    3 April 2018
Wednesday  4 April 2018
Thursday   5 April 2018
  2018-04-05: Scheduled:  Give Lecture 4 on Applied Programming             :BB:
Friday     6 April 2018
  2018-04-06: Scheduled:  Release of new Eels record
Saturday   7 April 2018
Sunday     8 April 2018
  2018-04-08: Scheduled:  TODO Celebrate Sunday

And lo and behold, this now works in org-journal as well! Just create a new journal entry in the future, either by pressing i j in M-x calendar or by calling org-journal-new-scheduled-entry, and org-journal will create a TODO entry with a SCHEDULED property of the appropriate date (call it with a prefix argument to suppress the TODO keyword). When the current day reaches that entry, org-journal will incorporate it into the daily journal.

Future journal entries are highlighted in M-x calendar, and you can get an overview of them with org-journal-schedule-view, or, if you enable org-journal-enable-agenda-integration, through the ordinary org-agenda, as shown above. The agenda integration does not include past journal entries in the agenda, since agenda searches tend to become very slow if they have to traverse the hundreds of files in my journal.
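As a minimal configuration sketch, turning on the agenda integration mentioned above comes down to a single variable. This assumes org-journal is already installed and on the load-path:

```elisp
;; Assumption: org-journal is installed and can be loaded.
(require 'org-journal)

;; Let the ordinary org-agenda see current and future journal
;; entries (past entries stay out of the agenda, as noted above).
(setq org-journal-enable-agenda-integration t)
```

Scheduled entries should then show up in the regular M-x org-agenda week view.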

[1]: This is of course not his calendar, but mine.

Tags: org-journal org-mode

Speeding up org-static-blog

Three years ago, I had enough of all the static site generators out there. Over the life of this blog, I had used Octopress, then Pelican, then Coleslaw, then org-mode, and then wrote my own static site generator, org-static-blog. Above all, org-static-blog is simple. It iterates over all *.org files in org-static-blog-posts-directory, and then exports all of these files to HTML. Simple is good. Simple is reliable. Simple means I can fix things.

However, simple can also mean inefficient. Most glaringly, org-static-blog exports every single blog post three times every time you publish: Once to render the HTML, then once to render the RSS feed, then once to render the Index and Archive pages.

Today, I finally tackled this problem: Now, org-static-blog only exports each post once, when the *.org file changes. The RSS feed, the Index page, and the Archive page simply read the already-rendered HTML instead of exporting again.

Thus, a full rebuild of this blog and all of its 85 posts used to take 2:12 min, and now takes 42 s. More importantly, if only one org file changed, the rebuild used to take 1:08 min, and now takes 1.5 s. Things like this are hugely satisfying to me!
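The "export only when the *.org file changed" idea can be sketched as a plain modification-time comparison. The helper name and the file paths below are made up for illustration; this is not org-static-blog's actual code:

```elisp
;; Hypothetical helper, not taken from org-static-blog's source.
(defun my/post-needs-export-p (org-file html-file)
  "Return non-nil if ORG-FILE must be exported to HTML-FILE again.
That is the case when HTML-FILE does not exist yet, or when
ORG-FILE was modified after HTML-FILE was written."
  (file-newer-than-file-p org-file html-file))

;; Usage with made-up paths:
;; (my/post-needs-export-p "posts/hello.org" "html/hello.html")
```

The built-in file-newer-than-file-p already treats a missing second file as "needs export", so no separate existence check is needed.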

Tags: org-mode emacs blog

Org Mode Selective Section Numbering

This is the third revision of a post about selective headline numbering in Org mode. On its own, Org mode can either number all headlines, or none. For scientific writing, this is a non-starter. In a scientific paper, the abstract should not be numbered, the main body should be numbered, and appendices should not be numbered.

In LaTeX, this is easy to do: \section{} creates a numbered headline, while \section*{} creates an unnumbered section. Org mode does not have any facility to control this on a per-headline basis, but it can be taught:

(defun headline-numbering-filter (data backend info)
  "No numbering in headlines that have a property :numbers: no"
  (let* ((beg (next-property-change 0 data))
         (headline (if beg (get-text-property beg :parent data))))
    (if (and (eq backend 'latex)
             (string= (org-element-property :NUMBERS headline) "no"))
        (replace-regexp-in-string
         "\\(part\\|chapter\\|\\(?:sub\\)*section\\|\\(?:sub\\)?paragraph\\)"
         "\\1*" data nil nil 1)
      data)))

(setq org-export-filter-headline-functions '(headline-numbering-filter))

This creates a filter (an Org mode convention similar to a hook), which appends the asterisk to LaTeX headlines if the headline has a property :NUMBERS: no. If all you do is export to LaTeX, this works well.
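For illustration, here is how a headline might be marked as unnumbered, together with a gentler way of registering the filter. Using add-to-list instead of setq is my assumption about what most configurations want, since setq replaces any headline filters that were already installed:

```elisp
;; Org source (shown here as a comment) for an unnumbered headline:
;;
;;   * Abstract
;;     :PROPERTIES:
;;     :NUMBERS: no
;;     :END:
;;
;; Register the filter without discarding other headline filters.
;; Assumes headline-numbering-filter is defined as shown above.
(require 'ox)
(add-to-list 'org-export-filter-headline-functions
             #'headline-numbering-filter)
```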

If you need to export to HTML as well, things get more complicated. Since HTML does not have native numbering support, Org is forced to manually create section numbers. But times have changed, and with CSS3, HTML now indeed does support native numbering!

Here is some CSS that uses CSS3 counters to number all headlines and hide Org's numbers:

/* hide Org-mode's section numbers */
span.section-number-2 { display: none; }
span.section-number-3 { display: none; }
span.section-number-4 { display: none; }
span.section-number-5 { display: none; }
span.section-number-6 { display: none; }

/* define counters for the different headline levels */
h1 { counter-reset: section; }
h2 { counter-reset: subsection; }
h3 { counter-reset: subsubsection; }
h4 { counter-reset: paragraph; }
h5 { counter-reset: subparagraph; }

/* prepend section numbers before headlines */
h2::before {
    content: counter(section) " ";
    counter-increment: section;
}
h3::before {
    content: counter(section) "." counter(subsection) " ";
    counter-increment: subsection;
}
h4::before {
    content: counter(section) "." counter(subsection) "." counter(subsubsection) " ";
    counter-increment: subsubsection;
}
h5::before {
    content: counter(section) "." counter(subsection) "." counter(subsubsection) "." counter(paragraph) " ";
    counter-increment: paragraph;
}
h6::before {
    content: counter(section) "." counter(subsection) "." counter(subsubsection) "." counter(paragraph) "." counter(subparagraph) " ";
    counter-increment: subparagraph;
}

/* suppress numbering for headlines with class="nonumber" */
.nonumber::before { content: none; }

With this in place, we can extend the previous filter to work for HTML as well as LaTeX:

(defun headline-numbering-filter (data backend info)
  "No numbering in headlines that have a property :numbers: no"
  (let* ((beg (next-property-change 0 data))
         (headline (if beg (get-text-property beg :parent data))))
    (if (string= (org-element-property :NUMBERS headline) "no")
        (cond ((eq backend 'latex)
               (replace-regexp-in-string
                "\\(part\\|chapter\\|\\(?:sub\\)*section\\|\\(?:sub\\)?paragraph\\)"
                "\\1*" data nil nil 1))
              ((eq backend 'html)
               (replace-regexp-in-string
                "\\(<h[1-6]\\)\\([^>]*>\\)"
                "\\1 class=\"nonumber\"\\2" data nil nil)))
      data)))

(setq org-export-filter-headline-functions '(headline-numbering-filter))

Previously, I implemented this in Org mode only (no CSS). While that worked as well, it required the modification of some fairly low-level Org functions. The CSS-based solution is much simpler, and should be much easier to maintain and adapt.

Tags: org-mode emacs

Writing a Thesis in Org Mode

Most of my peers write all their scientific documents in LaTeX. Being a true believer in the power of Emacs, I opted for writing my master's thesis in Org Mode instead. Here are my thoughts on this process and how it compares to the usual LaTeX workflow.

In my area of study, a thesis is a document of about 60 pages that contains numerous figures, math, citations, and the occasional table or source code snippet. Figures are usually graphs that are generated in some programming environment, and creating those graphs is a substantial part of writing the thesis.

Org mode was a huge help in this regard, since it combines the document text and the executable pieces of code. Instead of having a bunch of scripts that generate graphs, and a bunch of LaTeX files that include those graphs, I had one big Org file that included both the thesis text and the graphing code.

As for the thesis text, I used Org's export functionality to convert the Org source to LaTeX, and compiled a PDF from there. This really works very well: It is very nice to use Org headlines instead of \section{...}, and clickable Org links instead of \ref{...}. While this is nice, it is just a change of syntax. I still had to enter the very same things, and saving a few characters is not particularly impressive. For example, figures still require a caption, an ID, and a size:

#+CAPTION: Modulation tracks of a clarinet recording with and without white noise. The modulation tracks are not normalized.
#+ATTR_LATEX: :width 6in :height 2.5in :float multicolumn
#+NAME: fig:summary_tracks
[[file:images/summary_tracks.pdf]]

In LaTeX, this would be

\begin{figure*}
\centering
\includegraphics[width=6in,height=2.5in]{images/summary_tracks.pdf}
\caption{\label{fig:summary_tracks}Modulation tracks of a clarinet recording with and without white noise. The modulation tracks are not normalized.}
\end{figure*}

As you can see, there really is not that much of a difference between these two, and you might even consider the LaTeX example more readable. In some other areas, Org mode is simply lacking features: Org does not have any syntax for page formatting, and thus can't create a perfectly formatted title page. Similarly, it can't do un-numbered sections, and it can't do numbered equations. For all of those, I had to fall back to writing LaTeX. This is not a big deal, but it breaks the abstraction.

A bigger problem is that Org documents include all the chapters in one big file. While Org can deal with large files no problem, it means that LaTeX compiles take a while. In LaTeX, I would have split my document into a number of smaller files that could be separately compiled in order to keep compilation time down. This is compounded by Org's default behavior of deleting intermediate LaTeX files, which forces a full triple-recompile on each export. At the end of my thesis, a full export took about 15 seconds. Not a deal-breaker, but annoying.

The one thing where Org really shines, though, is the inclusion of code fragments: Most of my figures were created in Python, and Org mode allowed me to include that Python code right in my document. Hit C-c C-c on any code fragment, and Org runs that code and creates a new image file that is automatically included as a figure. This was really tremendously useful!
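A rough sketch of the setup this relies on: Python has to be enabled for Org Babel before C-c C-c will run it. The source block in the comment, including its file name and the use of matplotlib, is an invented example and not taken from the thesis:

```elisp
;; Enable Python code blocks in Org Babel.
(require 'org)
(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (python . t)))

;; An Org source block along these lines (illustrative only) then
;; runs on C-c C-c and inserts a link to the generated figure:
;;
;;   #+BEGIN_SRC python :results file :exports results
;;   import matplotlib
;;   matplotlib.use("Agg")
;;   import matplotlib.pyplot as plt
;;   plt.plot([1, 2, 3])
;;   plt.savefig("example.pdf")
;;   return "example.pdf"
;;   #+END_SRC
```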

At the end of the day, I am not sure whether Org mode is the right tool for writing a thesis. It worked fine, but there were a lot of edge cases and workarounds, which made the whole process a bit uncomfortable. The only really strong argument in favor of Org is the way it can include both code and prose in the same document. But maybe a similar thing could be implemented with LaTeX and some literate programming tool.

Tags: org-mode emacs

Org Mode Citation Links

I am writing my master's thesis in Org Mode, and export to LaTeX for publishing. For the most part, this works incredibly well. Using Org Mode instead of plain LaTeX means no more fiddly \backslash{curly brace} all over the place. No more scattering code fragments and markup across hundreds of files. And on top of that, deep integration with my research notes and task tracking system.

But not everything is perfect. For one thing, citations do not work well. Sure, you can always write \cite{cohen93}, but then you are writing LaTeX again. Also, all the other references and footnotes are clickable, highlighted Org Mode links, but \cite{cohen93} is just inline LaTeX.

But luckily, this is Emacs, and Emacs is programmable. And better yet, Org Mode has just the tool for the job:

(org-add-link-type "cite"
     (defun follow-cite (name)
       "Open bibliography and jump to appropriate entry.
        The document must contain \\bibliography{filename} somewhere
        for this to work"
       (find-file-other-window
        (save-excursion
          (goto-char (point-min))
          (save-match-data
            (re-search-forward "\\\\bibliography{\\([^}]+\\)}")
            (concat (match-string 1) ".bib"))))
       (goto-char (point-min))
       (search-forward name))
     (defun export-cite (path desc format)
       "Export [[cite:cohen93]] as \\cite{cohen93} in LaTeX."
       (if (eq format 'latex)
           ;; No description, or one that merely repeats the link
           ;; target, yields a plain \cite; otherwise the description
           ;; becomes the optional argument.
           (if (or (not desc) (string-prefix-p "cite:" desc))
               (format "\\cite{%s}" path)
             (format "\\cite[%s]{%s}" desc path)))))

This registers a new link type in Org Mode: [[cite:cohen93]], which will jump to the appropriate bibliography entry when clicked, and get exported as \cite{cohen93} in LaTeX. Awesome!
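In Org 9.0 and later, org-add-link-type is marked obsolete in favor of org-link-set-parameters. A sketch of the equivalent registration, assuming follow-cite and export-cite have already been defined as named functions like the ones above:

```elisp
;; Assumption: follow-cite and export-cite are already defined.
(require 'org)
(org-link-set-parameters "cite"
                         :follow #'follow-cite
                         :export #'export-cite)
```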

Tags: org-mode emacs

Speeding Up Org Mode Publishing

I use org-mode to write my blog, and org-publish as my static site generator. While this system works great, I have found it to be really, really slow. At this point, my blog has 39 posts, and org-publish will take upwards of a minute to re-generate all of them. To make matters worse, my workflow usually involves several re-generations per post. This gets old pretty quickly.

Since I am on a long train ride today, I decided to have a go at this problem. By the way, train rides and hacking on Emacs are a perfect match: Internet connectivity on trains is usually terrible, but Emacs is self-documenting, so internet access doesn't matter as much. It is sobering to work without an internet connection every once in a while, and Emacs is a perfect target for this kind of work.

One of the many things I learned on train rides is that Emacs in fact contains its own profiler! So, I ran (progn (profiler-start 'cpu) (org-publish "blog") (profiler-report)) to get a hierarchical list of where org-publish was spending its time. Turns out, most of its total run time was spent in functions relating to version control (starting with vc-).

Some package in my configuration set up vc-find-file-hook as part of find-file-hook. This means that every time org-publish opens a file, Emacs will look for the containing git repository and query its status. This takes forever! Worse yet, I don't even use vc-git at all. All my git interaction is done through magit.

But Emacs wouldn't be Emacs if this could not be fixed with a line or two of elisp. (remove-hook 'find-file-hook 'vc-find-file-hook) will do the trick. This brought the runtime of org-publish down to 15 seconds. Yay for profiling and yay for Emacs!
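A different technique with a similar effect, in case you would rather keep the hook in place for normal editing, is to switch off all version control backends only while publishing. This is a sketch of that alternative, not what I actually ran; it assumes the project is called "blog" as in the profiler call above:

```elisp
;; Sketch: temporarily disable every VC backend during publishing,
;; so opening files does not trigger repository lookups.
(require 'ox-publish)
(let ((vc-handled-backends nil))
  (org-publish "blog"))
```

Because vc-handled-backends is only let-bound, version control behaves as usual once publishing has finished.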