# Introduction

Groff is an ancient but powerful typesetting system that produces formatted output from plain text mixed with formatting commands. With the mom macro set, one can easily compose a PDF file from a plain-text file.

    groff -mom input.mom -Tpdf > output.pdf

input.mom

    .PRINTSTYLE TYPESET
    .TAB_SET 1 0 5P
    .TAB_SET 2 6P 38P
    .TAB_SET 3 22P 22P
    .START
    .MCO
    .TAB 1
    .FONT B
    Intro
    .MCR
    .TAB 2
    .FONT R
    Some introductory sentence...
    .MCX
    .MCO
    .TAB 1
    .FONT B
    Content
    .MCR
    .TAB 2
    .FONT R
    Formal content starts from here...
    .MCX

output.pdf

Neat!

# My Situation

My problem is that I need to write documents in Chinese, which seems to be a rare use case: I couldn’t find anyone discussing it through the search engines. After some digging and trial and error, I finally figured out the ultimate solution.

(groff spits out gibberish for Chinese characters)

# The Solution

## File Encoding

By default, groff does not decode its input as UTF-8 (it expects a legacy 8-bit encoding), whereas most distros nowadays use UTF-8 as the default text file encoding. That is why our editors can read and write Unicode without any problem while groff garbles the very same file. Fortunately, we can change this behaviour with the -Kutf8 option.

    groff -Kutf8 -ms input.ms -Tpdf > output.pdf
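
Since -Kutf8 only helps when the file really is UTF-8, it can be worth checking that first. The following is a minimal sketch, assuming a POSIX shell with `iconv` available; the helper name `is_utf8` is made up for illustration and is not part of groff.

```shell
# is_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
# (Hypothetical helper; it relies on iconv failing on invalid input.)
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

It could be chained in front of the command above, for example `is_utf8 input.ms && groff -Kutf8 -ms input.ms -Tpdf > output.pdf`.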

## Font

Then I received some complaints from groff.

    troff: input.ms:12: warning: can't find special character 'u4E00'
    troff: input.ms:12: warning: can't find special character 'u6BB5'
    troff: input.ms:12: warning: can't find special character 'u4E2D'
    troff: input.ms:12: warning: can't find special character 'u6587'
    troff: input.ms:12: warning: can't find special character 'u63CF'
    troff: input.ms:12: warning: can't find special character 'u8FF0'
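
The names in these warnings are Unicode code points (u4E00 is U+4E00, and so on). As a rough sketch, assuming a POSIX shell with `sed` and `sort`, a small filter can collect the distinct code points groff could not find; the function name `missing_glyphs` is hypothetical.

```shell
# missing_glyphs: read troff warnings on stdin and print each distinct
# missing code point once, in the form U+XXXX.
missing_glyphs() {
    sed -n "s/.*can't find special character 'u\([0-9A-Fa-f]*\)'.*/U+\1/p" | sort -u
}
```

One possible way to feed it is `groff -Kutf8 -ms input.ms -Tps 2>&1 >/dev/null | missing_glyphs`, which sends only the warnings into the filter.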

These warnings meant that groff could now read the input file correctly but failed to find the glyphs in its font files. That was weird to me, because I definitely had some Chinese fonts installed on my system. It turned out that groff uses its own font library, which, judging by the file sizes, certainly doesn’t contain any CJK glyphs.

    /usr/share/groff/current/font/devps
    total 604K
    -rw-r--r-- 1 root root  11K Mar 21  2020 AB
    -rw-r--r-- 1 root root  13K Mar 21  2020 ABI
    -rw-r--r-- 1 root root  13K Mar 21  2020 AI
    -rw-r--r-- 1 root root  11K Mar 21  2020 AR
    -rw-r--r-- 1 root root  11K Mar 21  2020 BMB
    -rw-r--r-- 1 root root  13K Mar 21  2020 BMBI
    -rw-r--r-- 1 root root  12K Mar 21  2020 BMI
    -rw-r--r-- 1 root root 9.9K Mar 21  2020 BMR
    -rw-r--r-- 1 root root 6.4K Mar 21  2020 CB
    -rw-r--r-- 1 root root 8.6K Mar 21  2020 CBI
    -rw-r--r-- 1 root root 8.6K Mar 21  2020 CI
    -rw-r--r-- 1 root root 6.3K Mar 21  2020 CR
    -rw-r--r-- 1 root root  200 Mar 21  2020 DESC
    -rw-r--r-- 1 root root  141 Mar 21  2020 download
    -rw-r--r-- 1 root root 1.2K Mar 21  2020 EURO
    -rw-r--r-- 1 root root 1.5K Mar 21  2020 freeeuro.afm
    -rw-r--r-- 1 root root  21K Mar 21  2020 freeeuro.pfa
    drwxr-xr-x 2 root root 4.0K Aug  1 00:29 generate/
    -rw-r--r-- 1 root root  18K Mar 21  2020 HB
    -rw-r--r-- 1 root root  20K Mar 21  2020 HBI
    -rw-r--r-- 1 root root  21K Mar 21  2020 HI
    -rw-r--r-- 1 root root  18K Mar 21  2020 HNB
    -rw-r--r-- 1 root root  20K Mar 21  2020 HNBI
    -rw-r--r-- 1 root root  21K Mar 21  2020 HNI
    -rw-r--r-- 1 root root  19K Mar 21  2020 HNR
    -rw-r--r-- 1 root root  19K Mar 21  2020 HR
    -rw-r--r-- 1 root root  13K Mar 21  2020 NB
    -rw-r--r-- 1 root root  20K Mar 21  2020 NBI
    -rw-r--r-- 1 root root  17K Mar 21  2020 NI
    -rw-r--r-- 1 root root  15K Mar 21  2020 NR
    -rw-r--r-- 1 root root  11K Mar 21  2020 PB
    -rw-r--r-- 1 root root  13K Mar 21  2020 PBI
    -rw-r--r-- 1 root root  13K Mar 21  2020 PI
    -rw-r--r-- 1 root root  11K Mar 21  2020 PR
    -rw-r--r-- 1 root root 3.0K Mar 21  2020 prologue
    -rw-r--r-- 1 root root 6.2K Mar 21  2020 S
    -rw-r--r-- 1 root root 7.9K Mar 21  2020 SS
    -rw-r--r-- 1 root root  606 Mar 21  2020 symbolsl.pfa
    -rw-r--r-- 1 root root 8.9K Mar 21  2020 TB
    -rw-r--r-- 1 root root  11K Mar 21  2020 TBI
    -rw-r--r-- 1 root root 2.5K Mar 21  2020 text.enc
    -rw-r--r-- 1 root root  11K Mar 21  2020 TI
    -rw-r--r-- 1 root root 8.8K Mar 21  2020 TR
    -rw-r--r-- 1 root root 4.3K Mar 21  2020 zapfdr.pfa
    -rw-r--r-- 1 root root  15K Mar 21  2020 ZCMI
    -rw-r--r-- 1 root root 5.4K Mar 21  2020 ZD
    -rw-r--r-- 1 root root 5.4K Mar 21  2020 ZDR

In this listing, AR, AB, AI and ABI are the Regular, Bold, Italic and BoldItalic styles of family A (Avant Garde in groff’s PostScript font set). We can use them like this:

    .FAMILY A
    .FONT R
    Regular Avant Garde Text
    .FONT B
    Bold Avant Garde Text
    .FONT I
    Italic Avant Garde Text
    .FONT BI
    BoldItalic Avant Garde Text
    .FONT R
    Go back to Regular

The font format that groff uses is also obsolete, so it’s hard to find fonts in that format. We therefore need [FontForge](https://fontforge.org/) to convert TrueType/OpenType fonts into something groff can understand.

I decided to try adding [Sarasa-Mono-SC](https://github.com/be5invis/Sarasa-Gothic) to the groff font library, so I downloaded and extracted the following files:

    -rw-rw-r-- 1 klesh klesh 23M Nov 21 19:26 sarasa-mono-sc-bolditalic.ttf
    -rw-rw-r-- 1 klesh klesh 22M Nov 21 19:26 sarasa-mono-sc-bold.ttf
    -rw-rw-r-- 1 klesh klesh 23M Nov 21 19:26 sarasa-mono-sc-italic.ttf
    -rw-rw-r-- 1 klesh klesh 23M Nov 26 16:27 sarasa-mono-sc-regular.ttf

### Conversion

In general, we have to convert each TTF into two files:

1. A groff font file. For sarasa-mono-sc-regular.ttf, it should be named something like SarasaMonoSCR. We can then refer to this font family as SarasaMonoSC in our groff source file.
2. A PostScript Type 1 or Type 42 font file (ending with .pfa or .t42 respectively). I would suggest the Type 42 format, which has a smaller file size and therefore leads to faster compilation.
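
The naming convention in item 1 (family name plus an R, B, I or BI suffix, as with AR/AB/AI/ABI above) can be sketched as a small shell function. This is only an illustration: `groff_font_name` and the lowercase style keywords are assumptions of mine, chosen to match the style part of the Sarasa file names.

```shell
# groff_font_name FAMILY STYLE: print the groff font file name for a style.
# STYLE is one of: regular, bold, italic, bolditalic (hypothetical keywords).
groff_font_name() {
    family=$1
    style=$2
    case $style in
        regular)    suffix=R ;;
        bold)       suffix=B ;;
        italic)     suffix=I ;;
        bolditalic) suffix=BI ;;
        *) echo "unknown style: $style" >&2; return 1 ;;
    esac
    printf '%s%s\n' "$family" "$suffix"
}
```

For instance, `groff_font_name SarasaMonoSC regular` prints SarasaMonoSCR, the name used in the steps that follow.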

Step by step:

• Open our CJK font file sarasa-mono-sc-regular.ttf with FontForge
• Go to File -> Generate Fonts, select PS Type 1 (Ascii), and make sure Output AFM is checked in the Options dialog
• Click the Generate button; we will get 2 new files, Sarasa-Mono-SC-Regular.pfa and Sarasa-Mono-SC-Regular.afm
• Change the output type to Type 42 and generate a file named Sarasa-Mono-SC-Regular.t42
• Generate the groff font file from the .afm file

    afmtodit Sarasa-Mono-SC-Regular.afm "/usr/share/groff/current/font/devps/generate/textmap" SarasaMonoSCR

### Installation

In general, we have to copy the generated files to groff’s font search directory and register them as downloadable font files.

Step by step:

• Copy SarasaMonoSCR and Sarasa-Mono-SC-Regular.t42 to /usr/share/groff/site-font/devps
• Add the following line to /usr/share/groff/current/font/devps/download, replacing <TAB> with a literal Tab character

    ...
    Sarasa-Mono-SC-Regular<TAB>Sarasa-Mono-SC-Regular.t42

With that, groff can now embed our font when outputting a PostScript file.
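
Typing a literal Tab by hand is easy to get wrong, so the registration step could also be scripted. Below is a minimal sketch, assuming a POSIX shell; `register_font` is a hypothetical helper that appends a tab-separated entry to a download file unless the exact line is already present.

```shell
# register_font DOWNLOAD_FILE PS_NAME FONT_FILE
# Append "PS_NAME<TAB>FONT_FILE" to DOWNLOAD_FILE, skipping exact duplicates.
register_font() {
    download_file=$1
    ps_name=$2
    font_file=$3
    entry=$(printf '%s\t%s' "$ps_name" "$font_file")
    if ! grep -qxF -- "$entry" "$download_file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$download_file"
    fi
}
```

With write permission on the file (probably via sudo), a call might look like `register_font /usr/share/groff/current/font/devps/download Sarasa-Mono-SC-Regular Sarasa-Mono-SC-Regular.t42`.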

Now, to make it work when outputting a PDF file directly with the -Tpdf option, we would also need to register the font for devpdf, but we are not going to do that, because that approach produces a huge PDF with all used fonts embedded. A better way is to use ps2pdf to create an optimized PDF file:

    groff -mom -Kutf8 input.mom > tmp.ps
    ps2pdf tmp.ps output.pdf
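
The two commands can be wrapped into one helper so the intermediate PostScript file does not linger. This is only a sketch under the assumption of a POSIX shell with `mktemp`; the names `pdf_name` and `mom2pdf` are made up for illustration.

```shell
# pdf_name FILE: derive the PDF name, e.g. input.mom -> input.pdf.
# A name without the .mom suffix simply gets .pdf appended.
pdf_name() {
    printf '%s.pdf\n' "${1%.mom}"
}

# mom2pdf INPUT [OUTPUT]: typeset INPUT to PostScript, then convert to PDF.
mom2pdf() {
    src=$1
    out=${2:-$(pdf_name "$src")}
    tmp=$(mktemp) || return 1
    if groff -mom -Kutf8 "$src" > "$tmp" && ps2pdf "$tmp" "$out"; then
        status=0
    else
        status=1
    fi
    rm -f "$tmp"   # remove the intermediate PostScript either way
    return $status
}
```

Running `mom2pdf input.mom` would then write input.pdf next to the source, while `mom2pdf input.mom output.pdf` mirrors the commands above.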

### Other styles

If your font comes with all styles, as Sarasa does, you are in luck: convert and install them and you are good to go. Some fonts provide a Regular version only, in which case you need to generate the other styles yourself. FontForge has Change Weight and Italic/Oblique commands under the Element->Style menu that can help with that. Use Edit->Select to highlight your characters, then apply the style commands you need.

### Live preview

Live preview is helpful when composing a complex layout. First, we need an auto-compilation mechanism, so I added a script to my .vimrc that toggles an auto-compile mode:

    " Run a shell command (% is replaced by the current file name) and
    " show its output only when the command fails.
    fu! SilentOK(cmd)
      let l:output = system(substitute(a:cmd, "%", expand("%"), "g"))
      if v:shell_error != 0
        echo l:output
      endif
    endfu

    fu! ToggleGroffMomAutoCompilation()
      let g:GroffMomAutoCompilation = !get(g:, "GroffMomAutoCompilation", 0)

      " Always clear the group first so the autocmd is never registered twice
      augroup GroffPdf
        autocmd!
      augroup END

      if g:GroffMomAutoCompilation
        augroup GroffPdf
          autocmd BufWritePost,FileWritePost *.mom :call SilentOK("groff -Kutf8 -mom % > /tmp/tmp.ps && ps2pdf /tmp/tmp.ps %.pdf")
        augroup END
        echo "Auto compilation for groff_mom enabled"
      else
        echo "Auto compilation for groff_mom disabled"
      endif
    endfu

    nnoremap <leader>ac :call ToggleGroffMomAutoCompilation()<CR>

    " syntax highlighting: https://github.com/vim-scripts/mom.vim
    autocmd BufEnter,BufRead *.mom :set ft=mom

Then we can launch any PDF viewer we like for the previewing part. I would suggest something like zathura, which reloads automatically when the file changes. Or you can take a look at entr to automate the reloading if your viewer of choice doesn’t support auto-reloading.

Another problem I ran into was that auto-compilation took too long to complete. The reason was that the Sarasa fonts are huge: each style could take up to 40M. When multiple styles were used, the output .ps file could easily grow beyond 100M. So I finally opted for a custom font consisting of DroidSansFallback + URW Gothic Bookman + FontAwesome.

If you just want to get a taste of it, WenQuanYi Micro Hei would be a better choice. Most distros have this font in their official repositories, but its Korean part might be broken, which shows up as characters stacked on top of each other. Fortunately, Ubuntu has fixed this problem; on other distros, we can download the deb package and extract the font file from it.

### Usage

Finally, I can now compose PDF files happily.

input.mom

    .PRINTSTYLE TYPESET
    .FAMILY SarasaMonoSC
    .TAB_SET 1 0 5P
    .TAB_SET 2 6P 38P
    .TAB_SET 3 22P 22P
    .START
    .MCO
    .TAB 1
    .FONT B
    简介
    .MCR
    .TAB 2
    .FONT R
    这是一段中文
    .MCX
    .MCO
    .TAB 1
    .FONT B
    正文
    .MCR
    .TAB 2
    .FONT R
    内容从这里开始
    .MCX

output.pdf

# Take away

To avoid redundant labor, I created a couple of scripts to automate the process:

• unttc: extracts ttf files from a ttc file
• generate_fontstyle: generates Bold/Italic versions based on the Regular version
• groff_ttf.sh: adds a ttf to the groff font library