FrancisBond edited this page Jun 9, 2006 · 17 revisions

Norwegian Japanese Machine Translation "NoJa"

Disclaimer: These pages are meant to be helpful, but that doesn't mean the authors will always be prepared to helpfully answer questions.

Running No/Ja

We recommend that you have at least 3GB of RAM. Even more memory wouldn't hurt.
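
As a rough way to check the memory recommendation before starting, the sketch below reads `MemTotal` from a Linux-style `/proc/meminfo` file. The function names `mem_total_kb` and `enough_memory` are made up for illustration and are not part of LOGON, and the check assumes a system that provides `/proc/meminfo`.

```shell
#!/bin/sh
# Minimum recommended memory in kB (3 GB), following the recommendation above.
MIN_KB=$((3 * 1024 * 1024))

# mem_total_kb [FILE]: print the MemTotal value (in kB) from a meminfo-style
# file. FILE defaults to /proc/meminfo (assumption: a Linux-like system).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# enough_memory [FILE]: succeed only if MemTotal is at least MIN_KB.
enough_memory() {
    total=$(mem_total_kb "${1:-}")
    [ -n "$total" ] && [ "$total" -ge "$MIN_KB" ]
}
```

For example, `enough_memory || echo "less than 3GB of RAM"` prints a warning on a machine that falls short.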

  1. start transfer and generation in one emacs (M-x noja)

  2. start a generator server

    • trollet
    • load ~/logon/dfki/jacy/lkb/script
    • index for generator
    • start server
  3. translate

    • (mt::parse-interactively "overset meg") (C-c r)

    • a parse window appears

    • select the parse (with next/previous) and click transfer

    • select the translation (with next/previous) and click generate

    • the translation should magically pop up in a little window

You can also run it as a batch (see LogonProcessing/BatchTranslation), with a bit more undocumented setup.

Set Up

Not all of the bits have been publicly released yet (as of 2006-06-09).

  • get a recent LOGON CVS
    • fulfill your licensing requirements
      • put your ACL licence in
      • delete any bits you shouldn't have
  • add in norsource
  • add in noja
  • set up various files
    • .bashrc
    • .emacs
    • logon/dot.tsbrc

.bashrc

LOGONROOT=~/logon
if [ -f ${LOGONROOT}/dot.bashrc ]; then
    . ${LOGONROOT}/dot.bashrc
fi

.emacs

;;;
;;; LOGON-specific settings

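;; NB: this definition shadows Emacs's built-in `log' (logarithm) function;
;; rename it, together with its callers below, if that causes trouble.
(defun log ()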
  (interactive)
  (if (getenv "LOGONROOT")
      (let ((logon (substitute-in-file-name "$LOGONROOT")))
        (if (file-exists-p (format "%s/dot.emacs" logon))
            (load (format "%s/dot.emacs" logon) nil t t)))))

(defun jacy ()
  (interactive)
  ;; set up logon
  (log)
  ;; load lisp
  (lisp)
  ;; make the encoding suitable for japanese (EUC-JP)
  (japanese)
  ;; load the common-lisp commands
  (insert (format ":ld %s/dot.clinit.cl\n" logon-root))
  (fi:inferior-lisp-newline)
  ;; load the machine translation controller
  (fi:eval-in-lisp "(lmt)")
  ;; load the tsdb settings
  (insert (format ":ld %s/dot.tsdbrc\n" logon-root))
  (fi:inferior-lisp-newline)
  ;;set tsdb home and skeleton home
  (insert "(tsdb::tsdb :home \"/home/bond/treebank/mrs\")")
  (fi:inferior-lisp-newline)
  (insert (format 
           "(tsdb::tsdb :skeleton \"%s/dfki/jacy/tsdb/skeletons\")"
           logon-root))
  (fi:inferior-lisp-newline)
  ;; load the grammar
  (insert 
   (format "(read-script-file-aux  \"%s/dfki/jacy/lkb/script\")" 
           logon-root))
  (fi:inferior-lisp-newline))


(defun norse ()
  (interactive)
  ;; set up logon
  (log)
  ;; load lisp
  (lisp)
  ;; load the common-lisp commands
  (insert (format ":ld %s/dot.clinit.cl\n" logon-root))
  (fi:inferior-lisp-newline)
  ;; load the machine translation controller
  (fi:eval-in-lisp "(lmt)")
  ;; load the tsdb settings
  (insert (format ":ld %s/dot.tsdbrc\n" logon-root))
  (fi:inferior-lisp-newline)
  ;;set tsdb home and skeleton home
  (insert "(tsdb::tsdb :home \"/home/bond/treebank/norse\")")
  (fi:inferior-lisp-newline)
  (insert (format 
           "(tsdb::tsdb :skeleton \"%s/ntnu/norsource/tsdb/skeletons\")"
           logon-root))
  (fi:inferior-lisp-newline)
  ;; load the grammar
  (insert 
   (format "(read-script-file-aux  \"%s/ntnu/norsource/lkb/scribet\")" 
           logon-root))
  (fi:inferior-lisp-newline))

(defun noja ()
  (interactive)
  ;; set up logon
  (log)
  ;; load lisp
  (lisp)
  ;; load the common-lisp commands
  (insert (format ":ld %s/dot.clinit.cl\n" logon-root))
  (fi:inferior-lisp-newline)
  ;; load the machine translation controller
  (fi:eval-in-lisp "(lmt)")
  ;; load the tsdb settings
  (insert (format ":ld %s/dot.tsdbrc\n" logon-root))
  (fi:inferior-lisp-newline)
  ;; load the parser
  (insert "(tsdb:tsdb :cpu :norse-parse :file t)")
  (fi:inferior-lisp-newline)
  ;; load the transfer grammar
  (insert 
   (format "(read-script-file-aux  \"%s/ntnu/noja/lkb/script\")" logon-root))
  (fi:inferior-lisp-newline))

logon/dot.tsdbrc

       ;;;
       ;;; for NoJa (Norsource/Jacy)
       ;;; 
       (make-cpu 
        :host (short-site-name)
        :spawn binary
        :options (list "-I" base "-qq" "-locale" "no_NO.UTF-8" 
                       "-L" (format nil "~a/ntnu/norse-parse.lisp" %logon%))
        :class :norse-parse :name "norse-parse" :grammar "Norsource"
        :task '(:parse) :wait wait :quantum quantum)       

logon/ntnu/norse-parse.lisp

(in-package :common-lisp-user)
;;
;; make sure we have enough space available
;;
(system:resize-areas :old 256 :new 256)
(let* ((logon (system:getenv "LOGONROOT"))
       (lingo (namestring (parse-namestring (format nil "~a/lingo" logon)))))
  ;;
  ;; load MK defsystem() and LinGO load-up library first
  ;;
  (load (format nil "~a/lingo/lkb/src/general/loadup" logon))
  ;;
  ;; for NorSource, we need (close to) the full scoop
  ;;
  (pushnew :lkb *features*)
  (pushnew :mrs *features*)
  (pushnew :tsdb *features*)
  (pushnew :logon *features*)
  (pushnew :slave *features*)
  (excl:tenuring 
   (funcall (intern "COMPILE-SYSTEM" :make) "tsdb")
   (funcall 
    (intern "READ-SCRIPT-FILE-AUX" :lkb)
    (format nil "~a/ntnu/norsource/lkb/scribet" logon)))
  (set (intern "*MAXIMUM-NUMBER-OF-EDGES*" :lkb) 10000)
  (excl:gc :tenure) (excl:gc) (excl:gc t) (excl:gc)
  (setf (sys:gsgc-parameter :auto-step) nil)
  (set (intern "*TSDB-SEMANTIX-HOOK*" :tsdb) "mrs::get-mrs-string")
  (funcall (symbol-function (find-symbol "SLAVE" :tsdb))))
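
Since the setup touches several files, a quick check that they are all in place can save a failed start. The following is a minimal sketch, not part of LOGON: the function name `check_noja_setup` is invented for illustration, and the file list is simply the set of paths mentioned on this page, taken relative to `$LOGONROOT`.

```shell
#!/bin/sh
# check_noja_setup [ROOT]: list the files named on this page that are missing
# under ROOT (defaults to $LOGONROOT). Returns non-zero if any are missing.
check_noja_setup() {
    root="${1:-${LOGONROOT:-}}"
    if [ -z "$root" ]; then
        echo "LOGONROOT is not set" >&2
        return 2
    fi
    missing=0
    for f in dot.bashrc dot.emacs dot.clinit.cl dot.tsdbrc \
             dfki/jacy/lkb/script \
             ntnu/norsource/lkb/scribet \
             ntnu/noja/lkb/script \
             ntnu/norse-parse.lisp
    do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $root/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_noja_setup` with no argument checks the tree under `$LOGONROOT`; it prints nothing when every file is found.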

To Do

Gain the admiration and respect of many by:

  • running things from the DELPH-IN CVS
  • running the parser using a cheap client instead of lisp
  • setting up the tsdb cpus as a list append...