
jilinhuang/meta

 
 


MeTA: ModErn Text Analysis

Please visit our web page for information about MeTA!

Overview

MeTA is a modern C++ data sciences toolkit featuring

  • text tokenization, including deep semantic features like parse trees

  • inverted and forward indexes with compression and various caching strategies

  • various ranking functions for the indexes

  • topic modeling algorithms

  • language modeling algorithms

  • clustering and similarity algorithms

  • classification algorithms

  • wrappers for liblinear and slda

Doxygen documentation can be found here. Note that it is probably updated less often than it should be.

Our current goal for MeTA is to publish it in the Machine Learning Open Source Software track of JMLR.

Build Status (by branch)

  • master: Build Status
  • develop: Build Status

Project setup

  • This project requires a highly conformant C++11 compiler. Currently, clang is the de facto compiler for use with this project.

  • Additionally, you will need a conformant implementation of the C++11 standard library and ABI; currently, libc++ and libc++abi are the best options. See your distribution's package manager for more information on installing these dependencies.

  • Windows users: your mileage may vary. Windows is not currently supported, but things may work. You will likely need Visual Studio 2013 for the C++11 features.

  • This project makes use of several git submodules. To initialize them, run:

git submodule init
git submodule update
  • Once the submodules are initialized, go to deps/libsvm-modules and run make in the liblinear and libsvm directories if you plan on using the svm_wrapper class.

  • To compile for the first time, run the following commands:

mkdir build
cd build
# omit CXX=clang++ if you want to use your default compiler
CXX=clang++ cmake ../ -DCMAKE_BUILD_TYPE=Debug
make
  • There are also make rules for clean, tidy, and doc. After you have run the cmake command once, you can just run make as usual while developing; it detects when the CMakeLists.txt file has changed and regenerates the Makefiles if needed.
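
Since the setup steps above depend on a conformant C++11 compiler and standard library, it can help to sanity-check the toolchain before building. The following is a minimal sketch, not part of MeTA: the file name and the function names (count_words, make_sorted, cxx11_smoke_test) are hypothetical and chosen only for illustration. It exercises a handful of C++11 features (auto, range-based for, lambdas, move semantics, std::unique_ptr, std::unordered_map); passing it does not guarantee that MeTA itself will build.

```cpp
// cxx11_check.cpp: hypothetical file name, not part of the MeTA source tree.
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// MSVC reports an old __cplusplus value by default, so it is excluded here.
#if __cplusplus < 201103L && !defined(_MSC_VER)
#error "This compiler is not running in C++11 (or later) mode"
#endif

// Counts word occurrences using auto, range-based for, and
// std::unordered_map (all C++11 additions).
std::unordered_map<std::string, int>
count_words(const std::vector<std::string>& words)
{
    std::unordered_map<std::string, int> counts;
    for (const auto& word : words)
        ++counts[word];
    return counts;
}

// Sorts with a lambda comparator, then moves the result into a
// heap-allocated vector owned by a std::unique_ptr.
std::unique_ptr<std::vector<int>> make_sorted(std::vector<int> values)
{
    std::sort(values.begin(), values.end(),
              [](int a, int b) { return a < b; });
    assert(std::is_sorted(values.begin(), values.end()));
    return std::unique_ptr<std::vector<int>>(
        new std::vector<int>(std::move(values)));
}

// Returns true if the features above behave as expected.
bool cxx11_smoke_test()
{
    const auto counts = count_words({"index", "rank", "index"});
    const auto sorted = make_sorted({3, 1, 2});
    return counts.at("index") == 2 && counts.at("rank") == 1
           && sorted->front() == 1 && sorted->back() == 3;
}
```

Calling cxx11_smoke_test() from a small main and building with something like clang++ -std=c++11 -stdlib=libc++ would exercise both the compiler and the standard library. If that fails to compile or link, the cmake build is unlikely to succeed either.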


License

MIT and NCSA licenses; see LICENSE.mit and LICENSE.ncsa.
