GreengageDB/madlib

MADlib® is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data.
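
To make "data-parallel" concrete, here is a minimal sketch in Python of the pattern such methods typically follow: each data partition folds its rows into a small state, the partial states are merged, and a final step turns the merged state into a result. This mirrors the transition/merge/final shape of a database user-defined aggregate. It is an illustration only and not code from MADlib; the names `State`, `transition`, `merge`, `final`, and `fit` are hypothetical, and simple one-variable linear regression was chosen purely because its state is small.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class State:
    """Sufficient statistics for fitting y = intercept + slope * x."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0


def transition(state, row):
    """Fold one (x, y) row into a partition-local state."""
    x, y = row
    return State(state.n + 1, state.sx + x, state.sy + y,
                 state.sxx + x * x, state.sxy + x * y)


def merge(a, b):
    """Combine two partial states; the result does not depend on order."""
    return State(a.n + b.n, a.sx + b.sx, a.sy + b.sy,
                 a.sxx + b.sxx, a.sxy + b.sxy)


def final(s):
    """Turn the merged state into (intercept, slope)."""
    denom = s.n * s.sxx - s.sx * s.sx
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    slope = (s.n * s.sxy - s.sx * s.sy) / denom
    intercept = (s.sy - slope * s.sx) / s.n
    return intercept, slope


def fit(partitions):
    """Hypothetical driver: each partition could run on a separate worker."""
    partials = [reduce(transition, part, State()) for part in partitions]
    return final(reduce(merge, partials, State()))
```

Because `merge` only adds counters, the rows can be split across partitions in any way without changing the fitted line. For example, points lying on y = 1 + 2x give an intercept of 1 and a slope of 2 however they are partitioned.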

Installation and Contribution

See the project website MADlib Home for links to the latest binary and source packages.

We appreciate all forms of contribution to MADlib, including bug reports, help for new users, documentation, and code patches. Please refer to the Contribution Guidelines for instructions.

For more installation and contribution guides, please refer to the MADlib Wiki.

Details on compiling from source on Linux are also on the wiki.

Detailed build instructions are available in ReadMe_Build.txt.

User and Developer Documentation

The latest documentation of MADlib modules can be found at MADlib Docs.

Architecture

The following block diagram gives a high-level overview of MADlib's architecture.

[Figure: MADlib Architecture]

Third Party Components

MADlib incorporates software from the following third-party components.

Bundled with source code:

  1. libstemmer "small string processing language"
  2. m_widen_init "allows compilation with recent versions of gcc with runtime dependencies from earlier versions of libstdc++"
  3. argparse 1.2.1 "provides an easy, declarative interface for creating command line tools"
  4. PyYAML 3.10 "YAML parser and emitter for Python"
  5. UseLATEX.cmake "CMAKE commands to use the LaTeX compiler"

Downloaded at build time (or supplied as build dependencies):

  1. Boost 1.61.0 (or newer) "provides peer-reviewed portable C++ source libraries"
  2. Eigen 3.2.2 "C++ template library for linear algebra"

Licensing

Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this project to You under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at LICENSE.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

As specified in LICENSE, additional license information regarding included third-party libraries can be found inside the licenses directory.

Release Notes

Changes between MADlib versions are described in the ReleaseNotes.txt file.

Papers and Talks

Related Software

  • PivotalR - PivotalR lets the user run the functions of the open-source big-data machine learning package MADlib directly from R.
  • PyMADlib - PyMADlib is a Python wrapper for MADlib, which combines the flexibility of Python with the number-crunching power of MADlib.

About

Greengage MADlib (based on Apache MADlib)
