Merge pull request #4 from anhaidgroup/rel_0.1.x
Rel 0.1.x
pavankm authored Jan 12, 2018
2 parents a83a966 + 145d0b3 commit ac3351c
Showing 20 changed files with 185 additions and 2,197 deletions.
29 changes: 11 additions & 18 deletions conda.recipe/meta.yaml
@@ -1,6 +1,6 @@
package:
name: py_labeler
version: "0.2.0"
version: "0.1.0"

source:
git_url: ../
@@ -10,31 +10,24 @@ requirements:
build:
- python
- setuptools
- py_stringsimjoin
- cloudpickle
- pyparsing
- scikit-learn
- pyqt
- pandas-profiling
# - xgboost
- pandas
- six
- jinja2

run:
- python
- py_stringsimjoin
- cloudpickle
- pyparsing
- scikit-learn
- pyqt
- pandas-profiling
# - xgboost
- setuptools
- pandas
- six
- jinja2

test:
# Python imports
imports:
- py_entitymatching
- py_labeler

# commands:
# You can put test commands to be run here. Use this to test that the
# You can putls test commands to be run here. Use this to test that the
# entry points work.


@@ -46,7 +39,7 @@ test:
# - nose

about:
home: https://sites.google.com/site/anhaidgroup/projects/magellan/py_entitymatching
home: https://sites.google.com/site/anhaidgroup/projects/magellan/py_labeler
license: BSD License
summary: 'Python library for entity matching.'
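
The ``test: imports:`` entry in the recipe asks the build tooling to import the listed module once the package is built. A rough local equivalent is sketched below. It is only an illustration: the helper name ``check_imports`` is hypothetical and this is not how conda-build itself is implemented.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; map each name to None or an error message."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # py_labeler is the module named in the updated recipe.
    for name, error in check_imports(["py_labeler"]).items():
        print(name, "OK" if error is None else "FAILED: " + error)
```

Running this inside the environment where the package was installed gives a quick signal that the renamed import in the recipe resolves.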

130 changes: 62 additions & 68 deletions docs/contributing.rst
@@ -1,40 +1,40 @@
.. _contributing:

*********************************
Contributing to magellan_labeler
Contributing to py_labeler
*********************************

.. contents:: Table of contents:
:local:
:local:

This document is adapted from `pandas how to contribute guidelines
<http://pandas.pydata.org/pandas-docs/stable/contributing.html>`_ for
*magellan_labeler* package.
*py_labeler* package.

Where to start?
===============

All contributions, bug reports, bug fixes, documentation improvements,
enhancements and ideas are welcome.

If you are simply looking to start working with the *magellan_labeler* codebase, navigate to the
`GitHub "issues" tab <https://github.com/anhaidgroup/magellan_labeler/issues>`_ and start looking through
If you are simply looking to start working with the *py_labeler* codebase, navigate to the
`GitHub "issues" tab <https://github.com/anhaidgroup/py_labeler/issues>`_ and start looking through
interesting issues.

Or maybe through using *magellan_labeler* you have an idea of your own or are looking for something
Or maybe through using *py_labeler* you have an idea of your own or are looking for something
in the documentation and thinking 'this can be improved'...you can do something
about it!

Feel free to ask questions on the `mailing list
<https://groups.google.com/forum/#!forum/py_entitymatching>`_
<https://groups.google.com/forum/#!forum/py_labeler>`_

Bug reports and enhancement requests
====================================

Bug reports are an important part of making *magellan_labeler* more stable.Having a
Bug reports are an important part of making *py_labeler* more stable.Having a
complete bug report will allow others to reproduce the bug and provide insight into
fixing. We use GitHub issue tracker to track bugs. It is important that you provide the
exact version of *magellan_labeler* where the bug is found. Trying the bug-producing
exact version of *py_labeler* where the bug is found. Trying the bug-producing
code out on the *master* branch is often a worthwhile exercise to confirm the bug still
exists. It is also worth searching existing bug reports and pull requests to see if the
issue has already been reported and/or fixed.
@@ -51,33 +51,33 @@ Bug reports must:
...
```

#. Include the full version string of *magellan_labeler*. You can find the version as follows::
#. Include the full version string of *py_labeler*. You can find the version as follows::

>>> import magellan_labeler as em
>>> em.__version__
>>> import py_labeler as pl
>>> pl.__version__

#. Explain why the current behavior is wrong/not desired and what you expect instead.
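
The version details a bug report asks for can be gathered in one step. The following is a minimal sketch, assuming the package exposes ``__version__`` as shown above; the helper name ``describe_environment`` is hypothetical and not part of *py_labeler*.

```python
import importlib
import platform
import sys


def describe_environment(package_name="py_labeler"):
    """Collect version details worth pasting into a bug report."""
    try:
        module = importlib.import_module(package_name)
        # Fall back to "unknown" if the package defines no __version__.
        package_version = getattr(module, "__version__", "unknown")
    except ImportError:
        package_version = "not installed"
    return {
        "package": package_name,
        "package_version": package_version,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```

Pasting the printed lines into the issue gives reviewers the package, Python, and platform versions together.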


The issue will then show up to the *magellan_labeler* community and be open to
The issue will then show up to the *py_labeler* community and be open to
comments/ideas from others.


Working with the code
=====================

Now that you have an issue you want to fix, enhancement to add, or documentation to
improve, you need to learn how to work with GitHub and the *magellan_labeler* code base.
improve, you need to learn how to work with GitHub and the *py_labeler* code base.

Version control, Git, and GitHub
--------------------------------

To the new user, working with Git is one of the more daunting aspects of contributing
to *magellan_labeler*. It can very quickly become overwhelming, but sticking to the
to *py_labeler*. It can very quickly become overwhelming, but sticking to the
guidelines below will help keep the process straightforward and mostly trouble free.
As always, if you are having difficulties please feel free to ask for help.

The code is hosted on `GitHub <https://www.github.com/anhaidgroup/magellan_labeler>`_. To
The code is hosted on `GitHub <https://www.github.com/anhaidgroup/py_labeler>`_. To
contribute you will need to sign up for a `free GitHub account
<https://github.com/signup/free>`_. We use `Git <http://git-scm.com/>`_ for
version control to allow many people to work together on the project.
@@ -99,16 +99,16 @@ you can work seamlessly between your local repository and GitHub.
Forking
-------

You will need your own fork to work on the code. Go to the `magellan_labeler project
page <https://github.com/anhaidgroup/magellan_labeler>`_ and hit the ``Fork`` button. You will
You will need your own fork to work on the code. Go to the `py_labeler project
page <https://github.com/anhaidgroup/py_labeler>`_ and hit the ``Fork`` button. You will
want to clone your fork to your machine::

git clone git@github.com:<your-user-name>/magellan_labeler.git <local-repo-name>
git clone git@github.com:<your-user-name>/py_labeler.git <local-repo-name>
cd <local-repo-name>
git remote add upstream git://github.com/anhaidgroup/magellan_labeler.git
git remote add upstream git://github.com/anhaidgroup/py_labeler.git

This creates the directory `local-repo-name` and connects your repository to
the upstream (main project) *magellan_labeler* repository.
the upstream (main project) *py_labeler* repository.

The testing suite will run automatically on Travis-CI once your pull request is
submitted. However, if you wish to run the test suite on a branch prior to
@@ -131,15 +131,15 @@ The above can be simplified to::

This changes your working directory to the *new_feature* branch. Keep any
changes in this branch specific to one bug or feature so it is clear
what the branch brings to *magellan_labeler*. You can have many new features
what the branch brings to *py_labeler*. You can have many new features
and switch in between them using the git checkout command.

To update this branch, you need to retrieve the changes from the master branch::

git fetch upstream
git rebase upstream/master

This will replay your commits on top of the lastest magellan_labeler git master. If this
This will replay your commits on top of the lastest py_labeler git master. If this
leads to merge conflicts, you must resolve them before submitting your pull
request. If you have uncommitted changes, you will need to ``stash`` them prior
to updating. This will effectively store your changes and they can be reapplied
@@ -150,54 +150,48 @@ after updating.
Creating a development environment
----------------------------------

An easy way to create a *magellan_labeler* development environment is as follows.
An easy way to create a *py_labeler* development environment is as follows.

- Install either :ref:`Anaconda <install.anaconda>` or :ref:`miniconda <install.miniconda>`
- Install ``virtualenv``
- Make sure that you have :ref:`cloned the repository <contributing.forking>`
- ``cd`` to the *magellan_labeler* source directory
- ``cd`` to the *py_labeler* source directory

Tell conda to create a new environment, named ``magellan_labeler_dev``, or any other
Tell virtualenv to create a new environment, named ``py_labeler_dev``, or any other
name you would like for this environment, by running::

conda create -n magellan_labeler_dev --file requirements.yml
virtualenv py_labeler_dev --python=python3.5 VIRTUALENV_PATH

Where *VIRTUALENV_PATH* is the directory in which the environment is to be created

For a python 3.6 environment::

For a python 3 environment::
virtualenv py_labeler_dev --python=python3.6 VIRTUALENV_PATH

conda create -n magellan_labeler_dev python=3 --file requirements.yml
Once this is done install the dependencies of *py_labeler* using

pip install -r requirements.txt

This will create the new environment, and not touch any of your existing environments,
nor any existing python installation. It will install all of the basic dependencies of
*magellan_labeler*. You need to install the *nose* package which is used for
*py_labeler*. You need to install the *nose* package which is used for
testing, as follows::

conda install -n magellan_labeler_dev nose

To work in this environment, Windows users should ``activate`` it as follows::

activate magellan_labeler_dev
pip install nose

Mac OSX / Linux users should use::
To work in this environment::

source activate magellan_labeler_dev
source VIRTUALENV_PATH/bin/activate

You will then see a confirmation message to indicate you are in the new development environment.

To view your environments::

conda info -e

To return to your home root environment in Windows::
To return to your home root environment::

deactivate

To return to your home root environment in OSX / Linux::

source deactivate

See the full conda docs `here <http://conda.pydata.org/docs>`__.
See the full virtualenv docs `here <https://virtualenv.pypa.io/en/stable/>`__.

Alternatively you could use virtualenvwrapper which is a set of extensions to virtualenv. Refer to the documentation `here <https://virtualenvwrapper.readthedocs.io>`__.
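
If neither ``virtualenv`` nor virtualenvwrapper is available, Python 3 ships a standard-library ``venv`` module that creates a comparable isolated environment. The sketch below uses ``venv`` rather than the ``virtualenv`` tool described above, so treat it as an alternative and not as the documented workflow; the function name ``create_dev_env`` and the temporary target directory are assumptions made for illustration.

```python
import os
import tempfile
import venv


def create_dev_env(env_dir):
    """Create a bare virtual environment at env_dir and return its path."""
    # with_pip=False keeps the sketch fast and offline; pass with_pip=True
    # to have pip bootstrapped into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "py_labeler_dev")
    create_dev_env(target)
    print("created", target)
```

The resulting environment is activated the same way as a ``virtualenv`` one, by sourcing ``bin/activate`` inside the created directory on Linux or macOS.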

.. _contributing.documentation:

@@ -206,7 +200,7 @@ Contributing to the documentation

If you're not the developer type, contributing to the documentation is still
of huge value. You don't even have to be an expert on
*magellan_labeler* to do so! Something as simple as rewriting small passages for clarity
*py_labeler* to do so! Something as simple as rewriting small passages for clarity
as you reference the docs is a simple but effective way to contribute. The
next person to read that passage will be in your debt!

@@ -218,7 +212,7 @@ help the next person.
.. contents:: Documentation:
:local:

About the *magellan_labeler* documentation
About the *py_labeler* documentation
-------------------------------------------

The documentation is written in **reStructuredText**, which is almost like writing
@@ -229,8 +223,8 @@ complex changes to the documentation as well.

Some other important things to know about the docs:

- The *magellan_labeler* documentation consists of two parts: the docstrings in the code
itself and the docs in this folder ``magellan_labeler/docs/``.
- The *py_labeler* documentation consists of two parts: the docstrings in the code
itself and the docs in this folder ``py_labeler/docs/``.

The docstrings provide a clear explanation of the usage of the individual
functions, while the documentation in this folder consists of tutorial-like
@@ -244,28 +238,28 @@ Some other important things to know about the docs:
extend it in a similar manner.


How to build the *magellan_labeler* documentation
How to build the *py_labeler* documentation
--------------------------------------------------

Requirements
~~~~~~~~~~~~

To build the *magellan_labeler* docs there are some extra requirements: you will need to
To build the *py_labeler* docs there are some extra requirements: you will need to
have ``sphinx`` and ``ipython`` installed.

It is easiest to :ref:`create a development environment <contributing.dev_env>`, then install::

conda install -n magellan_labeler_dev sphinx ipython
pip install py_labeler_dev sphinx ipython

Building the documentation
~~~~~~~~~~~~~~~~~~~~~~~~~~

So how do you build the docs? Navigate to your local
``magellan_labeler/docs/`` directory in the console and run::
``py_labeler/docs/`` directory in the console and run::

make html

Then you can find the HTML output in the folder ``magellan_labeler/docs/_build/html/``.
Then you can find the HTML output in the folder ``py_labeler/docs/_build/html/``.

If you want to do a full clean build, do::

@@ -283,35 +277,35 @@ Contributing to the code base

Code standards
--------------
*magellan_labeler* follows `Google Python Style Guide <https://google.github.io/styleguide/pyguide.html>`_.
*py_labeler* follows `Google Python Style Guide <https://google.github.io/styleguide/pyguide.html>`_.

Please try to maintain backward compatibility. *magellan_labeler* has lots of users with lots of
Please try to maintain backward compatibility. *py_labeler* has lots of users with lots of
existing code, so don't break it if at all possible. If you think breakage is required,
clearly state why as part of the pull request. Also, be careful when changing method
signatures and add deprecation warnings where needed.

Writing tests
-------------
Adding tests is one of the most common requests after code is pushed to *magellan_labeler*. Therefore,
Adding tests is one of the most common requests after code is pushed to *py_labeler*. Therefore,
it is worth getting in the habit of writing tests ahead of time so this is never an issue.

Unit testing
~~~~~~~~~~~~
Like many packages, *magellan_labeler* uses the `Nose testing system
Like many packages, *py_labeler* uses the `Nose testing system
<http://nose.readthedocs.org/en/latest/index.html>`_.

All tests should go into the ``tests`` subdirectory of the specific package.
This folder contains many current examples of tests, and we suggest looking to these for
inspiration.

The tests can then be run directly inside your Git clone (without having to
install *magellan_labeler*) by typing::
install *py_labeler*) by typing::

nosetests



Contributing your changes to *magellan_labeler*
Contributing your changes to *py_labeler*
================================================

Committing your code
@@ -373,12 +367,12 @@ You can see the remote repositories::
If you added the upstream repository as described above you will see something
like::

origin git@github.com:<yourname>/magellan_labeler.git (fetch)
origin git@github.com:<yourname>/magellan_labeler.git (push)
upstream git://github.com/anhaidgroup/magellan_labeler.git (fetch)
upstream git://github.com/anhaidgroup/magellan_labeler.git (push)
origin git@github.com:<yourname>/py_labeler.git (fetch)
origin git@github.com:<yourname>/py_labeler.git (push)
upstream git://github.com/anhaidgroup/py_labeler.git (fetch)
upstream git://github.com/anhaidgroup/py_labeler.git (push)

Now your code is on GitHub, but it is not yet a part of the *magellan_labeler* project. For that to
Now your code is on GitHub, but it is not yet a part of the *py_labeler* project. For that to
happen, a pull request needs to be submitted on GitHub.

Review your code
@@ -389,7 +383,7 @@ again make sure that you have followed all the guidelines outlined in this docum
regarding code style, tests, performance tests, and documentation. You should also
double check your branch changes against the branch it was based on:

#. Navigate to your repository on GitHub -- https://github.com/<your-user-name>/magellan_labeler
#. Navigate to your repository on GitHub -- https://github.com/<your-user-name>/py_labeler
#. Click on ``Branches``
#. Click on the ``Compare`` button for your feature branch
#. Select the ``base`` and ``compare`` branches, if necessary. This will be ``master`` and