release 2.1.0
stephenhky committed Dec 15, 2024
1 parent d1c5134 commit 9b50b62
Showing 9 changed files with 26 additions and 54 deletions.
2 changes: 1 addition & 1 deletion .readthedocs.yml
@@ -12,7 +12,7 @@ sphinx:
build:
os: ubuntu-22.04
tools:
-python: "3.9"
+python: "3.12"

# Build documentation with MkDocs
#mkdocs:
11 changes: 3 additions & 8 deletions README.md
@@ -18,7 +18,7 @@ representation of the texts and documents are needed before they are put into
any classification algorithm. In this package, it facilitates various types
of these representations, including topic modeling and word-embedding algorithms.

-The package `shorttext` runs on Python 3.8, 3.9, 3.10, and 3.11.
+The package `shorttext` runs on Python 3.9, 3.10, 3.11, and 3.12.
Characteristics:

- example data provided (including subject keywords and NIH RePORT);
@@ -31,8 +31,7 @@ Characteristics:
- maximum entropy classification;
- metrics of phrases differences, including soft Jaccard score (using Damerau-Levenshtein distance), and Word Mover's distance (WMD);
- character-level sequence-to-sequence (seq2seq) learning;
-- spell correction;
-- API for word-embedding algorithm for one-time loading; and
+- spell correction; and
- Sentence encodings and similarities based on BERT.

## Documentation
@@ -84,6 +83,7 @@ If you would like to contribute, feel free to submit the pull requests. You can

## News

+* 12/14/2024: `shorttext` 2.1.0 released.
* 07/12/2024: `shorttext` 2.0.0 released.
* 12/21/2023: `shorttext` 1.6.1 released.
* 08/26/2023: `shorttext` 1.6.0 released.
@@ -159,8 +159,3 @@ If you would like to contribute, feel free to submit the pull requests. You can
* 12/21/2016: `shorttext` 0.2.0 released.
* 11/25/2016: `shorttext` 0.1.2 released.
* 11/21/2016: `shorttext` 0.1.1 released.

-## Possible Future Updates
-
-- [ ] Dividing components to other packages;
-- [ ] More available corpus.
8 changes: 7 additions & 1 deletion docs/codes.rst
@@ -65,7 +65,13 @@ Module `shorttext.metrics.dynprog`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: shorttext.metrics.dynprog.jaccard
-:members: soft_intersection_list
+:members:
+
+.. automodule:: shorttext.metrics.dynprog.dldist
+:members:
+
+.. automodule:: shorttext.metrics.dynprog.lcp
+:members:

Module `shorttext.metrics.wassersterin`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
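The hunk above adds documentation for `shorttext.metrics.dynprog.dldist`, and the README describes the soft Jaccard score as using Damerau-Levenshtein distance. For readers unfamiliar with that metric, here is a minimal pure-Python sketch of the restricted variant (optimal string alignment). The function name `osa_distance` is hypothetical, and this is not taken from shorttext's own code, which may use a different variant or signature.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Illustrative only: not shorttext's implementation.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


if __name__ == "__main__":
    print(osa_distance("ab", "ba"))
    print(osa_distance("kitten", "sitting"))
```

The restricted variant is used here only because it fits in a few lines; it never edits the same substring twice, so it can differ from the unrestricted Damerau-Levenshtein distance on some inputs.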
2 changes: 1 addition & 1 deletion docs/install.rst
@@ -41,7 +41,7 @@ you may try one (or more) of the following:

::

-pip install -U python3-dev
+pip install python3-dev



1 change: 0 additions & 1 deletion docs/intro.rst
@@ -23,7 +23,6 @@ Characteristics:
- metrics of phrases differences, including soft Jaccard score (using Damerau-Levenshtein distance), and Word Mover's distance (WMD); (see :doc:`tutorial_metrics`)
- character-level sequence-to-sequence (seq2seq) learning; (see :doc:`tutorial_charbaseseq2seq`)
- spell correction; (see :doc:`tutorial_spell`)
-- API for word-embedding algorithm for one-time loading; (see :doc:`tutorial_wordembedAPI`) and
- Sentence encodings and similarities based on BERT (see :doc:`tutorial_wordembed` and :doc:`tutorial_metrics`).

Author: Kwan Yuet Stephen Ho (LinkedIn_, ResearchGate_, Twitter_)
8 changes: 8 additions & 0 deletions docs/news.rst
@@ -1,6 +1,7 @@
News
====

+* 12/14/2024: `shorttext` 2.1.0 released.
* 07/12/2024: `shorttext` 2.0.0 released.
* 12/21/2023: `shorttext` 1.6.1 released.
* 08/26/2023: `shorttext` 1.6.0 released.
@@ -81,6 +82,13 @@ News
What's New
----------

+Released 2.1.0 (December 14, 2024)
+------------------------------
+
+* Use of `pyproject.toml` for package distribution.
+* Removed Cython components.
+* Huge relative import refactoring.

Released 2.0.0 (July 13, 2024)
------------------------------

6 changes: 5 additions & 1 deletion docs/scripts.rst
@@ -12,14 +12,15 @@ ShortTextCategorizerConsole

usage: ShortTextCategorizerConsole [-h] [--wv WV] [--vecsize VECSIZE]
[--topn TOPN] [--inputtext INPUTTEXT]
+[--type TYPE]
model_filepath

Perform prediction on short text with a given trained model.

positional arguments:
model_filepath Path of the trained (compact) model.

-optional arguments:
+options:
-h, --help show this help message and exit
--wv WV Path of the pre-trained Word2Vec model. (None if not
needed)
@@ -28,6 +29,9 @@ ShortTextCategorizerConsole
--inputtext INPUTTEXT
single input text for classification. Run console if
set to None. (Default: None)
+--type TYPE Type of word-embedding model (default: "word2vec";
+other options: "fasttext", "poincare",
+"word2vec_nonbinary", "poincare_binary")


ShortTextWordEmbedSimilarity
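The help text in the `docs/scripts.rst` hunk above reads like `argparse` output, and the change from `optional arguments:` to `options:` matches the heading that `argparse` prints from Python 3.10 onward. As a rough illustration, a parser along the following lines would produce similar help. This is a sketch, not the script's actual source: the `build_parser` name, the `int` types for `--vecsize` and `--topn`, and the use of `choices` to restrict `--type` are all assumptions, and the help strings elided from the diff are left out rather than guessed.

```python
import argparse

# Values copied from the help text shown in the diff.
EMBEDDING_TYPES = [
    "word2vec",
    "fasttext",
    "poincare",
    "word2vec_nonbinary",
    "poincare_binary",
]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the ShortTextCategorizerConsole CLI."""
    parser = argparse.ArgumentParser(
        prog="ShortTextCategorizerConsole",
        description="Perform prediction on short text with a given trained model.",
    )
    parser.add_argument("model_filepath", help="Path of the trained (compact) model.")
    parser.add_argument(
        "--wv",
        default=None,
        help="Path of the pre-trained Word2Vec model. (None if not needed)",
    )
    # Types and defaults for these two options are assumptions; the diff elides them.
    parser.add_argument("--vecsize", type=int, default=None)
    parser.add_argument("--topn", type=int, default=None)
    parser.add_argument(
        "--inputtext",
        default=None,
        help="single input text for classification. Run console if set to None. "
             "(Default: None)",
    )
    # Restricting values with `choices` is an assumption made for this sketch.
    parser.add_argument(
        "--type",
        default="word2vec",
        choices=EMBEDDING_TYPES,
        metavar="TYPE",
        help='Type of word-embedding model (default: "word2vec")',
    )
    return parser


if __name__ == "__main__":
    # An explicit argument list keeps the demo independent of sys.argv.
    args = build_parser().parse_args(["model.bin", "--type", "fasttext"])
    print(args.model_filepath, args.type)
```

Setting `metavar="TYPE"` keeps the usage line as `[--type TYPE]`, matching the diff, instead of listing every allowed value inline.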
40 changes: 0 additions & 40 deletions docs/tutorial_wordembedAPI.rst

This file was deleted.

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "shorttext"
-version = "2.1.0a1"
+version = "2.1.0"
authors = [
{name = "Kwan Yuet Stephen Ho", email = "[email protected]"}
]
