release 2.1.0

stephenhky · stephenhky · commit 9b50b627b469 · 2024-12-14T21:30:03.000-05:00
diff --git a/.readthedocs.yml b/.readthedocs.yml
@@ -12,7 +12,7 @@ sphinx:
 build:
   os: ubuntu-22.04
   tools:
-    python: "3.9"
+    python: "3.12"
 
 # Build documentation with MkDocs
 #mkdocs:
diff --git a/README.md b/README.md
@@ -18,7 +18,7 @@ representation of the texts and documents are needed before they are put into
 any classification algorithm. In this package, it facilitates various types
 of these representations, including topic modeling and word-embedding algorithms.
 
-The package `shorttext` runs on Python 3.8, 3.9, 3.10, and 3.11.
+The package `shorttext` runs on Python 3.9, 3.10, 3.11, and 3.12.
 Characteristics:
 
 - example data provided (including subject keywords and NIH RePORT);
@@ -31,8 +31,7 @@ Characteristics:
 - maximum entropy classification;
 - metrics of phrases differences, including soft Jaccard score (using Damerau-Levenshtein distance), and Word Mover's distance (WMD);
 - character-level sequence-to-sequence (seq2seq) learning; 
-- spell correction;
-- API for word-embedding algorithm for one-time loading; and
+- spell correction; and
 - Sentence encodings and similarities based on BERT.
 
 ## Documentation
@@ -84,6 +83,7 @@ If you would like to contribute, feel free to submit the pull requests. You can
 
 ## News
 
+* 12/14/2024: `shorttext` 2.1.0 released.
 * 07/12/2024: `shorttext` 2.0.0 released.
 * 12/21/2023: `shorttext` 1.6.1 released.
 * 08/26/2023: `shorttext` 1.6.0 released.
@@ -159,8 +159,3 @@ If you would like to contribute, feel free to submit the pull requests. You can
 * 12/21/2016: `shorttext` 0.2.0 released.
 * 11/25/2016: `shorttext` 0.1.2 released.
 * 11/21/2016: `shorttext` 0.1.1 released.
-
-## Possible Future Updates
-
-- [ ] Dividing components to other packages;
-- [ ] More available corpus.
diff --git a/docs/codes.rst b/docs/codes.rst
@@ -65,7 +65,13 @@ Module `shorttext.metrics.dynprog`
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 .. automodule:: shorttext.metrics.dynprog.jaccard
-   :members: soft_intersection_list
+   :members:
+
+.. automodule:: shorttext.metrics.dynprog.dldist
+   :members:
+
+.. automodule:: shorttext.metrics.dynprog.lcp
+   :members:
 
 Module `shorttext.metrics.wassersterin`
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/docs/install.rst b/docs/install.rst
@@ -41,7 +41,7 @@ you may try one (or more) of the following:
 
 ::
 
-    pip install -U python3-dev
+    pip install python3-dev
 
 
 
diff --git a/docs/intro.rst b/docs/intro.rst
@@ -23,7 +23,6 @@ Characteristics:
 - metrics of phrases differences, including soft Jaccard score (using Damerau-Levenshtein distance), and Word Mover's distance (WMD); (see :doc:`tutorial_metrics`)
 - character-level sequence-to-sequence (seq2seq) learning; (see :doc:`tutorial_charbaseseq2seq`)
 - spell correction; (see :doc:`tutorial_spell`)
-- API for word-embedding algorithm for one-time loading; (see :doc:`tutorial_wordembedAPI`) and
 - Sentence encodings and similarities based on BERT (see :doc:`tutorial_wordembed` and :doc:`tutorial_metrics`).
 
 Author: Kwan Yuet Stephen Ho (LinkedIn_, ResearchGate_, Twitter_)
diff --git a/docs/news.rst b/docs/news.rst
@@ -1,6 +1,7 @@
 News
 ====
 
+* 12/14/2024: `shorttext` 2.1.0 released.
 * 07/12/2024: `shorttext` 2.0.0 released.
 * 12/21/2023: `shorttext` 1.6.1 released.
 * 08/26/2023: `shorttext` 1.6.0 released.
@@ -81,6 +82,13 @@ News
 What's New
 ----------
 
+Released 2.1.0 (December 14, 2024)
+------------------------------
+
+* Use of `pyproject.toml` for package distribution.
+* Removed Cython components.
+* Huge relative import refactoring.
+
 Released 2.0.0 (July 13, 2024)
 ------------------------------
 
diff --git a/docs/scripts.rst b/docs/scripts.rst
@@ -12,14 +12,15 @@ ShortTextCategorizerConsole
 
     usage: ShortTextCategorizerConsole [-h] [--wv WV] [--vecsize VECSIZE]
                                        [--topn TOPN] [--inputtext INPUTTEXT]
+                                       [--type TYPE]
                                        model_filepath
 
     Perform prediction on short text with a given trained model.
 
     positional arguments:
       model_filepath        Path of the trained (compact) model.
 
-    optional arguments:
+    options:
       -h, --help            show this help message and exit
       --wv WV               Path of the pre-trained Word2Vec model. (None if not
                             needed)
@@ -28,6 +29,9 @@ ShortTextCategorizerConsole
       --inputtext INPUTTEXT
                             single input text for classification. Run console if
                             set to None. (Default: None)
+      --type TYPE           Type of word-embedding model (default: "word2vec";
+                            other options: "fasttext", "poincare",
+                            "word2vec_nonbinary", "poincare_binary")
 
 
 ShortTextWordEmbedSimilarity
diff --git a/docs/tutorial_wordembedAPI.rst b/docs/tutorial_wordembedAPI.rst
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "shorttext"
-version = "2.1.0a1"
+version = "2.1.0"
 authors = [
     {name = "Kwan Yuet Stephen Ho", email = "stephenhky@yahoo.com.hk"}
 ]
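
The `docs/scripts.rst` hunk above documents a new `--type` option for `ShortTextCategorizerConsole`. A minimal sketch of how such an option could be declared with `argparse` follows. It is not the package's actual script: `build_parser`, `EMBEDDING_TYPES`, and the use of `choices` for validation are assumptions made for illustration, and `--vecsize` and `--topn` are left out because their defaults are not visible in the diff. The heading change from `optional arguments:` to `options:` is consistent with help output generated under Python 3.10 or later, where `argparse` renamed that section.

```python
import argparse

# Hypothetical list, copied from the option names in the documented help text.
EMBEDDING_TYPES = [
    "word2vec",
    "fasttext",
    "poincare",
    "word2vec_nonbinary",
    "poincare_binary",
]


def build_parser():
    """Build a parser that mirrors the documented usage (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="ShortTextCategorizerConsole",
        description="Perform prediction on short text with a given trained model.",
    )
    parser.add_argument(
        "model_filepath",
        help="Path of the trained (compact) model.",
    )
    parser.add_argument(
        "--wv",
        default=None,
        help="Path of the pre-trained Word2Vec model. (None if not needed)",
    )
    parser.add_argument(
        "--inputtext",
        default=None,
        help="single input text for classification. Run console if set to None.",
    )
    # Assumption: restricting values with `choices`; the real script may
    # accept any string and validate it elsewhere.
    parser.add_argument(
        "--type",
        default="word2vec",
        choices=EMBEDDING_TYPES,
        metavar="TYPE",
        help="Type of word-embedding model (default: word2vec)",
    )
    return parser
```

Calling `build_parser().parse_args(["model.bin", "--type", "fasttext"])` yields a namespace whose `type` attribute is `"fasttext"`; omitting `--type` falls back to `"word2vec"`, matching the documented default.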
