Text processing package

Intro

The package accepts HTML text (Markdown text is also accepted) and analyses the following dimensions:

readTime - number of minutes required to read the text
keywords - phrases of 1, 2 or 3 words that are repeated in the text
vulgarityIndex - an index indicating how vulgar the text is, calculated by scanning for vulgar words in English
nudityIndex - an index indicating whether the images in the text contain adult content
images - images parsed from the text into a separate array, ordered by occurrence in the text
language - the recognised language of the text; recognition needs to work well for English, Japanese, Spanish and German
plain - plain version of the text without HTML tags and images, which could for example be sent out in an email
textImageRatio - ratio of text to images in the content
compressed - compressed version of the plain text
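
As an illustration of the readTime and keywords dimensions, here is a minimal sketch. The function names, the reading speed of 200 words per minute and the simple tokenisation are all assumptions made for this example; they are not taken from the package source. The input is expected to be plain text that has already been stripped of HTML.

```typescript
// Hypothetical helpers for illustration only; not the package's own code.
const WORDS_PER_MINUTE = 200; // assumed average reading speed

// Split plain text into lowercase words on whitespace and common punctuation.
function tokenize(plain: string): string[] {
  return plain
    .toLowerCase()
    .split(/[\s.,;:!?"()]+/)
    .filter((word) => word.length > 0);
}

// Estimated reading time in whole minutes, rounded up.
function estimateReadTime(plain: string): number {
  return Math.ceil(tokenize(plain).length / WORDS_PER_MINUTE);
}

// Phrases of `size` consecutive words that occur more than once in the text.
function extractRepeatedPhrases(plain: string, size: number): string[] {
  const words = tokenize(plain);
  const counts: Record<string, number> = {};
  for (let i = 0; i + size <= words.length; i++) {
    const phrase = words.slice(i, i + size).join(" ");
    counts[phrase] = (counts[phrase] || 0) + 1;
  }
  return Object.keys(counts).filter((phrase) => counts[phrase] > 1);
}

const sample = "Big cat. Big cat sleeps. More";
console.log(estimateReadTime(sample));
console.log(extractRepeatedPhrases(sample, 2));
```

In this sketch phrases may span sentence boundaries, and whitespace-based splitting will not segment Japanese text; a real implementation would likely need language-aware tokenisation.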

Inputs

Any HTML text

Outputs

{
  readTime: number,
  keywords: {
    1: string[],
    2: string[],
    3: string[]
  },
  compressed: string,
  nudityIndex: number,    // between 0 and 1
  vulgarityIndex: number, // between 0 and 1
  images: [{ url: string }],
  language: string,       // language code, e.g. "en" or "de"
  textImageRatio: number,
  plain: string
}
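
The output shape above could be written as a TypeScript type roughly as follows. This is a sketch: the name TextAnalysis is borrowed from the Interface section, while the helper functions and every value in the example object are made up for illustration and do not come from a real run of the package.

```typescript
// Sketch of the output shape; field names follow the Outputs section.
interface TextAnalysis {
  readTime: number;
  keywords: { 1: string[]; 2: string[]; 3: string[] };
  compressed: string;
  nudityIndex: number; // expected to lie between 0 and 1
  vulgarityIndex: number; // expected to lie between 0 and 1
  images: { url: string }[];
  language: string; // language code, e.g. "en" or "de"
  textImageRatio: number;
  plain: string;
}

// Hypothetical helper: checks that an index lies in the documented 0..1 range.
function isIndex(value: number): boolean {
  return value >= 0 && value <= 1;
}

// Hypothetical helper: checks both index fields of an analysis result.
function hasValidIndices(analysis: TextAnalysis): boolean {
  return isIndex(analysis.nudityIndex) && isIndex(analysis.vulgarityIndex);
}

// Illustrative values only.
const example: TextAnalysis = {
  readTime: 1,
  keywords: { 1: ["big", "cat"], 2: ["big cat"], 3: [] },
  compressed: "",
  nudityIndex: 0,
  vulgarityIndex: 0,
  images: [{ url: "a.png" }],
  language: "en",
  textImageRatio: 0,
  plain: "Big cat. Big cat sleeps.",
};

console.log(hasValidIndices(example));
```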

Interface

interface TextAnalyzer {
  getReadTime: () => Text
  getPlainText: () => Text
  extractImages: () => Images
  analyzeLang: () => Lang
  extractKeywords: (noOfWordsInKeyword: number) => Keywords
  analyze: () => TextAnalysis // get complete analysis
}
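
To show how analyze could combine the individual methods, here is a reduced sketch of such an analyzer. It is not the package's implementation. The README does not define the Text, Images, Lang or Keywords types, so the sketch substitutes concrete types and covers only three methods. The names SketchAnalyzer, SketchAnalysis and createSketchAnalyzer are hypothetical, the regex-based HTML handling is a simplification, and the reading speed of 200 words per minute is an assumption.

```typescript
// Reduced, hypothetical version of the interface with concrete types.
interface SketchAnalysis {
  readTime: number;
  plain: string;
  images: { url: string }[];
}

interface SketchAnalyzer {
  getReadTime: () => number;
  getPlainText: () => string;
  extractImages: () => { url: string }[];
  analyze: () => SketchAnalysis;
}

function createSketchAnalyzer(html: string): SketchAnalyzer {
  // Naive tag stripping; a real implementation would use an HTML parser.
  const getPlainText = (): string =>
    html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();

  // Whole minutes, rounded up, at an assumed 200 words per minute.
  const getReadTime = (): number => {
    const plain = getPlainText();
    const wordCount = plain === "" ? 0 : plain.split(" ").length;
    return Math.ceil(wordCount / 200);
  };

  // Collect <img> src values in the order they occur in the text.
  const extractImages = (): { url: string }[] => {
    const images: { url: string }[] = [];
    const pattern = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']*)["']/gi;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(html)) !== null) {
      images.push({ url: match[1] });
    }
    return images;
  };

  return {
    getReadTime,
    getPlainText,
    extractImages,
    // The complete analysis simply gathers the individual results.
    analyze: () => ({
      readTime: getReadTime(),
      plain: getPlainText(),
      images: extractImages(),
    }),
  };
}

const analyzer = createSketchAnalyzer(
  '<h1>Hello</h1><img src="x.png"><p>world again</p>'
);
console.log(analyzer.analyze());
```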

Install

npm i ath-text-processing-package

Build process

This script will build the component:

npm run build

Running

This script will build and run the application:

npm run start

Developers

Licence

MIT

alphateamhackers/text-analysis
