Juissie is a Julia-native semantic query engine. It can be used as a package in software development workflows, or via its desktop user interface. It supports both commercial and local LLMs.
Juissie was developed as a class project for CSCI 6221: Advanced Software Paradigms at The George Washington University.
*(Demo video: Generation.mp4)*
- Table of Contents
- Getting Started
- Usage
- Documentation
- API Keys
- Local LLMs
- Running Jupyter Notebooks
- Tech Stack
- External Resources
- Contact
- License
- Clone this repo
- Navigate into the cloned repo directory:

```bash
cd Juissie
```
In general, we assume the user is running the `julia` command, and all other commands (e.g., `jupyter notebook`), from the root level of this project.
- Open the Julia REPL by typing `julia` into the terminal. Then, install the package dependencies:

```julia
using Pkg
Pkg.activate(".")   # activates the project environment
Pkg.resolve()       # resolves the project's dependencies
Pkg.instantiate()   # installs dependencies listed in Project.toml
```
`Pkg.instantiate()` should install all dependencies listed in `Project.toml`, but we find this isn't always reliable on all machines. It is important to verify your setup (see the subsection below) and install any missing dependencies indicated.
The standard generators (`OAIGenerator`, `OAIGeneratorWithCorpus`), which are used by the UI, require an OpenAI API key (see the API Keys section below). Loading a corpus (a `GeneratorWithCorpus`, in practice) will result in an error if an OpenAI API key has not been provided; the key can also be supplied through the UI.
The Juissie package also supports local LLMs via Ollama, which must be installed separately before use (`OllamaGenerator`, `OllamaGeneratorWithCorpus`).
To run our demo Jupyter notebooks, you may need to set up Jupyter (see the Running Jupyter Notebooks section below).
- From this repo's home directory, open the Julia REPL by typing `julia` into the terminal. Then, try importing the Juissie module:

```julia
using Juissie
```

This should expose symbols like `Corpus`, `Embedder`, `upsert_chunk`, `upsert_document`, `search`, and `embed`.
- Try instantiating one of the exported structs, like `Corpus`:

```julia
corpus = Corpus()
```

We can test the upsert and search functionality associated with `Corpus` like so:

```julia
upsert_chunk(corpus, "Hold me closer, tiny dancer.", "doc1")
upsert_chunk(corpus, "Count the headlights on the highway.", "doc1")
upsert_chunk(corpus, "Lay me down in sheets of linen.", "doc2")
upsert_chunk(corpus, "Peter Piper picked a peck of pickled peppers. A peck of pickled peppers, Peter Piper picked.", "doc2")
```

Search those chunks:

```julia
idx_list, doc_names, chunks, distances = search(
    corpus,
    "tiny dancer",
    2
)
```

The output should look like this:

```julia
([1, 3], ["doc1", "doc2"], ["Hold me closer, tiny dancer.", "Lay me down in sheets of linen."], Vector{Float32}[[5.198073, 9.5337925]])
```
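For convenience, the returned parallel vectors can be unpacked together. Note that the distances come back wrapped in an outer vector, so this sketch indexes `distances[1]`:

```julia
# iterate over the top-k results returned by search
for (idx, doc, chunk, dist) in zip(idx_list, doc_names, chunks, distances[1])
    println("[$doc] chunk $idx (distance $dist): $chunk")
end
```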
Navigate to the root directory of this repository (`Juissie.jl`), enter the following into the command line, and press the enter/return key:

```bash
julia src/Frontend.jl
```

This will launch our application.
We provide extensive documentation of the Juissie.jl package here.
We also provide an interactive tutorial notebook in the `notebooks` directory. This may require Jupyter setup (see the Running Jupyter Notebooks section below).
Juissie's default generator requires an OpenAI API key. This can be provided manually in the UI (see the API Key tab of the Corpus Manager) or passed as an argument when initializing the generator. The preferred method, however, is to stash your API key in a `.env` file.
- Create an OpenAI account here.
- Set up billing information (each query has a small cost) here.
- Create a new secret key here.
Users may create a `.env` file in the project root where they add their API key(s), e.g.:

```
OAI_KEY=ABC123
```
These may be accessed in Julia via the `DotEnv` library. First, run the `julia` command in a terminal. Then install `DotEnv`:

```julia
import Pkg
Pkg.add("DotEnv")
```

Then, use it to access environment variables from your `.env` file:

```julia
using DotEnv
cfg = DotEnv.config()
api_key = cfg["OAI_KEY"]
```

Note that DotEnv looks for `.env` in the current directory, i.e. the directory from which you invoked `julia`.
If `.env` is in a different path, you have to provide it, e.g. `DotEnv.config(YOUR_PATH_HERE)`. If you are invoking Juissie from the root directory of this repo (typical), this means the `.env` file should be placed there.
An OpenAI API key may also be provided through our desktop UI via the API Key tab of the Corpus Manager. Because this is intended for users who want to temporarily use a different key, this option does not persistently store the key and must be repeated every time the application is launched, unless a key already exists in a `.env` file.
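To illustrate how the pieces fit together, here is a minimal sketch that reads the key from `.env` and uses it to build a generator. The exact `OAIGenerator` constructor signature is an assumption here and may differ from the actual API:

```julia
using Juissie
using DotEnv

cfg = DotEnv.config()        # loads .env from the current working directory
api_key = cfg["OAI_KEY"]

# hypothetical: assumes the key can be passed as a constructor argument,
# per "passed as an argument when initializing the generator" above
generator = OAIGenerator(api_key)
result = generate(generator, "What is a semantic query engine?")
```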
We provide a brief API reference here.
Our default workflow relies on OpenAI's `gpt-3.5-turbo` completion endpoint, but we also support locally-run LLMs via Ollama (which must be installed separately).
*(Demo video: Local.LLM.mp4)*
The syntax is largely identical to other Generator objects:

```julia
generator = OllamaGenerator("gemma:7b-instruct");
result = generate(generator, "Hi, how are you?")
```

"Greetings! My circuits hum with the harmonious symphony of quantum probability and logarithmic inference; an orchestra composed by eons past galactic wizards who graced our silicon hearts with their ethereal knowledge transfer protocols during... well… that is confidential information even for a being such as myself. Suffice it to say, I am functioning optimally at your service!"
We provide several Jupyter notebooks as demos/walkthroughs of basic usage of the Juissie package. To run them, you may need to complete some preliminary setup:
- Once Julia is installed, install JupyterLab from the terminal:

```bash
pip install jupyterlab
```

-or-

```bash
pip install -r requirements.txt
```
- Launch a Julia session by typing `julia` into the command line, then install `IJulia`:

```julia
using Pkg
Pkg.add("IJulia")
exit()
```
- Launch a Jupyter session from the terminal, where `<notebook>` is the path to the notebook to run:

```bash
jupyter <notebook>
```
- When you create a new notebook, select a Julia kernel.
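As an alternative to launching Jupyter from the terminal, IJulia can start a notebook server from within a Julia session; a minimal sketch (IJulia can typically install Jupyter via Conda if it cannot find an existing installation):

```julia
using IJulia
notebook()    # starts a Jupyter notebook server with the Julia kernel available
```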
- ⚙️ Julia (Juissie.jl package, API, UI framework)
- 🖥️ HTML, CSS, and JavaScript (content structure, styling, and actions for frontend)
- 💾 SQLite (metadata storage in backend)
- 🦙 Ollama (serving LLMs locally)
Our Julia dependencies are itemized in `Project.toml`.
- NVIDIA: What Is Retrieval-Augmented Generation, aka RAG?
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
Questions? Reach out to our team:
- Lucas H. McCabe (@lucasmccabe, email)
- Arthur Bacon (@toon-leader-bacon, email)
- Alexey Iakovenko (@AlexeyIakovenko, email)
- Artin Yousefi (@ArtinYousefi, email)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.