Skip to content

This implementation of Bloom Filter is written in java, and uses a murmur hash. The class is a java generic class that requires a ToBytes object for converting your keys into byte arrays. However if you already have byte array keys that you wish to use them with the bloom filter. There are methods that except and test byte arrays with offsets an…

Notifications You must be signed in to change notification settings

amccurry/bloom-filter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 

Repository files navigation

Bloom Filter

This implementation of Bloom Filter is written in java, and uses a murmur hash.  The class 
is a java generic class that requires a ToBytes object for converting your keys into byte 
arrays.  However if you already have byte array keys that you wish to use them with the bloom 
filter.  There are methods that except and test byte arrays with offsets and lengths so 
that you can reuse your byte array buffers.

Thread Safety

By default this bloom filter is thread safe through the use of AtomicLongArray.
However if your implementation does not require thread safety you may relax this feature in 
the constructor.  Note: In testing I haven't seen any performance difference with using the 
AtomicLongArray over the standard long array implementation.

Below is a simple example of how to use this class.


import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import com.nearinfinity.bloomfilter.BloomFilter;
import com.nearinfinity.bloomfilter.ToBytes;

public class Readme {

	public static void main(String[] args) throws IOException, ClassNotFoundException {
		
		BloomFilter<String> bloomFilter = new BloomFilter<String>(0.001, 100, new ToBytes<String>() {
			private static final long serialVersionUID = -2257818636984044019L;
			@Override
			public byte[] toBytes(String key) {
				return key.getBytes();
			}
		});
		
		List<String> knownValues = new ArrayList<String>();
		
		for (int i = 0; i < 100; i++) {
			String key = UUID.randomUUID().toString();
			knownValues.add(key);
			bloomFilter.add(key);
		}
		
		for (String key : knownValues) {
			if (!bloomFilter.test(key)) {
				throw new RuntimeException("False Negative!");
			}
		}
		
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		ObjectOutputStream outputStream = new ObjectOutputStream(baos);
		outputStream.writeObject(bloomFilter);
		outputStream.close();
		
		ObjectInputStream inputStream = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()));
		BloomFilter<String> newBloomFilter = (BloomFilter<String>) inputStream.readObject();
		inputStream.close();
		
		for (String key : knownValues) {
			if (!newBloomFilter.test(key)) {
				throw new RuntimeException("False Negative!");
			}
		}
		
		int falsePositive = 0;
		int sampleSize = 100000;
		for (int i = 0; i < sampleSize; i++) {
			if (newBloomFilter.test(UUID.randomUUID().toString())) {
				falsePositive++;
			}
		}
		
		System.out.println("[" + 
				falsePositive +"] false positives out of [" + 
				sampleSize + "] while using [" + 
				newBloomFilter.getMemorySize() + "] bytes of memory");
	}

}

About

This implementation of Bloom Filter is written in java, and uses a murmur hash. The class is a java generic class that requires a ToBytes object for converting your keys into byte arrays. However if you already have byte array keys that you wish to use them with the bloom filter. There are methods that except and test byte arrays with offsets an…

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published