Releases: predict-idlab/pyRDF2Vec
pyRDF2Vec 0.2.3
0.2.3 (2021-06-09)
🚀 Features
- Add the `skip_verify` attribute to the `KG` class to skip or not the verification of the entity existence with remote Knowledge Graphs (default to `skip_verify=False`).
- Add `WideSampler` as a new sampling strategy.
- Add `SplitWalker` as a new walking strategy.
Fixed
- Fix the installation dependencies with `poetry`.
- Fix the cache memory for local Knowledge Graphs.
- Fix validation URL for remote Knowledge Graphs.
- Fix the `HALKWalker` walking strategy.
- Fix the DFS algorithm of `RandomWalker` and `CommunityWalker` to return duplicate walks and prevent a different number of walks for the entities.
- Fix the walk extraction with the `with_reverse` parameter for the different walking strategies.
Added
- Add the `_post_extract` private method in the `Walker` class for a post processing of walks by a walking strategy.
Changed
- Replace the default minimum frequency thresholds of a hop to keep with `HALKWalker` (0.001 -> 0.01).
- Drop support for Python 3.7.0
- Remove `negative=20` and `vector_size=500` for Word2Vec.
pyRDF2Vec 0.2.2
0.2.2 (2021-04-02)
🚀 Features
- Add a first support of FastText as embedding technique.
Fixed
- Fix the `size` hyperparameter by `vector_size` of the default dictionary in the `Word2Vec` class.
- Fix random determinism with walking strategies.
- Fix the calculation of walks for duplicate entities in a file.
- Fix the total recovery of entities, walks, literals and embeddings of a model after multiple online learning.
Added
- Add the `_update` private method in the `RDF2VecTransformer` class.
- Add the `md5_bytes` attribute in the `CommunityWalker`, `HALKWalker`, `RandomWalker`, and `WLWalker` classes to choose whether to hash an object in MD5 and how many bytes to keep.
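
The `md5_bytes` attribute controls whether walk elements are hashed and how many bytes of the digest are kept. The following is a minimal standard-library sketch of that idea. The `hash_label` helper and its "`None` means no hashing" convention are assumptions made for illustration, not pyRDF2Vec's actual implementation.

```python
import hashlib
from typing import Optional


def hash_label(label: str, md5_bytes: Optional[int] = 8) -> str:
    """Return a truncated MD5 hex digest of `label` (hypothetical helper).

    If `md5_bytes` is None, the label is returned unchanged (no hashing).
    """
    if md5_bytes is None:
        return label
    digest = hashlib.md5(label.encode("utf-8")).digest()
    # Keep only the first `md5_bytes` bytes of the 16-byte MD5 digest.
    return digest[:md5_bytes].hex()


# Example walk made of placeholder URIs (not taken from a real dataset).
walk = ("http://example.org/a", "http://example.org/p", "http://example.org/b")
hashed_walk = tuple(hash_label(hop, md5_bytes=4) for hop in walk)
```

Keeping fewer bytes shortens the stored tokens, at the cost of a higher collision risk between distinct labels.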
Changed
- Replace the `extract` method in the `Walker` to return a list of entities with their walks instead of a list of walks.
pyRDF2Vec 0.2.1
Fix the issue with nest-asyncio as dependency.
pyRDF2Vec 0.2.0
0.2.0 (2021-03-20)
🚀 Features
- Add support for Python 3.9
- Add the `cache` (default to `cachetools.TTLCache(maxsize=1024, ttl=1200)`) attribute to the `KG` class to significantly speed up the walks extraction through caching.
- Add the `is_update` (default to `False`) hyper-parameter in the `fit` method of the `Embedder` and `Word2Vec` classes to update an existing vocabulary.
- Add the `literals` (default to `[]`) attribute in the `KG` class to support a basic literal extraction.
- Add the `mul_req` (default to `False`) attribute to the `KG` class to speed up the extraction of walks and literals for remote Knowledge Graph by sending asynchronous requests.
- Add the `n_jobs` (default to `None`) attribute to the `Walker` class to speed up the extraction of walks with multiprocessing.
- Add the `random_state` (default to `None`) parameter for the `Walker` class to handle better random determinism with walking and sampling strategies.
- Add the `verbose` (default to `0`) attribute to the `RDF2VecTransformer` class to display useful debugging information and to measure the time of extraction, fit and generation of embeddings and literals.
- Add the `with_reverse` (default to `False`) parameter for the `Walker` class to generate more walks and improve the accuracy with `Word2Vec`, by including the parents of the entities in the walks.
- Add the possibility to do online learning of a model with the `load` and the `save` methods in the `RDF2VecTransformer` class.
- Add the validators for class parameter attributes.
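
To illustrate what a `random_state` parameter buys for walk extraction, here is a small sketch of seeded random walks over a toy adjacency list. The `random_walks` function and the `GRAPH` dictionary are hypothetical and only mimic the idea; they are not the pyRDF2Vec `Walker` or `KG` classes.

```python
import random
from typing import Dict, List, Optional, Tuple

# Toy graph as an adjacency list (assumption: stands in for a Knowledge Graph).
GRAPH: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": ["a"],
}


def random_walks(
    graph: Dict[str, List[str]],
    root: str,
    max_depth: int,
    max_walks: int,
    random_state: Optional[int] = None,
) -> List[Tuple[str, ...]]:
    """Extract `max_walks` random walks of at most `max_depth` hops from `root`."""
    # A dedicated generator keeps results reproducible for a given seed.
    rng = random.Random(random_state)
    walks = []
    for _ in range(max_walks):
        walk = [root]
        for _ in range(max_depth):
            neighbors = graph.get(walk[-1], [])
            if not neighbors:  # dead end: stop this walk early
                break
            walk.append(rng.choice(neighbors))
        walks.append(tuple(walk))
    return walks


first = random_walks(GRAPH, "a", max_depth=3, max_walks=5, random_state=42)
second = random_walks(GRAPH, "a", max_depth=3, max_walks=5, random_state=42)
```

With the same seed, `first` and `second` contain the same walks; with `random_state=None` the walks may differ between runs.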
Added
- Add the `Connector` generic class to simplify the implementation of new connectors.
- Add the `SPARQLConnector` class to delegate the connection part to the SPARQL endpoint server.
- Add the `Vertex` class in a slot to reduce RAM usage.
- Add the `WalkerNotSupported` and `SamplerNotSupported` exceptions in the `Walker` and `Sampler` classes when a walking strategy and a sampling strategy is not supported.
- Add the `_cast_literals` private method to the `KG` class to convert the raw literals of an entity according to their real types.
- Add the `_embeddings`, `_entities`, `_literals`, and `_walks` attributes in the `RDF2VecTransformer` class to be able to get all the embeddings, entities, literals, and walks after the online training of a model.
- Add the `_fill_hops` private method in the `KG` class to fill the entity hops in cache when `mul_req=True` is provided for a remote Knowledge Graph.
- Add the `_get_hops` private method in the `KG` class to get the hops of a vertex for a local Knowledge Graph.
- Add the `_is_support_remote` (default to `False`) private attribute in the `Walker` and `Sampler` classes to restrict the use of walking and sampling strategies for some remote/local Knowledge Graph.
- Add the `_res2hops` private method in the `KG` class to convert a JSON response from a SPARQL endpoint server to hops.
- Add the `add_walk` method to the `KG` class to simplify the addition of walk in a Knowledge Graph.
- Add the attr decorator for all classes.
- Add the `examples/online-training` and `examples/literals` files to illustrate the use of online training and literals with `pyRDF2Vec`.
- Add the `fetch_hops` method to the `KG` class to fetch the hops of a vertex on a remote Knowledge Graph.
- Add the `get_pliterals` method to the `KG` class to get the literals for an entity and a local KG based on a chain of predicates.
- Add the `get_walks` method in the `RDF2VecTransformer` class to get the walks of the given entities in a Knowledge Graph.
- Add the `get_weights` method in the `Sampler` class to get the hops weights.
- Add the `pyrdf2vec.typings` file to contain the aliases of the most commonly used typing with mypy.
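
The entry about the `Vertex` class "in a slot" presumably refers to Python's `__slots__` mechanism, which is an assumption on our part. The sketch below shows why slots reduce memory: a slotted instance has no per-instance `__dict__`. Both classes are hypothetical stand-ins, not the library's `Vertex`.

```python
class PlainVertex:
    """Regular class: every instance carries its own `__dict__`."""

    def __init__(self, name: str, predicate: bool = False) -> None:
        self.name = name
        self.predicate = predicate


class SlottedVertex:
    """Slotted class: attributes live in fixed slots, with no `__dict__`."""

    __slots__ = ("name", "predicate")

    def __init__(self, name: str, predicate: bool = False) -> None:
        self.name = name
        self.predicate = predicate


plain = PlainVertex("http://example.org/a")
slotted = SlottedVertex("http://example.org/a")

plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")
```

The trade-off is that a slotted instance rejects attributes that are not declared in `__slots__`, which matters little for a small, fixed-shape object created in very large numbers.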
Fixed
- Fix the `get_weight` method in the `PageRankSampler` to raise an error if the method is called before the `fit` method.
- Fix the `remove_edge` method of the `KG` class to also remove the edge of a child for a parent node.
- Fix the addition of predicate in memory for remote Knowledge Graphs.
- Fix the initialization of the `_counts` dictionary with the `PredFreqSampler` and `ObjPredFreqSampler` classes.
Changed
- Remove support for Python 3.6
- Remove the `_get_shops` and `_get_rhops` functions in the `KG` class.
- Remove the `id` attribute of the `Vertex` class.
- Remove the `print_walks` method of the `Walker` class.
- Remove the `read_file` method in the `KG` class.
- Remove the `visualise` method in the `KG` class.
- Replace the `HalkWalker` class by `HALKWalker`.
- Replace the `SPARQLWrapper` library in favor of using `requests` for synchronous requests and `aiohttp` for asynchronous requests.
- Replace the `WeisfeilerLehmanWalker` class by `WLWalker`.
- Replace the `add_edge`, `add_vertex`, and `remove_edge` methods in the `KG` class to return a boolean value indicating that the addition/removal of an edge/vertex has been performed.
- Replace the `depth` parameter with `max_depth` for the `Walker` class.
- Replace the `extract_random_community_walks`, `extract_random_community_walks_bfs`, and `extract_random_community_walks_dfs` methods in the `CommunityWalker` class by `extract_walks`, `_bfs`, and `_dfs` methods.
- Replace the `extract_random_walks`, `extract_random_walks_bfs`, and `extract_random_walks_dfs` methods in the `RandomWalker` class by `extract_walks`, `_bfs`, and `_dfs` methods.
- Replace the `file_type` attribute in the `KG` class by `fmt`.
- Replace the `get_inv_neighbors` method in the `KG` class by an `is_reverse` (default to `False`) parameter in the `get_neighbors` method.
- Replace the `initialize` method in the `Sampler` class by the use of `@property`.
- Replace the `is_remote` parameter in the `KG` class for automatic link detection based on the http and https prefix.
- Replace the `last` parameter with `is_last_depth` in the `sample_neighbor` method of the `Sampler` class.
- Replace the `label_predicates` attribute in the `KG` class by `skip_predicates` and now use a set instead of a list.
- Replace the `pyrdf2vec.graphs.kg.Vertex` class with `pyrdf2vec.graphs.Vertex`.
- Replace the `fit_transform` and `transform` functions in the `RDF2VecTransformer` class to return a tuple containing the list of embeddings and literals.
- Replace the default embedding technique in the `RDF2VecTransformer` class for `Word2Vec`.
- Replace the default hyper-parameters of the `Word2Vec` class to `size=500`, `min_count=0`, and `negative=20`.
- Replace the default list of walkers in the `RDF2VecTransformer` class to `[RandomWalker(2)]`.
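
The switch from an explicit `is_remote` parameter to detection based on the http and https prefix can be pictured with the sketch below. The `is_remote_location` helper is a hypothetical name, and the exact check used by the library may differ.

```python
def is_remote_location(location: str) -> bool:
    """Guess whether `location` points to a remote endpoint (hypothetical helper).

    Assumption: a location is remote when it starts with an http or https scheme.
    """
    return location.startswith(("http://", "https://"))


# Placeholder locations used only to exercise the helper.
remote = is_remote_location("https://example.org/sparql")
local = is_remote_location("data/graph.owl")
```

Here `remote` is `True` and `local` is `False`, so a caller no longer needs to pass a separate flag alongside the location.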
pyRDF2Vec 0.1.1
Removes default prints from rdf2vec.
Fix the README in PyPI.
pyRDF2Vec 0.1.0
Features
- Add a `verbose` (default to `False`) hyper-parameter for the `fit` method.
- Add basic support for remote Knowledge Graphs through SPARQL endpoint.
- Add configuration for Embedding Techniques through the `Embedder` abstract class (currently only Word2Vec is included).
- Add online documentation.
- Add sampling strategies (default to `UniformSampler`) from Cochez et al. to better deal with larger Knowledge Graphs.
- Add static typing for methods.
- Add support for Python 3.6 and 3.7.
- Add the Google Style Python Docstrings.
- Add the `extract_random_walks_dfs` and `extract_random_walks_bfs` methods for the `RandomWalker` class.
- Add the `get_hops` method along with the private `_get_rhops` and `_get_shops` methods in the `KG` class.
- Add three examples (`examples/countries.py`, `examples/mutag.py` and `examples/samplers.py`) for `pyRDF2Vec`.
Changed
- Replace `graph` for `kg` in the `fit` and `fit_transform` methods of the `RDF2VecTransformer` class.
- Replace `instance` for `entities` in the `transform` and `fit_transform` methods of the `RDF2VecTransformer` class.
- Replace default values of hyper-parameters of Word2Vec to match with the default ones of the `gensim` implementation.
- Replace the `KnowledgeGraph` class for `KG`.
- Replace the `Walker` class to be abstract.
- Replace the `_rdf2vec.py` file for `rdf2vec.py`.
- Replace the `extract_random_community_walks` method in the `CommunityWalker` to be private.
- Replace the `extract` methods in `walkers` to be private.
- Replace the `graph.py` file for `graphs/kg.py`.
- Replace the `rdf2vec` module for `pyrdf2vec`.
- Replace the imec licence for an MIT licence.
- Remove `graph` hyper-parameter in the `transform` method of the `RDF2VecTransformer` class.
- Remove hyper-parameters of `RDF2VecTransformer` for `embedder` and `walkers` ones.
- Remove the `WildcardWalker` walking strategy.
- Remove the `converter.py` file.
- Remove the `create_kg`, `endpoint_to_kg`, `rdflib_to_kg` functions for the `location`, `file_type`, `is_remote` hyper-parameters in `KG` with the `read_file` private method.
- Replace `Vertex.vertex_count` for `itertools.count` in the `Vertex` class.
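
As a rough picture of the last entry, a hand-maintained class counter can be replaced by `itertools.count`, which hands out consecutive integers on demand. The `ToyVertex` class below is a hypothetical illustration and does not reproduce the library's `Vertex`.

```python
import itertools


class ToyVertex:
    """Toy vertex that takes its identifier from a shared counter."""

    # Shared iterator: each call to next() yields the following integer.
    _ids = itertools.count()

    def __init__(self, name: str) -> None:
        self.name = name
        self.id = next(ToyVertex._ids)


a = ToyVertex("http://example.org/a")
b = ToyVertex("http://example.org/b")
```

Each new instance receives the next identifier, so `b.id` is one more than `a.id`, without any manual increment of a class attribute.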